#### Interscience Research Network

### Interscience Research Network Conference Proceedings - Full Volumes

**IRNet Conference Proceedings**

5-20-2012

# International Conference on Electronics & Communication Engineering

Prof. Srikanta Patnaik, Mentor, IRNet India, patnaik\_srikanta@yahoo.co.in

#### **Recommended Citation**

Patnaik, Prof. Srikanta, Mentor, "International Conference on Electronics & Communication Engineering" (2012). *Conference Proceedings - Full Volumes*. 77. https://www.interscience.in/conf\_proc\_volumes/77

# Proceedings of International Conference on ELECTRONICS & COMMUNICATION ENGINEERING (ICECE-2012)

20<sup>th</sup> May, 2012, BANGALORE, India

Interscience Research Network (IRNet), Bhubaneswar, India

## **Editorial**

Fast communication is the need of the hour, and society relies on Electronics & Telecommunication Engineering for breakthroughs in applications such as satellites, next-generation mobile phones, air-traffic control, and the Internet. In fact, all electronic devices need a software interface to run and come with one or another device-controlling program architected and developed by electronics and communication engineering.
Thus, tremendous opportunities for research and development lie in the area of Electronics and Communication Engineering, as consumers need new devices every day to support them in daily life. The International Conference on Electronics and Communication Engineering (ICECE-2012) provides a unique platform for such R&D work. The conference will bring together academicians and researchers from all types of institutions and organizations, who will share their domain knowledge and interact on areas such as electronics and communications engineering, electric energy, automation, control and instrumentation, computer and information technology, and the electrical engineering aspects of building services and aerospace engineering. The wide scope encompasses analogue and digital circuit design, microwave circuits and systems, optoelectronic circuits, photovoltaics, semiconductor devices, sensor technology, transport in electronic materials, VLSI technology and device processing.

We are happy to inform you that we received an overwhelming response in the area. I must acknowledge your response to this conference. I ought to convey that this conference is only a small step towards knowledge and innovation, but certainly one in the right direction. I wish all success to the paper presenters. I extend heartfelt thanks to the members of faculty from different institutions, research scholars, delegates, IRNet family members, and members of the technical and organizing committees. Above all, I offer my salutation to the Almighty.

Editor-in-Chief, Dr. K. Karibasappa, Prof & Head, Dept. of Electronics & Communication Engineering, Shavige Malleswara Hills, Dayananda Sagar College of Engineering, Kumaraswamy Layout, BANGALORE - 560078

# A GALS Chip Multiprocessor Architecture for Multi-Link Low-Area Interconnection

#### Haripriya.
R, C.B.Vinutha & M.Z.Kurian

SSIT, Tumkur, Karnataka, India; Email: priyakushi18@gmail.com, cbvinutha@gmail.com, mzkurianvc@yahoo.com

Abstract - Integrating multiple processors into a single chip (known as chip multiprocessors or CMPs) has recently become readily achievable and common due to continuing advances in VLSI fabrication technologies. A new inter-processor communication architecture for chip multiprocessors is proposed which has a low area cost and flexible routing capability, and which supports globally asynchronous locally synchronous (GALS) clocking styles. A presented implementation example of the proposed architecture shows that it can reduce the communication circuitry area by approximately two times with similar routing capability. To achieve a low area cost, the proposed statically configurable asymmetric architecture assigns large buffer resources only to the nearest neighbor interconnect and much smaller buffer resources to the long distance interconnect. To maintain flexible routing capability, each neighboring processor pair has multiple connecting links. The architecture supports long distance communication in GALS systems by transferring the source clock with the data signals along the entire path for write synchronization. Compared to a traditional dynamically configurable interconnect architecture with symmetric buffer allocation and single links between neighboring processor pairs, this implementation has approximately two times smaller communication circuitry area with a similar routing capability.

**Key words:** Chip multiprocessor, globally asynchronous locally synchronous (GALS), inter-processor interconnect, many-core, multi-core, network-on-chip (NoC).

#### I. INTRODUCTION

A number of processors integrated on a single chip and performing various operations simultaneously is called a system on chip (SoC).
Wires in deep-submicrometer CMOS fabrication technologies introduce greater relative delay, relative power consumption, and timing and power variations, which pose considerable challenges for traditional on-chip communication methods such as global bus structures. Researchers have proposed network-on-chip (NoC) solutions which use routers for inter-processor communication. Most research is based on dynamic packet-switched routing architectures. Another approach is the statically configurable nearest-neighbor interconnect architecture, where each processor communicates with only its four nearest neighbors in a 2-D mesh and long distance communication is accomplished by software in intermediate processors. Other designs use both dynamic and static interconnects.

Fig. 1.a. Illustration of interprocessor communication in a 2-D mesh

Although both dynamic routing architectures and static nearest neighbor interconnect architectures achieve significant success in specific areas, they have some limitations. Dynamic routing architectures are flexible, but normally require relatively large circuit area and power for communication circuitry. The static nearest neighbor interconnect architecture reduces area and power requirements significantly, but it results in relatively high latency for long distance communication.

Fig. 1.b. A generalized communication routing architecture in which only signals related to the west edge are drawn.

Communications within chip multiprocessors for many applications, especially many digital signal processing (DSP) algorithms, are often largely localized: most communication is among nearest (or local) neighbors, while a small portion is long distance.
Motivated by this fact, we propose an *asymmetric* structure to obtain good tradeoffs between flexibility and cost by treating the nearest neighbor communication and long distance communication differently, using more buffer resources for nearest neighbor connections, and using fewer buffer resources for long distance connections. Together with the relatively simple static routing approach, this asymmetric architecture can achieve low area cost for communication circuitry.

Fig. 1.c. Illustration of interprocessor communication in a static neighbor interconnect architecture

Fig. 1.d. Circuitry diagrams of the static nearest neighbor interconnect architecture. Data from four inputs are transferred only to the processing core to reduce the circuitry cost, and only a single buffer is needed.

Under the static asymmetric architecture, there are a couple of design options available, such as the number of input ports (buffers) for the processing core and the number of links between each neighboring processor pair. The area, speed, and performance of different design options are analyzed, and some conclusions based on the results are drawn. We found that increasing the number of links between processors is helpful to increase routing capability, but it dramatically increases processor area after a certain point which depends on implementation details. Two or three links are generally appropriate when each processor in the chip utilizes a simple single-issue processor architecture.

Moreover, the proposed architecture supports the globally asynchronous locally synchronous (GALS) clocking style, which allows each processor to operate in its own clock domain and avoids the design of a global clock tree, which can significantly simplify the clock system design and potentially reduce system power consumption.
After examining the characteristics of different approaches, we propose a source synchronous method which transfers the clock with the data and control signals along the entire path to the destination processor. Compared to traditional dynamically configurable interconnect architectures with symmetric buffer allocation and single links between each neighboring processor pair, a presented implementation example of the proposed architecture shows that it can reduce the communication circuitry area by approximately two times with similar routing capability.

#### A. Static Routing Versus Dynamic Routing

The inter-processor interconnect can be configured statically before runtime (static routing), or dynamically at runtime (dynamic routing). Dynamically routed networks have been commonly used in multiprocessor systems such as those utilizing message passing methods. Moreover, dynamic networks have been rigorously studied in NoC research, but statically configured architectures have been much less intensively studied. The key advantage of the static configuration approach is that for applications with predictable traffic, such as most DSP applications, it can provide an efficient solution with small area cost and communication latency. The dynamic configuration solution can effectively address more applications because of its flexibility, but it has non-negligible overhead in terms of circuitry area and communication latency; the main overhead comes from the routing path definition, the arbitration among multiple independent clock sources, and the signal recognition at the destination processor.

#### B. Dynamic Routing and Its Overhead

In dynamic routing, the data transfer path must be defined by the source processor and propagated to the corresponding downstream processor(s), or decided dynamically by intermediate processors.
The circuitry to define and control the routing path has an area overhead, and propagating the routing path might cost extra instructions and increase the number of clock cycles for the data transfer. Since each link in the dynamic routing architecture is shared by multiple sources, an arbiter is required to allow only one source to access the link at a time. Furthermore, in GALS chip multiprocessors, this arbiter becomes more complex since it must handle multiple sources with unrelated clock domains. An obvious overhead is that some synchronization circuitry is required for the arbiter to receive the link-occupying requests from different sources, and some logic is required to avoid glitches when the occupying path changes.

Another important issue is how the destination processor can identify the source processors of the received data. Since data can travel through multiple processors with unknown clock domains, it is not possible to assume a particular order for the incoming data. One common method is that an address is assigned to each processor and sent along with the data, and the destination processor uses the address to identify the source processor through software or hardware. Combining these overheads, the communication latency for dynamic routing between adjacent processors has been estimated to be typically larger than 20 clock cycles, and this value will increase further for GALS dynamic routing networks due to the additional synchronization latency.

#### C. Static Routing

Due to its smaller circuit area and excellent compatibility with GALS-clocked systems, we investigate only the static routing approach in this paper. Few multi-processor systems use static routing, and the systolic approach is one of the pioneers. Systolic systems contain synchronously operating processors which "pump" data regularly through a processor array, and the data to be processed must reach the processing unit at the exact predefined time.
Due to this strict requirement for data streams, the systolic architecture is well suited only for applications with highly regular communication patterns such as matrix multiplication. Relaxing the strict timing requirement of the data stream can significantly broaden the application domain. To relax the systolic system's strict cycle-by-cycle timing requirements, each processor must "wait" for data when the data is late, and the data must "wait" to be processed when it comes early. Inserting a first-in first-out (FIFO) buffer with appropriate full and empty logic at each input of the processing core can meet these requirements. Data is buffered in the FIFO when it comes early, the downstream processor is stalled when the FIFO is empty and there is a read request, and the upstream processor is stalled when the FIFO is full and there is a write request. In this way, the only requirement for the data stream is its order, not its exact arrival time.

RAW is a chip multiprocessor with very low latency for interprocessor communication (three clock cycles) using both static routing and dynamic routing, but it achieves this goal with a large area cost of about 4 mm$^2$ in a $0.18\mu m$ CMOS technology. The communication circuitry we propose is suitable for broad applications, with low latency (about five clock cycles) and low area overhead (about 0.1 mm$^2$ in $0.18\mu m$ technology).

#### II. ARCHITECTURE DESIGN

Consider an example of a nine-processor JPEG encoder. The table in this section shows the data traffic of each processor for the nine-processor JPEG encoder shown in Fig. 2.a, which demonstrates the different asymmetric data traffic on the input-buffered and output-buffered routers.
Considering the router's input ports, although each processor shows a clearly asymmetric communication traffic load, the major input direction differs from processor to processor, which brings the overall traffic at the input ports within a factor of five in this example: the relative input traffic for the east, north, west, and south directions is 9%, 26%, 22%, and 43%, respectively. Therefore, optimizing buffers in this approach would require customizing individual buffer sizes on each processor, which would then unfortunately optimize the design for only one (or a small number) of applications. On the other hand, considering the *output* ports, each processor shows a similar asymmetric data traffic: most of the data from the input ports are delivered to the core (for local processing) and very little is delivered to the edges (for long distance communication), and overall about 80% of the data are delivered to the core. Thus a single asymmetric output-buffered router can be widely suitable for different applications, which is important since multi-core chips utilizing NoC architectures are typically used widely across a number of application domains.

Fig. 2.a. Nine-processor implementation of a JPEG encoder core

| Network data words at the input ports of the router | East | North | West | South |
|---|---|---|---|---|
| Relative | 9% | 26% | 22% | 43% |

| Network data words at the output ports of the router | Core | East | North | West | South |
|---|---|---|---|---|---|
| Relative | | | | | |

#### A. Low-area interconnect architecture

Fig. 2.b. Block diagram of the proposed Router

Fig. 2.c. Circuit diagram of the proposed interprocessor communication architecture

The proposed statically configurable, low-area, asymmetrically buffered interconnect architecture has asymmetric buffer resources for the long distance interconnect and the local core interconnect.
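The buffering scheme described above can be illustrated with a minimal Python sketch. It models each router output port as a bounded FIFO with full and empty flags, so that a write to a full FIFO tells the upstream side to stall and a read from an empty FIFO tells the downstream side to stall. The class and function names and the buffer depths (`CORE_DEPTH`, `EDGE_DEPTH`) are hypothetical choices made for illustration; they are not values taken from the paper.

```python
from collections import deque


class BoundedFifo:
    """FIFO with full/empty flags that decouples producer and consumer timing."""

    def __init__(self, depth):
        self.depth = depth
        self.items = deque()

    def full(self):
        return len(self.items) >= self.depth

    def empty(self):
        return len(self.items) == 0

    def try_write(self, word):
        """Return False when the FIFO is full, meaning the upstream side must stall."""
        if self.full():
            return False
        self.items.append(word)
        return True

    def try_read(self):
        """Return None when the FIFO is empty, meaning the downstream side must stall."""
        if self.empty():
            return None
        return self.items.popleft()


# Hypothetical depths (assumptions): one large buffer for the core port,
# a single register for each long distance (edge) port.
CORE_DEPTH = 32
EDGE_DEPTH = 1


def make_router_outputs():
    """Build the output ports of one asymmetrically buffered router."""
    ports = {"core": BoundedFifo(CORE_DEPTH)}
    for edge in ("east", "north", "west", "south"):
        ports[edge] = BoundedFifo(EDGE_DEPTH)
    return ports


def total_buffer_words(ports):
    """Total storage, in words, over all output ports."""
    return sum(port.depth for port in ports.values())
```

With these assumed depths, an edge port accepts one word and then signals a stall until that word is read, while the core port can absorb a burst. Only the order of the words is preserved, not their arrival time, which is the property the FIFO discussion relies on.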
Asymmetric Data Traffic Typically Exists at the Router's Output Ports.

Varying the buffer allocation of input-buffered routers to match asymmetric inter-processor data traffic loads has been shown to achieve some benefits. In contrast, this paper presents asymmetric buffer allocation for output-buffered routers, because we find that the asymmetric data traffic on the router's outputs is more uniform across different applications, and hence the architecture is helpful across a wider range of applications. The asymmetric traffic focuses on the differences between traffic going to the processor core and traffic going to output ports connected to other processors.

#### B. Working of proposed Multiprocessor architecture

Instead of equally distributing buffer resources to each output port, we allocate a relatively large buffer to the processing core port, and smaller buffers (one or several registers) to the other ports. Fig. 2.c shows the circuit diagram, where only signals related to the west edge (west in and west out) are drawn. This architecture's circuit area is similar to that of the nearest neighbor interconnect architecture shown in Fig. 1.d, since it adds only a few registers and multiplexers. From the point of view of routing capability, this architecture is similar to the traditional dynamic routing architecture shown in Fig. 1.b, since reducing the buffers in ports for long distance communication does not significantly affect system performance when the communication is localized. With its one large buffer for the processing core, the proposed architecture can save about five times the area compared to the traditional dynamic architecture shown in Fig. 1.b.

#### III. RESULTS COMPARISON

Different communication architectures are compared, including the static nearest neighbor interconnect, the proposed double-link routing architecture, and the dynamic single-link routing architecture.

Fig. 3.a. Performance Comparison

#### A. Performance Comparison

In this section, the performance of the different implementations is analyzed.
##### 1) Performance of the Basic Communication Patterns

Fig. 3.a shows the latency of the basic communication patterns mapped onto different architectures with different array sizes. The proposed double-link routing architecture normally has significant savings in communication latency compared to the nearest neighbor architecture. The latency of the dynamic single-link routing architecture is similar to that of the static double-link architecture. The modeled communication is organized uniformly over the four basic communication patterns, and we assume that 80% of the communication is within the local area, which is the value often used in the literature. The proposed static double-link routing architecture is more than 2 times faster than the nearest neighbor architecture, and the dynamic single-link routing architecture is a little slower than the static double-link architecture.

Fig. 3.b. Performance Comparison

#### IV. CONCLUSION

An asymmetric inter-processor communication architecture is proposed which assigns more buffer resources to the nearest neighbor interconnect and fewer buffer resources to the long distance interconnect. Static routing is emphasized due to its low cost and low communication latency. Compared to a traditional dynamically configurable interconnect architecture with symmetric buffer allocation and single links between neighboring processor pairs, this implementation has approximately two times smaller communication circuitry area with a similar routing capability. The proposed architecture also provides the ability to support long distance GALS communication with an extended source synchronous transfer method. A single node is considered and its working is simulated, and a network with multiple nodes is also simulated for static and dynamic routing techniques. Next, the proposed architecture code should be simulated and synthesized.
#### **HDL Synthesis Report**

Macro Statistics of existing system

- Registers: 19
- 1-bit register: 8
- 8-bit register: 11
- Latches: 1
- 8-bit latch: 1

Macro Statistics of proposed system

- 8-bit registers: 8
- Registers: 8
- 1-bit register: 1

#### REFERENCE

- [1] M. B. Taylor, J. Kim, J. Miller, D. Wentzlaff, F. Ghodrat, B. Greenwald, H. Hoffman, P. Johnson, W. Lee, A. Saraf, N. Shnidman, V. Strumpen, S. Amarasinghe, and A. Agarwal, "A 16-issue multiple-program-counter microprocessor with point-to-point scalar operand network," in *Proc. ISSCC*, Feb. 2003, pp. 170–171.
- [2] S. W. Keckler, D. Burger, C. R. Moore, R. Nagarajan, K. Sankaralingam, V. Agarwal, M. S. Hrishikesh, N. Ranganathan, and P. Shivakumar, "A wire-delay scalable microprocessor architecture for high performance systems," in *Proc. ISSCC*, 2003, vol. 46, pp. 168–169.
- [3] W. Dally and B. Towles, "Route packets, not wires: On-chip interconnection networks," in *Proc. IEEE Int. Conf. Des. Autom.*, Jun. 2001, pp. 684–689.
- [4] S. Vangal, J. Howard, G. Ruhl, S. Dighe, H. Wilson, J. Tschanz, D. Finan, P. Iyer, A. Singh, T. Jacob, S. Jain, S. Venkataraman, Y. Hoskote, and N. Borkar, "An 80-tile 1.28 TFLOPS network-on-chip in 65 nm CMOS," in *Proc. ISSCC*, Feb. 2007, pp. 98–99.
- [5] B. Baas, Z. Yu, M. Meeuwsen, O. Sattari, R. Apperson, E. Work, J. Webb, M. Lai, T. Mohsenin, D. Truong, and J. Cheung, "AsAP: A fine-grain multi-core platform for DSP applications," *IEEE Micro*, vol. 27, no. 2, pp. 34–45, Mar./Apr. 2007.
- [6] Z. Yu, M. Meeuwsen, R. Apperson, O. Sattari, M. Lai, J. Webb, E. Work, D. Truong, T. Mohsenin, and B. Baas, "AsAP: An asynchronous array of simple processors," *IEEE J. Solid-State Circuits*, vol. 43, no. 3, pp. 695–705, Mar. 2008.
- [7] D. Wentzlaff, P. Griffin, H. Hoffmann, L. Bao, B. Edwards, C. Ramey, M. Mattina, C. Miao, J. F. Brown III, and A. Agarwal, "On-chip interconnection architecture of the tile processor," *IEEE Micro*, vol. 27, no. 5, pp. 15–31, Sep./Oct. 2007.
- [8] H.
Zhang, M. Wan, V. George, and J. Rabaey, "Interconnect architecture exploration for low-energy reconfigurable single-chip DSPs," in *Proc. IEEE Comput. Soc. Workshop VLSI*, Apr. 1999, pp. 2–8.
- [9] P. P. Pande, C. Grecu, M. Jones, A. Ivanov, and R. Saleh, "Effect of traffic localization on energy dissipation in NoC-based interconnect," in *Proc. IEEE Int. Symp. Circuits Syst.*, May 2005, pp. 1774–1777.

# Performance Analysis of WSXC and WIXC SSM OXC in WDM Optical Networks

#### Md. Ishtiaque Aziz Zahed & Md. Shah Afran

Department of Electrical and Electronic Engineering, Bangladesh University of Engineering and Technology (BUET), Dhaka-1000, Bangladesh. E-mail: ishtiaque2307@gmail.com, afran2902@gmail.com

Abstract - The impact of inband crosstalk on an optical signal passing through optical cross-connect nodes (OXCs) in a wavelength division multiplexing (WDM) optical network is studied from the equation of the electric field with crosstalk and the corresponding current. The analysis has been done for two SSM (space switching matrix) OXC architectures, namely WSXC and WIXC, where the latter has full wavelength conversion capability. Although WIXC attenuates more crosstalk, it is found that, depending on the values of the optical propagation delay differences, the coherence time of the lasers and the time duration of one bit of the signal, the required power penalty in WIXC may be greater than that of WSXC in some cases. The analysis has been performed using the measures of bit error rate (BER) and power penalty.

**Keywords** - Inband crosstalk, optical cross-connect, wavelength division multiplexing (WDM), wavelength selective cross-connect (WSXC), wavelength interchanging cross-connect (WIXC).

#### I. INTRODUCTION

In a wavelength division multiplexing (WDM) optical network, the optical cross-connect (OXC) at each node carries out wavelength-sensitive switching in optical form without resorting to electro-optical conversion.
A number of OXC architectures have been proposed in [1] and [2], each of which has its own unique features, strengths and limitations. While cross-connecting wavelengths from input to output fibers, an OXC introduces inband and interband crosstalk. The inband crosstalk, which is also known as homodyne crosstalk, has the same wavelength as the signal and seriously degrades the transmission performance. When an optical signal passes through an OXC, many crosstalk contributions are combined with the signal [3]-[5]. In this paper, the performance of the wavelength selective cross-connect (WSXC) and the wavelength interchanging cross-connect (WIXC) is investigated and compared in the presence of inband (homodyne) crosstalk, which is caused by non-ideal performance of an optical node.

#### II. INBAND CROSSTALK IN WSXC AND WIXC

Inband crosstalk is a major problem in optical networks.

Figure 1. A cascaded wavelength demultiplexer and multiplexer as a source of in-band crosstalk

Figure 2. An optical switch as a source of in-band crosstalk

One source of this crosstalk arises from cascading a wavelength demultiplexer with a wavelength multiplexer, as shown in Figure 1. The demux ideally separates the incoming wavelengths to different output fibers. In reality, however, a portion of the signal at one wavelength, say $\lambda_i$, leaks into the adjacent channel $\lambda_{i+1}$ because of non-ideal suppression within the demux. When the wavelengths are combined again into a single fiber by the mux, a small portion of the $\lambda_i$ signal that leaked into the $\lambda_{i+1}$ channel will also leak back into the common fiber at the output. Although both signals contain the same data, they are not in phase with each other, due to the different delays they encounter. This causes inband crosstalk [6].

Another source of this type of crosstalk arises from optical switches, as shown in Figure 2, due to the non-ideal isolation of one switch port from the other. In this case, the signals contain different data.
The crosstalk penalty is highest when the crosstalk signal is exactly out of phase with the desired signal. Inband crosstalk can be divided into coherent crosstalk and incoherent crosstalk. When the phase of the crosstalk signal is correlated with that of the main signal, it is called coherent crosstalk. When the phase of the crosstalk signal is not correlated with that of the main signal, it is called incoherent crosstalk. Crosstalk signals generated from the same source are coherent crosstalk, and crosstalk signals generated from different sources are incoherent crosstalk. Coherent crosstalk is believed not to cause noise but to cause fluctuations of signal power.

The lightpath, representing the optical layer connection between the source-destination node pairs, can be set up through the intermediate OXCs in either a wavelength-continuous (WC or VWP, virtual wavelength path) or non-wavelength-continuous (NWC or WP, wavelength path) fashion. In the WC case, the same wavelength is used over the entire lightpath whereas, in the NWC case, different wavelengths may be used in different optical links along the given path. Setting up the lightpath involves selecting not only the route to be followed but also the wavelengths to be used along the selected route [6]. Wavelength conversion at the intermediate nodes is necessary if NWC (WP) lightpaths are to be supported. This, however, requires the OXCs to perform wavelength conversion in addition to their switching functions. The OXCs may, in turn, be classified based on their wavelength conversion capability [6].

Among the proposed SSM OXC structures, this paper focuses on WSXC and WIXC. An OXC without any conversion capability is called a wavelength selective cross-connect (WSXC), whereas an OXC with full conversion capability is referred to as a wavelength interchanging cross-connect (WIXC). Examples of these are shown in Figures 3 and 4.
Here, SSM refers to the space switching matrix, used to switch the optical signals without doing any wavelength conversion. The wavelength converters required are shown separately.

Figure 3. WSXC OXC architecture

Figure 4. WIXC OXC architecture

Figure 5. Typical structure of an Optical Cross Connect

A typical structure of an OXC is shown in Figure 5; it consists of a total of N optical demultiplexers, M optical switches and N multiplexers. Each of the input fibers to an optical demultiplexer contains M different wavelengths. Each of these passes through an optical switch before being combined with the outputs from the other M-1 optical switches. Assuming the OXC is fully loaded, a signal passing through it will be interfered by M+N-2 homodyne crosstalk contributions, N-1 of which are leaked by the optical switch and M-1 by the demultiplexer/multiplexer pair [7].

Consider the signal with wavelength 1 in input fiber 1, denoted $\lambda_{11}$, as the main signal. $\lambda_{11}$ will be interfered by N-1 crosstalk contributions leaked from the N-1 signals with wavelength 1 in the other N-1 input fibers. Similarly, when each signal with wavelength 1 is demultiplexed to one path, there will be a fraction of it in each of the other M-1 outputs of the corresponding demultiplexer because of the non-ideal crosstalk specification of optical demultiplexers. These M-1 crosstalk contributions can be leaked from any signal with wavelength 1 in all the N input fibers. The number of contributions leaked from each signal is random, from 0 to M-1, depending on the cross-connecting state of the OXC.
Defining $X_1$ as the number of contributions leaked from $\lambda_{11}$ in a given state of the OXC,

$$X_1 \in [0, M-1]$$

Defining $X_j$ ($j = 2, \ldots, N$) as the number of contributions leaked from $\lambda_{j1}$ in the same state of the OXC, and taking into account the N-1 contributions leaked by optical switch 1, we have

$$X_j \in [1, M]$$

and

$$X_1 + \sum_{j=2}^{N} X_j = M + N - 2$$

The field of the main signal and all the M+N-2 crosstalk contributions can be expressed as

$$\begin{aligned}
\vec{E}(t) ={}& E\,b_s(t)\cos[\omega_s t + \Phi_s(t)]\,\vec{P}_s \\
&+ \sum_{i=1}^{X_1} \sqrt{\varepsilon}\,E\,b_s(t - \tau_i)\cos[\omega_s (t - \tau_i) + \Phi_s(t - \tau_i)]\,\vec{P}_i \\
&+ \sum_{j=2}^{N} \sum_{k=1}^{X_j} \sqrt{\varepsilon}\,E\,b_j(t - \tau_{jk})\cos[\omega_j (t - \tau_{jk}) + \Phi_j(t - \tau_{jk})]\,\vec{P}_{jk}
\end{aligned} \qquad (1)$$

where E is the signal field amplitude, which is assumed to be unchanged as the leaked power is rather low; $b_s(t)$ and $b_j(t)$ ($j = 2, \ldots, N$) are the binary data sequences with values of 0 or 1 in a bit period T of $\lambda_{11}$ and $\lambda_{j1}$, respectively; $\omega_s$, $\Phi_s(t)$ and $\omega_j$, $\Phi_j(t)$ are the center frequencies and phase noises of the lasers, respectively; $\vec{P}_s$ is the unit magnitude polarization vector of the signal; $\tau_i$, $\tau_{jk}$ and $\vec{P}_i$, $\vec{P}_{jk}$ are the propagation delay differences and unit magnitude polarization vectors of the contributions, respectively; and $\varepsilon$ is the optical power ratio of each crosstalk contribution to the signal, where for simplicity we assume all the crosstalk contributions have the same power. $\vec{P}_s$, $\vec{P}_i$ and $\vec{P}_{jk}$ are treated as time invariant here as they change rather slowly compared to the bit period.
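The counting constraints on $X_1$ and $X_j$ can be checked with a short Python sketch. It assumes the counts are supplied as a list whose first element is $X_1$ and whose remaining elements are $X_2, \ldots, X_N$; the function names are hypothetical and are introduced only for illustration.

```python
def total_contributions(m, n):
    """Crosstalk contributions for a fully loaded OXC: N-1 from the switch, M-1 from the demux/mux pair."""
    return (n - 1) + (m - 1)


def crosstalk_counts_valid(x, m, n):
    """Check one OXC state: x[0] is X_1, x[1:] are X_2..X_N."""
    if len(x) != n:
        return False
    # X_1 counts leaks from the main signal's own source: between 0 and M-1.
    if not 0 <= x[0] <= m - 1:
        return False
    # Each other fiber leaks at least once through the switch: between 1 and M.
    if any(not 1 <= xj <= m for xj in x[1:]):
        return False
    # All contributions together must add up to M + N - 2.
    return sum(x) == total_contributions(m, n)
```

For example, with M = 4 and N = 4 there are six contributions in total, and both `[3, 1, 1, 1]` (all demultiplexer leaks come from the main signal's own source) and `[0, 2, 2, 2]` (none do) satisfy the constraints.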
Now, depending on the relation between $\tau_i$, $\tau_{jk}$, $\tau_{coherent}$ and T, three cases may be considered, for which the laser relative intensity noise (RIN) takes different values [7].

### **Case 1:** If $\tau$ ($\tau_i$ and $\tau_{jk}$) $> \tau_{coherent}$:

$\Phi_s(t)$ is uncorrelated with $\Phi_s(t-\tau_i)$, and the $\Phi_j(t-\tau_{jk})$ are also uncorrelated with each other for different k. In that case the noise power can be expressed as

$$\sigma_{RIN,1}^{2} = \varepsilon \sum_{l=1}^{M+N-2} \cos^{2} \theta_{l}, \qquad \cos \theta_{l} = \vec{P}_{s} \cdot \vec{P}_{l} \qquad (2)$$

where $\theta_l$ is the polarization angle difference between the *l*th crosstalk contribution and the signal.

If $\tau$ ($\tau_i$ and $\tau_{jk}$) $< \tau_{coherent}$, two cases may arise depending on the relation between $\tau$ and T.

### **Case 2(a):** If $\tau$ ($\tau_i$ and $\tau_{jk}$) $\ll T$:

As $b_s(t - \tau_i)$ is approximately equal to $b_s(t)$ in this case, coherent crosstalk does not cause noise but causes power fluctuation. So, the noise power will be

$$\sigma_{RIN,2a}^2 = \varepsilon \sum_{j=2}^{N} \left(\sum_{k=1}^{X_j} \cos\phi_{jk} \cos\theta_{jk}\right)^2 \qquad (3)$$

### **Case 2(b):** If $\tau$ ($\tau_i$ and $\tau_{jk}$) $> T$:

As $b_s(t - \tau_i)$ becomes completely uncorrelated with $b_s(t)$ due to the unsynchronized nature of $b_s(t)$, the noise power will be

$$\sigma_{RIN,2b}^2 = \frac{1}{3} \varepsilon \sum_{i=1}^{X_1} (\cos\phi_i \cos\theta_i)^2 + \varepsilon \sum_{j=2}^{N} \left(\sum_{k=1}^{X_j} \cos\phi_{jk} \cos\theta_{jk}\right)^2 \qquad (4)$$

For the worst-case scenario with a fully loaded OXC, the crosstalk for the above cases may be expressed as

$$\sigma_{RIN,1}^{2} = \varepsilon(M+N-2) \qquad (5)$$

$$\sigma_{RIN,2a}^{2} = \varepsilon M(N-1) \qquad (6)$$

$$\sigma_{RIN,2b}^{2} = \frac{1}{3} \varepsilon M + \varepsilon M(N-1) \qquad (7)$$

#### III. EXPERIMENTS AND DISCUSSIONS

The detailed analysis of homodyne crosstalk is described in Section II. Figure 3 and Figure 4 show the WSXC and WIXC architectures.
Inband crosstalk-induced RIN due to these OXCs is given by equations 1-4 for both the coherent and incoherent cases. Case 1 represents incoherent inband crosstalk, while there are two cases for coherent inband crosstalk. Case 2a occurs when the optical propagation delay differences are much less than the duration of one bit ($\tau \ll T$), which means $b_s(t-\tau_i) \approx b_s(t)$. Case 2b represents the case when $\tau > T$ and $b_s(t-\tau_i)$ becomes completely uncorrelated with $b_s(t)$, as $b_s(t)$ is a random sequence and the contributions are not synchronized.

To observe the BER performance, we assumed the worst-case scenario and simulated equations 5-7, incorporating the homodyne crosstalk-induced RIN into these equations. To evaluate the thermal noise term $\sigma_{th}$, given by

$$\sigma_{th}^2 = \frac{4kTB_e}{R_L} \tag{8}$$

we assumed T = 300 K, k = $1.38 \times 10^{-23}$ J/K, $B_e = 10^9$ Hz and $R_L = 50\ \Omega$.

Figure 6 gives comparative plots of BER against signal power for the WSXC OXC with M = 4 wavelengths per channel, separately for case 1, case 2a and case 2b. In the plots of Figure 6(a)-(c), the number of channels N is varied as N = [4 8 16 32 64], and it is found that the BER increases significantly with increasing N. It is also evident from the curves that incoherent crosstalk results in a lower BER than coherent crosstalk.

Figure 6(a). BER performance in presence of incoherent homodyne crosstalk (case 1) for WSXC OXC with varying number of channels and with no. of wavelengths per channel = 4

Figure 6(b). BER performance in presence of coherent homodyne crosstalk for $\tau \ll T$ (case 2a) for WSXC OXC with varying number of channels and with no. of wavelengths per channel = 4

Figure 6(c). BER performance in presence of coherent homodyne crosstalk for $\tau > T$ (case 2b) for WSXC OXC with varying number of channels and with no. of wavelengths per channel = 4

Similar results have been found for the WIXC OXC in Figure 7, keeping all the parameters the same.
Here we can notice that the BER curves are shifted, which means that the power required for a specific BER differs between these architectures.

Figure 7(a). BER performance in presence of incoherent homodyne crosstalk (case 1) for WIXC OXC with varying number of channels and with no. of wavelengths per channel = 4

Figure 7(b). BER performance in presence of coherent homodyne crosstalk for $\tau \ll T$ (case 2a) for WIXC OXC with varying number of channels and with no. of wavelengths per channel = 4

Figure 7(c). BER performance in presence of coherent homodyne crosstalk for $\tau > T$ (case 2b) for WIXC OXC with varying number of channels and with no. of wavelengths per channel = 4

Figure 8 shows the plot of power penalty against the number of channels due to the incoherent homodyne crosstalk-induced RIN in a WSXC architecture shown in figure. The data for the calculation of power penalty is taken for a standard BER of $10^{-9}$. The plot shows that the power penalty increases with the number of channels. The effect of the number of wavelengths per channel on power penalty is also shown in Figure 8: the power penalty curve shifts upward as the number of wavelengths per channel increases. Figure 9 gives a similar plot for the WIXC architecture.

Figure 8. Power penalty as a function of number of channels for different numbers of wavelengths per channel for incoherent homodyne crosstalk in WSXC OXC. Power penalty is plotted to get an overall BER of $10^{-9}$. The data is obtained from Figure 6.

Figure 9. Power penalty as a function of number of channels for different numbers of wavelengths per channel for incoherent homodyne crosstalk in WIXC OXC. Power penalty is plotted to get an overall BER of $10^{-9}$. The data is obtained from Figure 7.
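For reference, the thermal noise term used in the simulations can be evaluated with the parameter values quoted in this section. The sketch below assumes that the resistance in the expression is a 50 ohm load resistance $R_L$ and that the result is a noise current variance in A²; the function and constant names are hypothetical and chosen only for this illustration.

```python
import math

BOLTZMANN_J_PER_K = 1.38e-23  # value of k quoted in the text


def thermal_noise_variance(temperature_k: float,
                           bandwidth_hz: float,
                           load_ohm: float) -> float:
    """Thermal noise variance sigma_th^2 = 4 k T B_e / R_L (assumed units: A^2)."""
    if load_ohm <= 0:
        raise ValueError("load resistance must be positive")
    return 4.0 * BOLTZMANN_J_PER_K * temperature_k * bandwidth_hz / load_ohm


# Parameters quoted in the text: T = 300 K, B_e = 1e9 Hz, R_L = 50 (assumed ohms).
sigma_th_sq = thermal_noise_variance(300.0, 1e9, 50.0)
sigma_th = math.sqrt(sigma_th_sq)
print(f"sigma_th^2 = {sigma_th_sq:.3e} A^2")  # about 3.3e-13 A^2
print(f"sigma_th   = {sigma_th:.3e} A")
```

This term is independent of M and N, so any growth of the power penalty with the number of channels in Figures 8 and 9 comes from the crosstalk-induced RIN rather than from the thermal noise.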
#### **CONCLUSIONS**

We have calculated all the power penalties considering the worst-case scenario, but this requirement may be relaxed if the probability distribution function (PDF) is known for the phase noise of the laser and the polarization angle differences.

#### ACKNOWLEDGMENT

Special thanks to Dr. Satya Prasad Majumder for his cordial cooperation. He helped us greatly in understanding the topics in the easiest way.

#### REFERENCES

- [1] S. Okamoto, A. Watanabe, and K. Sato, "Optical path cross-connect node architectures for photonic transport network," J. Lightwave Technol., vol. 14, no. 6, pp. 1410-1422, Jun. 1996.
- [2] E. Iannone and R. Sabella, "Optical path technologies: A comparison among different cross-connect architectures," J. Lightwave Technol., vol. 14, no. 10, pp. 2184-2194, Oct. 1996.
- [3] E. L. Goldstein, L. Eskildsen, and A. F. Elrefaie, "Performance implications of component crosstalk in transparent lightwave networks," IEEE Photon. Technol. Lett., vol. 6, pp. 657-660, May 1994.
- [4] C. S. Li and F. Tong, "Crosstalk and interference penalty in all optical networks using static wavelength routers," J. Lightwave Technol., vol. 14, pp. 1120-1126, Jun. 1996.
- [5] H. Takahashi, K. Oda, and H. Toba, "Impact of crosstalk in an arrayed waveguide multiplexer on N×N optical interconnection," J. Lightwave Technol., vol. 14, pp. 1097-1105, Jun. 1996.
- [6] Teck Yoong Chai, Tee Hiang Cheng, Gangxiang Shen, Sanjay K. Bose, and Chao Lu, "Design and performance of optical cross-connect architectures with converter sharing," Optical Networks Magazine, pp. 73-83, July/August 2002.
- [7] Yunfeng Shen, Kejie Lu, and Wanyi Gu, "Coherent and Incoherent Crosstalk in WDM Optical Networks," J. Lightwave Technol., vol. 17, no. 5, pp. 759-764, May 1999.

### **Distributed Shared Files Management**

#### Saurabh Malgaonkar<sup>1</sup>, Onkar Jambhale<sup>2</sup> & Manish Bhelande<sup>3</sup>

<sup>1</sup> Thadomal Shahani College of Engineering, Mumbai <sup>2</sup> Ramrao Adik Institute of Technology, NaviMumbai <sup>3</sup> Vidyalankar Institute of Technology, Mumbai

E-mail: <sup>1</sup>saurabhmalgaonkar@gmail, <sup>2</sup>onkar.ony@gmail.com, <sup>3</sup>manishbhelande@gmail.com

Abstract - File sharing is a common and basic requirement when users work in a particular domain or area of interest. Users can use software that connects to a peer-to-peer network to access shared files on the computers of other users (i.e. peers) connected to the network. Files of interest can then be downloaded directly from other users on the network. This concept is similar to a distributed file system, in which files are distributed across the network while the users have the illusion of a centralized file system, and it also avoids the high complexity and cost of implementing one.

Keywords- P2P File Access; Network Sharing; Distributed Shared File System; Large File Sharing.

#### I. INTRODUCTION

In a computer system, a file is a named object that comes into existence by explicit creation, is immune to temporary failures in the system and persists until explicitly destroyed. The two main purposes of using files are as follows:

- 1. Permanent storage of information.
- 2. Sharing of information.

A user creates many files on his machine and updates them as required. Access to those files depends on the requirements of the user. File sharing is a common and basic requirement when users work in a particular domain or area of interest. For example, in a hospital, the records of the patients are essential in every department, so they are shared accordingly on the system. Users therefore need to share the files that are necessary and to access those shared files quickly.
File sharing is thus the practice of distributing or providing access to digitally stored information such as computer programs, multimedia (audio, images, and video), documents, or electronic books. It may be implemented in a variety of ways. Common methods of storage, transmission, and distribution [1] used in file sharing include manual sharing using removable media, centralized servers on computer networks, World Wide Web-based hyperlinked documents and the use of distributed peer-to-peer networking [1][2]. Users can use software that connects to a peer-to-peer network to access shared files on the computers of other users (i.e. peers) connected to the network. Files of interest can then be downloaded directly from other users on the network. This concept is similar to a distributed file system, in which files are distributed across the network while the users have the illusion of a centralized file system, and it also avoids the high complexity and cost of implementing one.

The most common and feasible approach is to use peer-to-peer file sharing [2] for implementing a distributed shared files management system. In addition to these advantages, it will also support the following:

- 1. Remote Information Sharing: It will enable access to information that is being shared by a remote machine.
- 2. User Mobility: As the system will reflect all the files shared by the nodes present in the system, a user can access them from anywhere.
- 3. Availability: For better fault tolerance, the system's shared file entries remain available to the users even during a temporary failure of the main directory controller.

The file sharing domain is essential and inherently distributed, and thus the use of peer-to-peer technologies [2] in that domain needs further justification. Peer-to-peer file sharing is economically efficient. When a user wants to find specific information, searching for it would require a lot of human effort and time.
If the upcoming technologies are combined with the existing ones, this can help in better understanding the whole system. Thus, extending the idea of peer to peer to the file sharing environment helps to better build the whole system. To accomplish this task, a directory server is used to better organize the users' shared file information in the related domain. The main advantages of this approach are:

- 1. Scalability: It can easily accommodate more users, which makes it more scalable.
- 2. Bandwidth: It saves network bandwidth, as only the required files are transmitted among the users, and only when required.
- 3. Distributed control: There is a need for a central point, which we name the controller, that manages the shared file lists from all the clients and handles the distribution of the shared file lists as clients enter or exit the network.
- 4. Fault Tolerance: The plan is to make this file sharing scenario fault tolerant, so a replica controller will be maintained that is frequently updated from the primary controller; even if the primary controller fails, the system continues to operate.

The goal of this project is not only to achieve a distributed shared files management system that allows geographically distributed clients to share files among themselves, but also to give better performance in terms of file access. A diagrammatic interaction of the user with the system is as follows:

Figure 1. User Interaction with the System

When a user interacts with the system after joining a particular network, the user adds the file entry that needs to be shared among the clients in the network. The user mentions the category of the file and also adds its description so that the other users are aware of the contents of the file.
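A shared file entry and the controller-side list described here can be pictured with a small data structure. The following is only a minimal sketch under stated assumptions: the class names `FileEntry` and `SharedFileIndex`, the `owner` address field, and the simple substring match used for searching are all hypothetical and are not taken from the paper's implementation. Only the file entries are indexed; the files themselves stay on the sharing client.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FileEntry:
    """One shared file entry as registered by a user (hypothetical layout)."""
    name: str
    category: str
    description: str
    owner: str  # address of the client sharing the file, e.g. "host:port"


class SharedFileIndex:
    """Controller-side list of entries, grouped by the client that shares them."""

    def __init__(self) -> None:
        self._by_owner: dict[str, list[FileEntry]] = {}

    def add_entry(self, entry: FileEntry) -> None:
        self._by_owner.setdefault(entry.owner, []).append(entry)

    def remove_client(self, owner: str) -> None:
        # When a client leaves the network, all of its entries are discarded.
        self._by_owner.pop(owner, None)

    def all_entries(self) -> list[FileEntry]:
        return [e for entries in self._by_owner.values() for e in entries]

    def search(self, keyword: str) -> list[FileEntry]:
        # Assumed matching rule: case-insensitive substring over all text fields.
        kw = keyword.lower()
        return [e for e in self.all_entries()
                if kw in e.name.lower()
                or kw in e.category.lower()
                or kw in e.description.lower()]


index = SharedFileIndex()
index.add_entry(FileEntry("scan_017.png", "radiology", "Chest X-ray, ward 3", "10.0.0.5:9000"))
index.add_entry(FileEntry("rota.xlsx", "admin", "Nursing rota for May", "10.0.0.7:9000"))
hits = index.search("x-ray")
print([e.name for e in hits])
```

Keying the entries by owner makes the leave case a single dictionary removal, which is one plausible way to keep the list consistent as clients enter and exit.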
A user can also search for a particular file entry using the required parameters (name, category or description), as the users in a network can share hundreds of file entries and it is impractical to look for a particular entry manually. Once the user finds the required file entry, the user can access the file by receiving it from the client who is sharing that particular file. A common interface in each client also lists all the shared file entries of all the clients in the network, along with the search mechanism.

# II. TYPICAL P2P DISTRIBUTED FILE SHARING SYSTEM

We present a literature review which covers basic file sharing, distributed file systems, various technologies, existing communication protocols, etc. The development of file sharing systems started two to three decades ago. As the various file applications grew from hundreds to thousands to millions, so did the interest in, and the resources for, file sharing. File sharing began in 1999 with the introduction of Napster [3][4], a file sharing program and directory server that linked people who had files with those who requested files. The central index server was meant to index all of the current users and to search their computers. When someone searched for a file, the server would find all of the available copies of that file and present them to the user. The files would be transferred between the two private computers. One limitation was that only music files could be shared. After Napster shut down, new services appeared, the most popular of which was Gnutella [5]. While Napster connected users through a directory server, these new services connected users remotely to each other. These services also allowed users to download files other than music, such as movies and games.

#### A. P2P System Advantages

In [7] the key features in P2P file transfer are highlighted.
Using the P2P technique, execute nodes can share data with each other instead of fetching files only from the central manager or a file server, saving plenty of time. A typical file sharing environment follows a centralized approach, in which the files to be shared are transferred to a central server and the nodes access them from it; this has great disadvantages. If the central server fails, the entire system comes to a standstill, and when the system is operating under heavy load the central server becomes a bottleneck. Also, the server does not reflect updated file entries, so when a user updates a file, it needs to be uploaded again to the central server. So this paper suggests peer-to-peer sharing, in which all the major disadvantages of a centralized sharing system have been taken care of.

#### B. A Basic P2P File Sharing System

In [8] a reliable and simple P2P file sharing system is described which avoids unnecessary data redundancy and connectivity issues among peers by maintaining an adapter which optimizes the working of the entire file sharing system. This approach also makes it highly scalable. Since a file resides at the local node and is shared only when required, there is no need for a separate update policy; the shared file itself always reflects the latest version of the file as the user updates it. The adapter maintains a list of all the files that are shared by the users, updates it accordingly and then sends it to all the users. Once the users have imported this list, they can perform file sharing operations among themselves.

#### C. Reliability and fault tolerance

The disadvantages of a client-server file system, which does not scale with respect to the number of users and exhibits a single point of failure, are further highlighted in [6]. So the focus is more on the distributed peer-to-peer aspect rather than a centralized one.
The important aspect of this paper is fault tolerance achieved by replicating data, so that the data remains available even in case of temporary failures. The other aspect to be considered is user mobility.

#### III. DSFM DESIGN

It is necessary to keep the system in a constant flow and achieve the targeted goals of the proposed system at the same time. So, after carefully studying the literature [6][7][8] and highlighting its key features and drawbacks, the overall workflow of the proposed system will be as follows:

Figure 2. Overall System Workflow

The three most important components of the system are:

- 1. Client: The client system will allow the clients to share the files that are required, as well as provide an interface that will allow the client to access the files shared by all the nodes in the system (a centralized view of shared files). When a client joins the network, its shared files will be added to the system, and when the client leaves the network, all its shared file entries are discarded from the system. After a client receives the updated shared files list, it can access the files from the respective clients.
- 2. Controller: The main controller will store the address and shared files information of the client peers and will be responsible for distributing them to all the clients in the network. Its basic task is just to index the file entries from all the clients and distribute them accordingly. Also, when a client joins or leaves the network, it will update its shared file entries accordingly and inform the remaining clients. So through the controller we will be able to achieve the scenario of overall file sharing.
- 3. Replica Controller: The project plans to replicate the main controller, so that even when the main module fails temporarily the system does not come to a standstill; the requests can be handled by the replica controller and the system continues to operate. So our system will be fault tolerant.

#### A. General Basic Functionality

The following diagram denotes the basic interaction among the modules and the functionality of each module. This is the normal scenario, in which the system is working with a fully functional primary controller.

Figure 3. General Flow Chart

#### B. File Sharing Process

The following diagram illustrates the scenario that achieves the basic file sharing process among the various clients in the network.

Figure 4. Basic File Sharing Process

#### C. Fault Tolerance Scenario

The following diagram illustrates how the clients detect a temporary lack of response or failure of the primary controller and redirect to the replica controller.

Figure 5. Handling Primary Controller Failure

#### REFERENCES

- [1] Kenneth P. Birman (2005), "Reliable distributed systems".
- [2] Rüdiger Schollmeier, "A Definition of Peer-to-Peer Networking for the Classification of Peer-to-Peer Architectures and Applications", Proceedings of the First International Conference on Peer-to-Peer Computing IEEE, pp. 149-160, 2002.
- [3] Menta, Richard, "Napster Clones Crush Napster. Take 6 out of the Top 10 Downloads CNet", MP3Newswire, http://www.mp3newswire.net/stories/2001/topclones.html, July 20, 2011.
- [4] Ante, Spencer, "Inside Napster". Business Week. Retrieved 2011-04-10, http://www.businessweek.com/2000/00\_33/b3694 001.htm, June 7, 2011.
- [5] "Gnutella File Sharing", http://slashdot.org/story/00/03/14/0949234/Open-Source-Napster-Gnutella, 15 August 2011.
- [6] Sunil Chakravarthy and Chittranjan Hota, "Secure Resilient High Performance File System for Distributed Systems", IEEE International Conference on Computer & Communication Technology, pp. 87-92, 2010.
- [7] Gao Ying, Liu Guan Yao and Huang JianCong, "A New Method of File Transfer In Computational Grid Using P2P Technique", IEEE International Conference on Network Computing and Information Security, pp. 332-336, 2011.
- [8] Rui Zhao, Ruhua Liu and Guangxuan Fu, "P2P File Sharing Software in IPv4/IPv6", IEEE International Conference on Software and networks, pp. 367-370, 2011.

## Location Cache Design Using Way Prediction Method for Chip Multiprocessors

#### Jyoti Guttedar, G.Jyothi & M.Z.Kurian

Department of EC, Sri Siddhartha Institute of Technology, Sri Siddhartha University, Tumkur, Karnataka, India

Email: jyotiec23@gmail.com, grandhejyothi@gmail.com, mzkurianvc@yahoo.com

Abstract - Recent research at Intel suggests that chips with hundreds of processor cores are possible in the not-so-distant future. As the number of cores grows, so does the size of the cache systems required to allow them to operate efficiently. Caches have grown to consume a significant percentage of the power utilized by a processor. In this research, we extend the concept of location cache to support chip multiprocessor (CMP) systems in combination with low-power L2 caches.

Keywords: Cache architecture, dynamic and leakage power dissipation, location cache, low-power design, power analysis.

#### I. INTRODUCTION

In recent years, microprocessor companies have had difficulty in increasing the performance of CPUs by simply increasing their clock frequencies. Research has moved to parallelism in an effort to maintain performance increases [1], [2]. It is now increasingly common for multiple processor cores to be included on a single silicon die, creating what is called a chip multiprocessor (CMP). A CMP typically contains multiple cores operating at the same clock frequency, and those cores tend to share at least part of their cache system on the chip. For example, an Intel Xeon MP processor contains a pair of processor cores, and includes a cache system consisting of private L1 and L2 caches in addition to a large L3 cache that is shared between the cores [3]. As cache systems have grown in size to satisfy the additional needs of these CMP systems, so has the amount of power they consume.
Several techniques, such as subbanking [4], [5], bitline segmentation [4], and phased cache [5]–[7], are commonly used to reduce the amount of dynamic power used by a cache. In addition, the increased power consumption may also lead to thermal issues on the chip, and design must proceed carefully in order to eliminate potentially damaging hot spots. Several different techniques, such as gated-Vdd, drowsy caches, and DRG-caches, have been presented to reduce sub-threshold leakage power. A location cache is a small direct-mapped cache that stores information relating an address to its location in the target cache [8]. This capability can save dynamic power upon cache reads and writes. More importantly, this behavior can be exploited in combination with low-leakage techniques to save a significant amount of leakage power.

#### II. BACKGROUND

This section discusses the working principle and design of the location cache concept [8] for a single-core processor. Here, we assume that L2 is the highest-level cache.

#### A. Structure of Location Cache

The location cache shown in Fig. 1 is a small direct-mapped cache that uses address affinity information to provide accurate location information for the L2 cache [8]. The proposed location cache technique reduces the L2 cache power consumption when compared with a conventional set-associative L2 cache. Depending on the L2 cache architecture, a location cache can be physically addressed or virtually addressed. Fig. 1 illustrates a revised L2 cache system architecture with a location cache, which is physically addressed. In this physically addressed cache system, the location cache is physically addressed as well. It caches the access way location information of the L2 cache (the way number in one set where a memory reference falls). This cache works in parallel with the L1 cache.
As a location cache tries to cache the L2 location information, the block address (composed of the index address and the tag address) of the location cache should be of the same length as that of the L2 cache.

Fig. 1. Physically addressed location cache architecture

#### B. Working Principle of Location Cache

The proposed cache system works in the following way. The location cache is accessed in parallel with the L1 cache. If the L1 cache sees a hit, then the result obtained from the location cache is discarded. If there is a miss in the L1 cache and a hit in the location cache, the L2 cache is accessed as a direct-mapped cache and the access power of the L2 cache will be greatly reduced. When both the L1 cache and the location cache see a miss, the L2 cache is accessed as a conventional set-associative cache and the content (i.e., the new way information) of the location cache is updated. When the location cache stores the location (way) information of the L2 cache, it uses the same block address as the L2 cache, instead of that of the L1 cache. As opposed to way-prediction methods, the cached location is not a prediction. Even if there is a location cache miss, we do not see any extra delay penalty as seen in way-prediction caches. Normally, the block size of the L2 cache is larger than that of the L1 cache; for instance, in the Intel Itanium 2, the L1 block size is 64 bytes while the L2 block size is 128 bytes. Due to this difference in L1 and L2 block sizes, the location cache can still catch many references which are L1 misses but location cache hits.

#### III. LOCATION CACHES ON CMP SYSTEMS

Previous works utilizing location caches have been limited to single-processor systems [8]. With multicore chips becoming increasingly prevalent, the concept of location cache needs to be adapted to these new types of systems. The following discussion describes two approaches to creating location caches capable of functioning within CMP systems.

#### A. Shared Location Caches in CMP Systems

The most straightforward approach to adding a location cache to a CMP system involves sharing. Multicore processors commonly share an L2 or L3 cache among all of the cores. Similarly, it is possible to create a single location cache capable of being accessed by each of the cores in the system if those cores share a cache at some level. For example, if all four cores share a single L2 cache, these cores can be served by a single location cache. If those four cores instead share a pair of L2 caches, a shared location cache approach can be implemented in two different ways: 1) cores sharing the same L2 cache also share the same location cache, as shown in Fig. 2, and 2) all cores share only a single location cache.

Fig. 2. CMP cache system using shared location caches.

In the case where the highest-level cache is L2, the location cache operates on every access initiated by every processor core it serves. The source of the access is completely disregarded, and only the transaction's address is taken into consideration. We focus only on the cache system shown in Fig. 2, with four cores sharing two location caches, since this architecture has been applied by commercial multi-core processors. Though this architecture is in fact a semi-shared location cache system, for each L2 cache it is a purely shared location cache system. Having all cores share a single location cache is not practical, because this architecture suffers from location cache line replacement problems, as will be discussed later. That is, too many cores are fighting for a limited number of location cache lines, and this makes the single location cache almost useless. Further, the (single) location cache has to store extra information, such as which L2 cache holds a specific piece of data.

Assume Core 0 initiates a memory access. Its L1 cache and shared location cache LCache0 parse the tag and index information and check for matches.
If the L1 cache hits, the result of the location cache access is ignored and L1 returns the requested data to the core. If the L1 cache misses, the result of the location cache access determines how to proceed. If the location cache also hits, the way information stored in the location cache is used to access the L2 cache as if it were direct-mapped. If the location cache hits, a hit in L2 is guaranteed. If the location cache misses, L2 is accessed in its normal set-associative manner and the new way information is provided to the location cache for future use. Note, however, that a miss in the location cache would not necessarily indicate a miss in the L2 cache. Now assume Core 1 attempts to access the same memory address, and it is not found in its L1 cache. The way information for this address was previously stored in the location cache by Core 0, and can now be used to access L2 as a direct-mapped cache.

Let us consider another example. Assume that the L1 caches of both Core 0 and Core 1 have cached the same line of data. Core 0 now performs a write to this address, changing the data. At this point the MESI protocol triggers the L1 of Core 1 to change the corresponding line's state from Shared to Invalid, and the newly updated line in Core 0's L1 moves from the Shared to the Modified state. Note that the corresponding line in the location cache does not need to be removed or modified, as it still points to the correct location in L2. If Core 1 now tries to access this address again, it will find that its own copy in L1 is marked Invalid. It can now use the shared location cache, which still knows the location of the line in L2, to access L2 as if it were a direct-mapped cache. In this case a location cache can be very useful for programs that require multiple cores to access the same memory location. Other than the simplicity of implementation, the other advantage of this configuration is that it lacks any coherency issues.
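The access flow described above for a shared location cache can be sketched as a small behavioural model. This is only an illustration under simplifying assumptions: the class and function names are hypothetical, L1 is reduced to a hit/miss flag, L2 is reduced to a dictionary from block address to way number for resident lines, and L2 evictions and fills are not modelled.

```python
from typing import Optional


class LocationCache:
    """Small direct-mapped cache mapping an L2 block address to its L2 way."""

    def __init__(self, num_entries: int) -> None:
        self.num_entries = num_entries
        self.lines: list[Optional[tuple[int, int]]] = [None] * num_entries

    def lookup(self, block_addr: int) -> Optional[int]:
        line = self.lines[block_addr % self.num_entries]
        if line is not None and line[0] == block_addr:
            return line[1]  # stored way number
        return None

    def update(self, block_addr: int, way: int) -> None:
        self.lines[block_addr % self.num_entries] = (block_addr, way)


def access(block_addr: int, l1_hit: bool,
           lcache: LocationCache, l2_way_of: dict[int, int]) -> str:
    """Return which kind of access serves the request."""
    cached_way = lcache.lookup(block_addr)  # probed in parallel with L1
    if l1_hit:
        return "L1"  # location cache result is discarded
    if cached_way is not None:
        return f"L2 direct-mapped, way {cached_way}"
    way = l2_way_of.get(block_addr)  # normal set-associative search of L2
    if way is not None:
        lcache.update(block_addr, way)  # remember the way for future accesses
    return "L2 set-associative"


# Core 0 and Core 1 share one location cache and one L2 (assumed contents).
l2_way_of = {0x40: 3}
shared = LocationCache(num_entries=16)
core0_first = access(0x40, l1_hit=False, lcache=shared, l2_way_of=l2_way_of)
core1_next = access(0x40, l1_hit=False, lcache=shared, l2_way_of=l2_way_of)
print(core0_first)
print(core1_next)
```

In this model the second core benefits from the way information recorded during the first core's access, which is the sharing effect the text describes.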
Since all of the memory transactions sharing the same L2 cache pass through the same location cache, no additional implementation changes are required to keep the location cache and its target cache coherent. As a matter of fact, the location cache associated with each L2 cache just stores whether a specific piece of data is in L2 and, if so, its way number. The coherency mechanisms implemented in L1 and L2 still serve their purposes. Thus, the function of the location cache and the coherency protocols implemented in L1 and L2 are orthogonal. On the downside, with multiple cores expecting complete access to a location cache, the location cache will need a read and write port for every core supported. Simultaneous accesses to the same line in the location cache will increase latency and reduce the efficiency of the location cache itself. Resolving these issues will increase both the complexity and the power consumption of the location cache.

#### **B.** Private Location Caches in CMP Systems

A cache system utilizing a private location cache for each core is shown in Fig. 3.

Fig. 3. CMP cache system using private location caches.

When used in a CMP system, the simplistic location cache design is prone to incoherency when multiple location caches assist a single target cache (e.g., L2). For example, if Core 0 writes a value into its L2 cache, the way information is then stored in Core 0's location cache (LCache0). Later, if Core 1 writes a value into the L2 cache it shares with Core 0 and evicts this previous entry from L2, Core 0's location cache now points to data that is no longer present. Due to this possibility, it is necessary to extend the concept of location cache presented above for a CMP system. Here the location cache design is extended with a simplified version of the MESI protocol. This modification can be performed by adding only a single additional bit to each line in the location cache.
This bit will determine whether the location cache line is in a Shared or Invalid state as shown in Fig. 4. A line in a location cache is marked Invalid if the line it references is no longer present in the target cache or if the line has not yet been written to. In all other cases, the line is marked Shared.

Fig. 4. Location cache coherency-state transition.

When a location cache lookup is performed, in order for a location cache hit to occur, the following two conditions must be satisfied:

1) requested line must be present in the location cache;
2) requested line must be marked Shared.

This alteration does not come without a cost. In order to perform this operation, additional care needs to be taken when lines are evicted from the target cache. When such an eviction occurs in the target cache, the tag and index portion of this reference is passed to each of the connected location caches. The location caches then check if they contain lines matching the newly-evicted entry. If the evicted transaction address is not present in other location caches, no further operation is performed. However, if the evicted transaction address is present in the other location caches, the lines in the location caches are marked Invalid by transition "LOC\_INV" (i.e., location cache invalid) as shown in Fig. 4. This will prevent future use of these location cache lines, which will be overwritten at the next opportunity.

The operation of location caches here is a little different from that of the shared case. When the way information is stored for a transaction initiated by Core 0, for example, it is stored in Core 0's private location cache. This location cache, LCache0, cannot be read from or written to by Core 1. This alleviates the problem where each core is constantly overwriting each other's cache information, and results in increased location cache hit rates. However, let us say Core 0 has way information for a transaction stored in its private location cache.
Now Core 1 performs an access that ultimately results in the line pointed to by Core 0's location cache being evicted from L2. Core 0's location cache now points to an address that no longer exists in L2. Here our coherency protocol requires L2 to transmit the address of the evicted L2 line to each of its connected location caches. If that address is present in any line of a location cache, that line is marked Invalid. This ensures that the private location caches remain coherent.

It should be noted that a location cache line is updated and changed to the Shared state only when a message from the L2 cache is received back at the L1 cache. The way information of the particular block in the L2 cache is sent along with the message from the L2 cache to the L1 cache, and the update occurs whenever the L1 cache receives a data message from the L2 cache.

#### IV. CONCLUSION AND FUTURE WORK

This paper analyzed the power savings realized by utilizing location caches in a CMP system. The working principle of the location cache for a single-core processor was reviewed, and extensions to this principle were proposed to allow location caches to support a CMP system. The amount of power saved by adding location caches varies significantly with the tested parameters. The tested location caches saved power over all tested configurations and benchmarks, though they were far more effective at reducing leakage power than dynamic power. The number of entries in each location cache had a surprisingly small effect on the overall power savings, with only about six of the benchmarks showing sensitivity to this parameter.
Looking towards possible future work, we suggest the following:

- 1) extension of this work to a cache system connected by a network-on-chip (NoC);
- 2) learning the effect of OS techniques on the power savings provided by location caches in CMP systems;
- 3) extension of the current model to efficiently serve CMPs with dozens or even hundreds of cores;
- 4) extension of location caches to deal with exclusive caches; and
- 5) identifying the reasons why some benchmarks, such as Radix (Ocean), seem to benefit the most (least) from location caches.

#### V. ACKNOWLEDGEMENT

I gratefully acknowledge my project guide, G. Jyothi, Lecturer, Department of E&C, S.S.I.T., for her guidance, constant encouragement and wholehearted support. My special gratitude goes to Dr. M. Z. Kurian, HOD, Department of EC, S.S.I.T., for his guidance, constant encouragement and wholehearted support in making this project a success.

#### REFERENCES

- [1] S. Vangal *et al.*, "An 80-Tile 1.28TFLOPS network-on-chip in 65nm CMOS," in *Proc. IEEE Int. Solid-State Circuits Conf.*, 2007, pp. 98–589.
- [2] R. Varada, S. Tam, J. Benoit, and K. Chou, "SOC design challenges in a multi-threaded 65 nm dual-core Xeon MP processor," in *Proc. Int. SOC Conf.*, 2006, pp. 217–220.
- [3] S. Rusu *et al.*, "A 65-nm dual-core multithreaded Xeon processor with 16-MB L3 cache," *IEEE J. Solid-State Circuits*, vol. 42, no. 1, pp. 17–25, Jan. 2007.
- [4] K. Ghose and M. B. Kamble, "Reducing power in superscalar processor caches using subbanking, multiple line buffers and bit-line segmentation," in *Proc. Int. Symp. Low Power Electron. Des.*, 1999, pp. 70–75.
- [5] C. Su and A. Despain, "Cache design tradeoffs for power and performance optimization: A case study," in *Proc. Int. Symp. Low Power Electron. Des.*, 1997, pp. 63–68.
- [6] T. Lyon, E. Delano, C. McNairy, and D. Mulla, "Data cache design considerations for the Itanium 2 processor," in *Proc. IEEE Int. Conf. Comput. Des.: VLSI Comput. Process. (ICCD)*, 2002, pp. 356–362.
- [7] A. Hasegawa, I. Kawasaki, K. Yamada, S. Yoshioka, S. Kawasaki, and P. Biswas, "SH3: High code density, low power," *IEEE Micro*, vol. 15, no. 6, pp. 11–19, Dec. 1995.
- [8] M. Rui, W. B. Jone, and Y. Hu, "Location cache: A low-power L2 cache system," in *Proc. ACM/IEEE Int. Symp. Low Power Electron. Des.*, 2004, pp. 120–125.

## Design and Implementation of Secure Low Cost Novel Remote Metering System Using GPRS

#### Nagaraj A. Hanchinamani & Chandrakala. V

Dr. Ambedkar Institute of Technology, Bangalore – 560056

E-mail: nagarajhanchinmani@gmail.com, v chandu9@yahoo.co.in

Abstract - Automatic Meter Reading System (AMRS) technology can be taken a step further by using a GPRS/EDGE modem. An AMRS based on a GPRS modem offers many advantages over other remote metering techniques: it uses the TCP/IP and PPP protocols for communication, and GPRS, as an extension of GSM, provides higher data transmission speed and more security. The benefits of such a system include robust data transmission, wide-area network coverage, lower power consumption, cost-effectiveness and reliability. An interface system (IS) has been developed that can be plugged into an existing digital energy meter; a new meter can also be built around this IS. The proposed IS collects meter data and sends it to the server through a GPRS modem over existing GPRS networks. Each IS is capable of two-way communication. This paper describes the detailed hardware design of the interface system and the step-by-step implementation of the TCP/IP, PPP and USB protocols. It also demonstrates data transmission to and reception from the server in a real-world setting.

Keywords: AMR, GSM, GPRS, IS, TCP/IP.

#### I. INTRODUCTION

Electricity is the source of power behind the development of any country.
Because of the increasing number of power consumers in every sector (residential, commercial and industrial) and the scarcity of fossil fuel, it is essential to ensure proper use of energy, to generate correct bills and invoices, and to reduce corruption. Conventionally, meter data is collected manually by an assigned person, which invites problems such as human error and corruption. The Automatic Meter Reading System (AMRS) has completely changed the process of collecting meter readings.

There are two types of AMR system: wire-based and wireless. Power Line Carrier (PLC) and telephone line networks are wire-based AMR systems. The problems of wire-based AMR systems are transmission distance, transmission cost, maintenance and the security of data transmission. GSM, GPRS, WiFi and WiMax are typical wireless AMR systems. Wireless AMR systems provide higher data collection speed and greater efficiency. As there is no human intervention in the entire process, the human error and corruption associated with manual reading are avoided. In addition, meter readings can be collected at any desired interval, such as hourly, daily, weekly or monthly. Moreover, the electricity supplier can use the networks of wireless communication companies for remote monitoring and for providing information to customers anytime and anywhere. Retail providers will also be able to offer innovative new products in addition to customized packages for their customers.

In an automatic meter reading system, it is essential to develop a proper networking mechanism in which data transfer is fast, secure and cost-effective. Many ongoing research efforts address the implementation of electronic meters and the use of existing telecommunication systems to transmit meter readings automatically in a fast, secure and accurate manner.
Communication networks such as the Internet and GSM/GPRS networks provide a useful means of communication because of their good area coverage and cost-effectiveness. GPRS also supports leading communication protocols such as IP and X.25 and, most importantly, GPRS is a step on the path to 3G. If SMS over GPRS is used, a transmission speed of about 30 SMS messages per minute may be achieved. This is much faster than ordinary SMS over GSM, whose transmission speed is about 6 to 10 SMS messages per minute. AMRS can be taken a step further by using a USB GPRS/EDGE modem. A USB modem uses the TCP/IP and PPP protocols for communication, and TCP is a connection-oriented protocol that provides reliable data transmission. Interfacing a USB modem rather than other modem types can also improve the reliability and speed of communication.

#### II. AMRS-SYSTEM OVERVIEW

In this project, an interface system (IS) has been developed that communicates with the digital meter and a USB GPRS/EDGE modem. The IS generates the meter reading from the meter pulses and sends that reading to the server database. Each IS is capable of two-way communication, so the meter can be controlled from the server for specific purposes. The billing office should have a highly secure database system through which only authorized staff members of the electricity supply company are able to read and print electricity bills [9]. Other systems may be connected to the server for further processing, such as online billing, security code provision, alarm systems and tamper detection [3], [10]. The system architecture of the automatic meter reading system is shown as a block diagram in Figure 1.

Figure 1: Block Diagram of Automatic Meter Reading System (AMRS)

#### A. Interface System

The interface system consists of an MCT2 optocoupler, a microcontroller (MCU), a modem and a power regulation unit.

#### • MCT2 Optocoupler

Interfacing with the meter can be done in many ways: by accessing the meter's MCU and capturing the data, by decoding the display output, by counting the meter's pulses, or by placing a parallel meter with an embedded AMR system alongside the actual meter. Counting pulses and using the count to generate meter data is an effective solution. So in the IS, a light-dependent resistor is used to sense the blinking of the meter's LED. Generally, two basic sensor circuits can be developed: the first is activated by darkness, the second by light [11]. In this project, a light-activated sensor circuit has been developed for several reasons: to reduce interference between the circuits, to separate and amplify the signal simultaneously, and to provide high-voltage isolation.

#### • The Core – Microcontroller Unit

The main function of the microcontroller unit (MCU) is to control communication among the remote unit, the modem and the other components. Selecting a suitable MCU for this project requires attention to program memory size and type, speed, connectivity, USB On-The-Go (OTG) compatibility, analog-to-digital converter (ADC) features, USART support and so on. The PIC24 and PIC32 series MCUs by Microchip have been chosen because they satisfy the requirements of the project, such as low power consumption, a two-wire communication port, a full-duplex UART and an ADC.

#### • Interfacing with Modem

How a USB device is interfaced with the microcontroller depends on the microcontroller's operating mode: host mode or device mode. In device mode, two further cases arise: the microcontroller is either a bus-powered device or a self-powered device.
If the microcontroller is in host mode, it has to run from either an external 3 V to 3.6 V power supply or an external 5 V power supply.

#### • Power regulation unit

For any microcontroller to function properly, it needs a stable supply, a reliable reset at power-on and an oscillator. According to the manufacturer's technical specifications for the PIC microcontroller, the supply voltage should lie between 2.0 V and 6.0 V in all versions. The simplest supply solution is the LM7805 voltage regulator, which provides a stable +5 V at its output. The power regulation unit also powers the other parts of the circuitry.

#### B. Counting the pulse of energy meter

The ADC module of the MCU is used to count the meter pulses. The output of the optocoupler unit goes to an ADC input channel of the MCU, which performs an analog-to-digital conversion on this input. The ADCH and ADCL registers hold the conversion result. After conversion, the meter pulses are counted to compute the amount of power usage.

Figure 2: Implementation and testing of interface system

#### III. NETWORK MANAGEMENT AND CONNECTIVITY

Today the Internet is used in embedded systems to control and monitor equipment. Several protocols are used for this purpose, such as HTTP, PPP and TCP/IP. Because a small 8-bit MCU has very little memory, the conventional implementations of these protocols are not suitable for embedded systems. In general, microcontrollers have 128 kB to 256 kB of memory. Therefore, only the minimum functionality required to establish each protocol is implemented on the MCU. Typical embedded IP stacks range from 14 kB up to and exceeding 500 kB [18]. However, several protocols and tasks must be considered for connection establishment.

#### A. Modem initialization

Each modem has its own initialization parameters, set through Hayes AT commands that must be sent to the modem.
The application simply requests the modem to dial the server using Hayes AT commands [19]. The initialization step checks the connection between the modem and the MCU, as shown in Figure 3.

Figure 3: Flow chart for Modem initialization

#### **B.** Activate GPRS Connection

Each mobile network has its own parameters that must be set on the modem before connecting to its GPRS network. These parameters are the Access Point Name (APN) and the access number. The AT+CGDCONT command sets the APN and the Packet Data Protocol (PDP) type (IP) on the modem [20]. After that, the ATDT\*99# command is sent to connect to the network, as shown in Figure 4.

Figure 4: Flow chart for GPRS activation

Figure 5: Flow chart for Meter Reading Terminal

#### C. Establish Point to Point Protocol (PPP)

PPP is a suite of protocols, each of which is negotiated independently. Because PPP must carry these protocols over point-to-point links, it uses a special frame structure to encapsulate PPP packets. The International Organization for Standardization (ISO) has defined the High-Level Data Link Control (HDLC) frame structure in ISO 3309 [21], and a summary of the PPP HDLC framing can be found in RFC 1662 [22]. To create a PPP connection on an embedded system, at least the Link Control Protocol (LCP), the Internet Protocol Control Protocol (IPCP) and a user authentication protocol, either the Password Authentication Protocol (PAP) or the Challenge-Handshake Authentication Protocol (CHAP), have to be negotiated [23].

#### LCP

PPP first sends LCP packets to configure and test the data link. LCP works by exchanging requests, acknowledgements and negative acknowledgements to negotiate the options defined within the packet. The available LCP configuration options are listed in RFC 1661. When both sides of the link have agreed to their peer's configuration, the basic link is established.
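
The HDLC-like framing that carries LCP and every other PPP packet can be illustrated with a short C sketch. It is not the authors' firmware: the function and macro names (`ppp_build_frame`, `fcs16_update`, `PPP_PROTO_LCP`) are hypothetical, and the sketch assumes the default RFC 1662 settings, namely address 0xFF, control 0x03, a 16-bit FCS, and escaping of every octet below 0x20 in addition to the flag and escape octets.

```c
#include <stddef.h>
#include <stdint.h>

#define PPP_FLAG      0x7Eu    /* frame delimiter */
#define PPP_ESCAPE    0x7Du    /* control-escape octet */
#define PPP_PROTO_LCP 0xC021u  /* PPP protocol number of LCP */

/* Fold one octet into the running 16-bit FCS (RFC 1662), bit by bit. */
static uint16_t fcs16_update(uint16_t fcs, uint8_t octet)
{
    fcs ^= octet;
    for (int bit = 0; bit < 8; bit++) {
        fcs = (fcs & 1u) ? (uint16_t)((fcs >> 1) ^ 0x8408u)
                         : (uint16_t)(fcs >> 1);
    }
    return fcs;
}

/* Append one octet, escaping the flag, the escape and control characters. */
static size_t put_escaped(uint8_t *out, size_t pos, uint8_t octet)
{
    if (octet == PPP_FLAG || octet == PPP_ESCAPE || octet < 0x20u) {
        out[pos++] = PPP_ESCAPE;
        out[pos++] = (uint8_t)(octet ^ 0x20u);
    } else {
        out[pos++] = octet;
    }
    return pos;
}

/*
 * Build one PPP frame around `info` (for example an LCP packet).
 * `out` must hold at least 2 * (len + 6) + 2 octets (worst case: every
 * octet escaped). Returns the number of octets written.
 */
size_t ppp_build_frame(uint16_t protocol, const uint8_t *info, size_t len,
                       uint8_t *out)
{
    const uint8_t header[4] = {
        0xFFu, 0x03u,                       /* address, control */
        (uint8_t)(protocol >> 8), (uint8_t)(protocol & 0xFFu)
    };
    uint16_t fcs = 0xFFFFu;                 /* initial FCS value */
    size_t pos = 0;

    out[pos++] = PPP_FLAG;
    for (size_t i = 0; i < sizeof header; i++) {
        fcs = fcs16_update(fcs, header[i]);
        pos = put_escaped(out, pos, header[i]);
    }
    for (size_t i = 0; i < len; i++) {
        fcs = fcs16_update(fcs, info[i]);
        pos = put_escaped(out, pos, info[i]);
    }
    fcs = (uint16_t)(fcs ^ 0xFFFFu);        /* ones' complement before sending */
    pos = put_escaped(out, pos, (uint8_t)(fcs & 0xFFu));  /* low octet first */
    pos = put_escaped(out, pos, (uint8_t)(fcs >> 8));
    out[pos++] = PPP_FLAG;
    return pos;
}
```

With an option-free LCP Configure-Request as `info` (code 0x01, identifier 0x01, length 0x0004), the frame would begin with the octets 7E FF 7D 23 C0 21, because the control octet 0x03 is below 0x20 and is therefore sent escaped. A memory-constrained MCU might prefer the table-driven FCS given in RFC 1662 for speed, at the cost of a 512-byte lookup table.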
#### PAP

The authentication process starts after the initial link has been established through LCP. PPP authentication is described in detail in RFC 1334 [24]. The user ID and password are sent repeatedly to the server until the server acknowledges them. However, the authentication method depends on the ISP; if the ISP requires authentication, the username and password must be sent.

#### IPCP

With IPCP, the local IP address, primary and secondary DNS addresses, gateway and netmask configuration are negotiated and established. IPCP is described in detail in RFC 1332 [25]. A number of configuration options can be negotiated; as a minimum, at least three configuration options should be considered in order to request the remote IP and DNS addresses from the server.

#### D. TCP/IP Stack

TCP is a reliable, connection-oriented protocol. It sends data using segmentation, checksum calculation, addressing and flow control. A TCP/IP stack typically consists of the IP, UDP and TCP protocols [26-28]. The TCP/IP stacks implemented for embedded processors use a simplified model of the traditional stack to reduce code size and memory use. For embedded applications, a single global buffer is used, into which the device driver places each incoming packet. The buffer can hold one packet of the maximum size defined for it. On reception, when a packet arrives from the server, the device driver puts it in the buffer and calls the TCP/IP stack; if the packet contains data, the stack notifies the corresponding application. On transmission, the application's data first goes into the buffer; the TCP/IP stack then calculates the checksums, fills in the necessary header fields and finally sends the packet to the server.

#### E. HTTP

The Hypertext Transfer Protocol (HTTP) is a well-known protocol used for communication between servers and web browsers. HTTP/1.1 is described in RFC 2616 [29]. This application-layer protocol allows the user to send various types of requests to a server. HTTP commands are called methods; the most basic are the GET method, used to fetch data, and the POST method, used to post data. In embedded firmware, the GET or POST method can be used to send data to or receive data from the server.

#### IV. THE BILLING SERVER

The collected power consumption readings are sent to the central billing server, where they are stored. Many commercial servers and management software packages are available on the market, but they are very expensive. To decrease the cost of the proposed AMR system, in-house software was developed using Java technology and is used to control the central server. The implemented meter data management system has the following functions:

- a) Remote metering: The meter reading is sent automatically to the server, and customers can remotely check their consumption at any time.
- b) Bill issuing: The billing system provides a monthly bill for customers who do not access the server remotely.
- c) Customer tracking: The billing system should include better customer tracking, bill forwarding, identification of customers' financial account information, and use of monetary deposits for account closing requirements.
- d) Different tariffs for different customers: Houses, schools and factories are treated differently, and each bill should be calculated according to the tariff assigned by the electricity authority in Karnataka.

#### V. APPLICATION

- Automatic per day bill generation and monitoring of meter.
- Water meter control and monitoring system.
- Industrial electrical energy conservation.
- Distribution and maintenance in the supply sector.

#### VI. CONCLUSION

In this paper, a remote metering system based on a USB GPRS modem has been discussed. In keeping with present technology, USB communication with the energy meter enables more data to be transferred at higher speed between the remote meter and the server. The hardware and firmware issues involved in building an interface system that communicates with an energy meter have been discussed. Existing meters on the market can use this interface system to transfer meter readings, and new meters can also be developed around it. Moreover, with the necessary modifications, this implementation can be used with other metering equipment, such as water and gas meters, as well as with any data acquisition system, to transfer data to a server.

#### REFERENCES

- [1] M. Y. Nayan and A. H. Primacanta, "Hybrid System Automatic Meter Reading," 2009.
- [2] S. W. Lee, C. S. Wu, M. S. Chiou, and K. T. Wu, "Design of an automatic meter reading system," 1996, pp. 631–636.
- [3] P. Oksa, M. Soini, J. Nummela, L. Sydänheimo, and M. Kivikoski, "Reliability and usability in data transmission networks of the AMR system: a pilot study," 2005, pp. 388–393.
- [4] Wikipedia, "AMR," October 08, 2010; http://en.wikipedia.org/wiki/Automatic\_meter\_reading.
- [5] R. A. Fischer, N. Schulz, and G. H. Anderson, "Information management for an automated meter reading system," 2000, pp. 150–154.
- [6] P. Bharath, N. Ananth, S. Vijetha, and K. V. J. Prakash, "Wireless automated digital energy meter," 2009, pp. 564–567.
- [7] U. C. Technology, "General Packet Radio Service."
- [8] gsmfavorites, "GSM/GPRS Modems and GSM/GPRS".
- [9] T. Jamil, "Design and Implementation of a Wireless Automatic Meter Reading System," 2008.
- [10] A. Minosi, A. Martinola, S. Mankan, F. Balzarini, A. N. Kostadinov, and A. Prevostini, "Intelligent, Low-power and low-cost.
- [11] V. Ryan, "Light Dependent Resistors," October 08, 2010; http://www.technologystudent.com.

## **Encryption And Decryption of Message Using Android Cell**

#### Tanuja & M.Z Kurian

Dept of EC, SSIT, Tumkur

E-mail: tanuja212@gmail.com, mzkurianvc@yahoo.com

Abstract - This paper presents the encryption and decryption of messages using an Android mobile phone. The suggested system is concerned with applying software engineering techniques to cryptographic systems. In particular, we evolve our existing cryptographic system to incorporate new cryptographic concepts that strengthen the system. Java is the language chosen for developing the Android encryption application, with the objective that a Java developer can easily use the resulting system with minimal knowledge of the underlying machinery. To improve the security of private information, an encryption algorithm based on ASCII codes is proposed. We design and realize an encryption system, based on the ASCII code of each character, as an Android application, and a decryption system based on the same algorithm on ARM7.

Keywords-Android; ASCII code; ARM7;

#### I. INTRODUCTION

The word cryptography comes from the Greek "kryptos" (hidden) and "grapho" (to write). It is the science of hiding the meaning of information. Generally speaking, it can be regarded as the conversion of information, usually applied to prevent unintended people from reading it. Prior to the early 20th century, cryptography was chiefly concerned with linguistic and lexicographic patterns. Since then, cryptography has come to intersect the disciplines of mathematics, computer science and engineering; it is derived using mathematical algorithms and implemented in software that runs on computers or embedded processors.
These new forms of cryptography are strongly driven by rapid advances in computer and communications technologies. Cryptography becomes necessary when sensitive data is transmitted over any untrusted medium. It provides services such as keeping secrets from an unintended audience, authentication with a signature, verification of data integrity, and security certificates for communications.

With the development of digital devices, computers and networks, our world relies more and more on digital data. In many cases, storing data safely is a major concern, and the data must be protected against unauthorized access. Many technologies have been used to improve data security, such as authentication, audit trails and access control [2]. However, none of these models encrypts the original data: once the HDD is accessed, an intruder can obtain the information on it. In this paper, we design and implement an Android application that encrypts a message based on the ASCII code of each character, and an ARM7-based system that decrypts it. We have also tried to improve the speed and security of encryption.

The aim of the proposed system is to develop an Android application that encrypts a message and sends it to the receiver section. An Android mobile phone is used, and the application is developed in Java using the Eclipse IDE. Once the application is installed on the phone and launched, it first displays "enter text". After entering the message, the user enters the cell phone number to which the encrypted message is to be sent and clicks the "send" button. The message is then encrypted and sent to that number. At the receiver side, a GSM module interfaced with the ARM controller receives the message, which is decrypted and displayed on an LCD interfaced to the ARM.

#### II. RELATED WORK

To improve the security of private information in storage devices, an encryption algorithm that inherits the advantages of chaotic encryption, stream ciphers and the AES algorithm has been proposed. An encryption system based on this algorithm was designed and realized on ARM (S3C6410); it can encrypt and decrypt the information in many kinds of storage devices, such as USB disks, SD cards and mobile HDDs. The system, which uses human-computer interaction and visualization technology, provides several encryption algorithms and key generators.

A single-chip encryptor/decryptor core implementation of the reconfigurable Advanced Encryption Standard (AES-Rijndael) cryptosystem has also been presented. The suggested architecture is capable of handling all possible combinations of the standard bit lengths (128, 192, 256) of data and key. The fully rolled, inner-pipelined architecture ensures lower hardware complexity. The architecture reuses precomputed blocks, in the sense that the same hardware is shared between encryption and decryption as much as possible. The design has been implemented on a Xilinx XCVe1000-8bg560 device. The performance of the architecture has been compared with existing results in the literature and was found to be the most efficient (throughput/area) implementation of the AES algorithm. The design exploits composite field arithmetic in $GF(((2^2)^2)^2)$ to compute all nonlinear operations of the S-boxes and thus optimizes hardware complexity. An exhaustive survey of the literature indicates that this is the first single-chip encryptor/decryptor core implementation of AES-Rijndael that works with any possible (128-, 192- and 256-bit) key or data block size.

#### III. PROPOSED SYSTEM

The goal is to design a technology that works on both the cell phone and the ARM embedded microcontroller to provide safe and secure communication in a real-time environment. The integrity of the message is preserved using encryption and decryption, and the latest cell phone operating system is used for the real-time implementation. This work aims to use the latest ARM technology together with the latest operating system in a real-time environment. The project also provides an opportunity to learn about the BeagleBoard and its interfacing techniques. The plan is to port Android to the BeagleBoard and to design and develop encryption and decryption using the BeagleBoard and the ARM board with GSM technology.

#### IV. TECHNOLOGY USED IN PROPOSED SYSTEM

Android is an operating system for mobile devices such as smartphones and tablet computers. It is developed by the Open Handset Alliance, led by Google. Google purchased the initial developer of the software, Android Inc., in 2005. The Android distribution was unveiled on November 5, 2007, together with the founding of the Open Handset Alliance, a consortium of 84 hardware, software and telecommunication companies devoted to advancing open standards for mobile devices. Google released most of the Android code under the Apache License, a free software license. The Android Open Source Project (AOSP) is tasked with the maintenance and further development of Android.

Android consists of a kernel based on the Linux kernel, with middleware, libraries and APIs written in C, and application software running on an application framework that includes Java-compatible libraries based on Apache Harmony.
Android uses the Dalvik virtual machine with just-in-time compilation to run Dalvik dex-code (Dalvik Executable), which is usually translated from Java bytecode.

Android has a large community of developers writing applications ("apps") that extend the functionality of the devices. Developers write primarily in a customized version of Java. There are currently approximately 400,000 apps available for Android, from a total of 600,000 apps over the life of Android. Apps can be downloaded from third-party sites or through online stores such as Android Market, the app store run by Google. Android was listed by Canalys as the best-selling smartphone platform worldwide in Q4 2010, with over 200 million Android devices in use by November 2011.

#### V. ENCRYPTION ALGORITHM IMPLEMENTATION

The efficiency of data encryption is very important. In the encryption system, the software is based on the Java platform, and the main work is the software design based on the Java/embedded library. The application asks the user to enter the destination number and then the text to be encrypted. Algorithm 1 shifts the ASCII code of each character up by 3.

Algorithm 1: Encryption of Message

- 1: for i ← 0 to arr.length − 1 do
- 2: char a ← arr[i]
- 3: int x ← (int)a
- 4: int y ← x + 3
- 5: char b ← (char)y
- 6: s ← s + b
- 7: end for

#### VI. DECRYPTION ALGORITHM IMPLEMENTATION

The message is decrypted on the ARM7, which is interfaced with a GSM module and an LCD display. Embedded C is used for decryption. Algorithm 2 shows the decryption logic, which shifts each ASCII code back down by 3.

Algorithm 2: Decryption of Message

- 1: for i ← 0 to str.length − 1 do
- 2: char a ← str[i]
- 3: int x ← (int)a
- 4: int y ← x − 3
- 5: char b ← (char)y
- 6: z ← z + b
- 7: end for

#### VII. CONCLUSIONS

In this paper, an encryption system based on an Android application is designed, which encrypts the data, and a decryption system based on ARM7 is designed and realized, which decrypts the information. On the ARM7, the 16-bit Thumb instruction set is used to reduce code size and increase the accuracy of results. To improve the security of private information, custom encryption and decryption algorithms based on the ASCII code of each character are designed. We consider the proposed system more secure than standard encryption and decryption algorithms because attackers have no knowledge of our custom-designed algorithm. The decrypted message is displayed on the LCD at the receiver.

#### REFERENCES

- [1] A. Burnett, F. Byrne, T. Dowling, and A. Duffy. A Biometric Identity Based Signature Scheme. International Journal of Network Security,
- [2] Chun Yuan, Yuzhou Zhong, and Yuwen He, "Chaos Based Encryption Algorithm for Compressed Video," Chinese Journal of Computers, Vol. 27 No. 2, Feb 2004, pp. 257-263.
- [3] Yi Li and Xingjiang Pan, "AES Based on Neural Network of Chaotic Encryption Algorithm," Science Technology and Engineering, Vol. 10 No. 29, Oct 2010.
- [4] Ruxue Bai, Hongyan Liu, and Xinhe Zhang, "AES and its software implementation based on ARM920T," Journal of Computer Applications, Vol. 31 No. 5, May 2011.
- [5] http://wenku.baidu.com/view/5ebbd326ccbff121dd36831a.html
- [6] D. Boneh and M. Franklin. Identity-Based Encryption from the Weil Pairing. In CRYPTO 2001, volume 2139 of Lecture Notes in Computer Science, pages 213–229. Springer Verlag, 2001.
- [...] for Pairing-Based Cryptosystems. In CRYPTO 2002, volume 2442

## **Audible Perception of Vision**

#### S.V.Sastha Prashanth

Electronics and Communication Engineering, Meenakshi Sundararajan Engineering College, Chennai, Tamil Nadu, India.
E-mail: sprashanthchennai90@gmail.com

**Abstract** - This paper presents an innovative solution to the problems faced by the visually impaired that makes them more independent. The technology helps them read any book, and it also translates video images from a camera into audible sound. A built-in GPS receiver helps them find their location at any instant.

Key words: Image to Audio Convertor (IAC), Neural plasticity, Optical character recognition.

#### I. INTRODUCTION

Visual perception is the ability to interpret information and surroundings from the effects of visible light reaching the eye. It is unfortunate that some of our contemporaries are deprived of this wonderful ability. This paper aims at bringing a new life to these people. The IAC is a device that consists of a camera attached to the glasses that the person wears, placed near the forehead. The camera is connected to a portable embedded system, which processes the input image and converts it to sound that can be heard through an earphone. The IAC works in two main modes. In the EXAMINE mode, the processor converts the images to sound, producing a synthetic vision by exploiting the neural plasticity of the human brain through training. In the INTERPRET mode, the camera takes a picture of the text that is to be read, and the embedded system converts the text to speech. A GPS receiver is also attached to help users find out where they are, and feedback is given in the form of audio. The device gives them a sense of distance, direction, and visual perception, not just for a single object but also for multiple objects.
Thus the IAC helps the visually impaired to read and navigate without human intervention.

#### PRESENT DAY TECHNOLOGIES

Most visually challenged people use I-canes, which direct the person based on the obstacles found in the path. The cane sends out ultrasonic waves and directs the person based on the reflected wave. One main disadvantage of these canes is that they pick a direction to avoid the obstacle without regard to the surroundings, which might mislead the person.

The primary advantage of Braille is that it allows users to read in a definite and preferred manner. However, Braille books are large and unwieldy, which makes them cumbersome to store. Another problem is the limited number of books available in Braille.

Neural implants are a breakthrough among the technologies invented for the visually challenged. However, they are subject to many medical constraints and are very expensive. Moreover, they involve considerable risk, as the life of the individual is at stake.

Thus, none of these technologies provides an efficient solution to the problems faced by the visually impaired, and they can cause more problems than they solve.

#### MY IDEA

My idea is to propose an innovative and simple solution to the problems faced by visually challenged people. Using the IAC, they can read any book, provided the camera captures the page that is to be read. The text is extracted from the image and the corresponding ASCII codes are generated, which are converted to audio with the help of the inbuilt vocabulary-to-speech convertor. The IAC has buttons with projections that let the user play, pause, and move forward and back. Besides reading text, the IAC also works in another mode, known as the EXAMINE mode, in which any picture is converted to a corresponding range of frequencies that is played in the earphone. The person is trained on these sounds and can thus figure out what the surroundings look like.
In short, the processor converts the images to sound, producing a synthetic vision by exploiting the neural plasticity of the human brain through training. It scans each camera snapshot from left to right, with content on the left or right of the image sounding on the left or right, respectively. The device can be muted at any instant so that external sounds can be heard.

#### **HUMAN EYE AND THE BRAIN**

The individual components of the eye work in a manner similar to a camera. Each part plays a vital role in providing clear vision, with the cornea behaving much like a lens cover. As the eye's main focusing element, the cornea takes widely diverging rays of light and bends them through the pupil, the dark, round opening in the centre of the coloured iris. The iris and the pupil act like the aperture of a camera. Next in line is the lens, which acts like the lens in a camera, helping to focus the light on the back of the eye. The very back of the eye is lined with a layer called the retina, which acts very much like the film of a camera. The retina is a membrane containing photoreceptor nerve cells that lines the inside back wall of the eye. The photoreceptor nerve cells of the retina change the light rays into electrical impulses and send them through the optic nerve to the brain, where an image is perceived. The centre 10% of the retina is called the macula. It is responsible for sharp vision. The brain of a visually impaired person does not respond to or detect the signals from the eye. The frame rate of the eye is high, and the decoding process is complicated by damage to the macula. Visually challenged people are more sensitive to touch and sound, as the brain does not have to spend time on the signals received from the eye and can pay more attention to the other sensory organs.

#### **OUTLINE MODEL OF THE DEVICE**

As shown in the figure, the camera is attached to the glasses, and high power LEDs are also attached to enhance the images captured in dim light.
This in turn is connected to a portable processor comprising an embedded system that extracts the necessary information from the input image. The processed data is converted to sound, which can be heard through earphones. The processor has 7 buttons to activate the following: INTERPRET mode, EXAMINE mode, FORWARD, PLAY/PAUSE, BACK, and MUTE. These buttons have projections so that the visually challenged can distinguish them. I am confident that this model, once it comes into existence, can change the world of blind people.

Figure 1: Outline of IAC

#### INTERNAL ARCHITECTURE OF IAC

The main blocks of the device are shown in the figure. The camera, which captures images of the surroundings, has special provisions for opening and closing the aperture to protect the CMOS sensors. The camera is connected to the image processing unit, which has 2 main functions: controlling the aperture opening and performing histogram equalisation, so that errors due to surface reflection can be avoided. The images are stored in the camera buffer and are sent to the data memory at intervals of 0.1 seconds by the time counter. The programmable Digital Signal Processor (DSP) takes its input from the data memory, processes it, and sends the processed signal to the respective blocks depending upon the mode selected. The DSP is interfaced with the program memory and the data memory, in which the routines for the modes are loaded beforehand. The DAC unit converts the digital signal sent by the processor into a corresponding analog signal, which is amplified and then played in the earphones. An external volume control switch is present to control the volume of the amplifier. In the INTERPRET mode, the image from which text is to be extracted is stored in the data memory and transferred to the processor, which processes it with the help of the routines loaded in the program memory.
The text is extracted and the corresponding ASCII values are stored in a queue in the data memory.

Figure 2: Proposed architecture of IAC

When the play button is pressed, a signal is sent to the DSP, which extracts the data from the data memory and sends it to the buffer. The buffer sends the data to the voice dictionary, which compares the ASCII values with the inbuilt vocabulary and plays the matching entry if the code matches; otherwise the word is spelt out. The mode selection unit is used to select the mode of operation of the IAC. The interrupt controller is used to control the working of the processor in interpret, examine or dormant (mute) mode. The power section supplies power to the whole circuit from batteries.

#### MATHEMATICAL MODEL OF EXAMINE MODE

The IAC takes a gray scale image for analysis. In gray scale images, 255 represents a white pixel, 0 represents a black pixel, and intermediate values represent shades of gray between black and white. For analysis, the resolution is taken as 320 x 240, that is, 320 columns and 240 rows. Each row is assigned a frequency between 300 Hz and 30000 Hz, spaced linearly:

$$x_1 + t(x-1) = 300 \quad \text{at } x = 1$$

$$x_1 + t(x-1) = 30000 \quad \text{at } x = 240$$

By solving these simultaneous equations, we get

$$x_1 = 300, \qquad t = \frac{30000 - 300}{239} \approx 124.27$$

By substituting the row number 'n' in the place of x, we get the corresponding frequency f for each row using this formula:

$$f = 300 + 124.27\,(x-1)$$

Assuming that the stored image is 'x', the value of each pixel is represented by x(m,n). If x(m,n) is 0, the corresponding amplitude is 0. If x(m,n) is 255, the amplitude is 20. Taking the amplitude to be proportional to the pixel value, the constant of proportionality is 20/255 ≈ 0.078431. Thus the amplitude of the sine wave can be computed by the equation

$$A = 0.078431 \cdot x(m,n)$$

The frequency and amplitude found above are substituted in

$$S = A \sin(2 \pi f t)$$

where t now denotes time rather than the frequency step used above. This yields the sine wave for each pixel. Each column has a particular time period, and the whole image is processed in 1 second.
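The row-to-frequency and pixel-to-amplitude mapping described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the device firmware: the function names `row_frequency`, `pixel_amplitude` and `column_tone` are hypothetical, the 96 kHz sample rate is an assumption chosen so that the 30000 Hz top frequency stays below half the sample rate, and the text does not say whether row 1 is the top or the bottom of the image.

```python
import math

ROWS = 240          # image height used in the text
F_MIN = 300.0       # frequency assigned to row 1, in Hz
F_MAX = 30000.0     # frequency assigned to row 240, in Hz
A_MAX = 20.0        # amplitude assigned to a white pixel (value 255)


def row_frequency(row):
    """Frequency in Hz for a 1-based row index, linear from F_MIN to F_MAX."""
    step = (F_MAX - F_MIN) / (ROWS - 1)   # about 124.27 Hz per row
    return F_MIN + step * (row - 1)


def pixel_amplitude(value):
    """Amplitude for a gray level 0..255, linear from 0 to A_MAX."""
    return A_MAX * value / 255.0          # about 0.078431 per gray level


def column_tone(column, duration, sample_rate=96000):
    """Sum the sine waves S = A sin(2 pi f t) of every pixel in one column.

    `column` is a list of ROWS gray levels, index 0 holding row 1.
    `sample_rate` is an assumption, not a value given in the paper.
    """
    n_samples = int(round(duration * sample_rate))
    samples = []
    for k in range(n_samples):
        t = k / sample_rate
        s = 0.0
        for row, value in enumerate(column, start=1):
            if value:                      # black pixels contribute nothing
                s += pixel_amplitude(value) * math.sin(
                    2.0 * math.pi * row_frequency(row) * t)
        samples.append(s)
    return samples
```

For example, `column_tone([0] * 239 + [255], 1 / 320)` produces the samples for a column whose only white pixel lies in row 240, using a column duration of 1/320 s because the 320 columns of the image are processed in 1 second.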
Thus, processing one column takes 1/320 s = 3.125 milliseconds. To convey a spatial dimension, the left and right channels are handled separately. At m = 0, that is, the leftmost end, the volume is maximum at the left ear and minimum at the right ear, and vice versa at the rightmost end. This is defined using an exponential function of the column position x:

$$e^{-\alpha x} = 1 \quad \text{at } x = 0$$

$$e^{-\alpha x} = 0.1 \quad \text{at } x = 320$$

By solving these equations, we get

$$\alpha = \frac{\ln 10}{320} \approx 7.20 \times 10^{-3}$$

$$\text{Left Amplitude} = A\,e^{-\alpha x}$$

$$\text{Right Amplitude} = A\,(1 - e^{-\alpha x})$$

#### **EXAMINE MODE ALGORITHM**

The image is captured and stored in the data memory, and the counter determines the frame rate of the camera. The image is analysed column by column from left to right, and each column is analysed from bottom to top. Each pixel is converted to a sine wave whose frequency and amplitude are determined by the position and value of the pixel. In the same way, the sine waves for each column are generated and summed to form a vector. The line period for each column is 3.125 milliseconds, so the whole picture takes 1 second. Both channels are driven simultaneously: when the amplitude increases in the left earplug, the corresponding amplitude decreases in the right earplug, and vice versa.

Figure 3: Examine mode representation

#### NEED FOR INTERPRET MODE IN IAC

Though we seldom think about it, sighted individuals are continually bombarded every day by the printed word. Sources of this abundance of printed media in our environment include transportation, advertising, news, and commercial signs. However, people who are blind currently do not experience this. Their vision impairment prevents them from having access to textual information. Even eating out is complicated by the fact that few restaurants have menus in Braille. All of these facts underscore the necessity for a portable, small and easy to use automatic text reader. With the emergence of multimedia technology and powerful mobile devices, it is now possible to imagine an inexpensive system able to capture images in real time and transform them into speech. Research in Optical Character Recognition (OCR) is currently developing algorithms to characterize the visual content of images and to recognize text. My objective is to make these algorithms work in real time on a device devoted to helping people who are blind or visually impaired. While several devices have been developed in the past to assist the reading of printed text, they have all fallen short of user expectations. Most have been too cumbersome, or not readily available enough, to be practical and truly portable. Sometimes they even create more problems than they solve.

#### FLOW CHART OF EXAMINE MODE

Figure 4: Flow chart of Examine mode

#### THE GLOBAL POSITIONING SYSTEM

The Global Positioning System (GPS) is a satellite-based navigation system made up of a network of 24 satellites placed into orbit by the U.S. Department of Defense. GPS was originally intended for military applications, but in the 1980s, the government made the system available for civilian use. GPS works in any weather conditions, anywhere in the world, 24 hours a day. There are no subscription fees or setup charges to use GPS. A GPS receiver must be locked on to the signal of at least three satellites to calculate a 2D position (latitude and longitude) and track movement. With four or more satellites in view, the receiver can determine the user's 3D position (latitude, longitude and altitude). Once the user's position has been determined, the GPS unit can calculate other information, such as speed, bearing, track, trip distance, distance to destination, sunrise and sunset time and more. Today's GPS receivers are extremely accurate and easy to use.
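As an illustration of the derived quantities mentioned above, the sketch below computes the distance to a destination and the initial bearing from two latitude/longitude fixes. The paper does not describe how the IAC would do this; the haversine formula, the spherical-Earth radius, and the function names `haversine_distance` and `initial_bearing` are all assumptions made for this example.

```python
import math

EARTH_RADIUS_M = 6371000.0  # mean Earth radius; spherical-Earth assumption


def haversine_distance(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two fixes given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def initial_bearing(lat1, lon1, lat2, lon2):
    """Initial compass bearing in degrees (0 = north, 90 = east)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dlmb = math.radians(lon2 - lon1)
    y = math.sin(dlmb) * math.cos(phi2)
    x = (math.cos(phi1) * math.sin(phi2)
         - math.sin(phi1) * math.cos(phi2) * math.cos(dlmb))
    return math.degrees(math.atan2(y, x)) % 360.0
```

A spoken prompt such as "destination 120 metres ahead, slightly to the right" could then be built from these two values, which fits the audio-only feedback the device relies on.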
#### **FUTURE ENHANCEMENTS**

The IAC is expected to shed light, in the form of sound, on the lives of many thousands of our fellow beings who spend their lives in darkness. Compared to the existing technologies, the IAC is much cheaper and more efficient. Many improvements can be made to overcome some of the minor drawbacks of this device. Some of the drawbacks, and the corresponding enhancements, are listed below:

- Reading newspapers and magazines: the IAC is currently designed to read only printed text laid out in a single column.
- Use of IR cameras: with a normal camera, images taken at night or in foggy conditions are not clear. With an IR camera the pictures will be clearer, so the analysis will be more accurate.
- Interpret mode in various languages: the data fed to the DSP can be provided in various languages, and language selection can be offered as an option.

#### CONCLUSION

It is hoped that producing sounds from images with the IAC may lead to visual experiences that truly have the feel of vision. The IAC can be used to build a more complete mental map of the environment. It gives users a sense of distance, direction, and visual perception for the multiple objects and landmarks that make up the surrounding environment. Thus the practical application of the IAC would expose the visually challenged to the real world around them, making them more confident and independent.

# Peak and Average Power Reduction Technique For Digital Circuits Using Scan-Based BIST

#### B.Srinivasa Rao & B.Kondalu

ECE Department, ASR COLLEGE OF ENGINEERING, TANUKU, INDIA.

E-Mail: kondalu ec037@yahoo.co.in, vasu5717@gmail.com

Abstract - Technology provides smaller, faster and lower energy devices which allow more powerful and compact circuitry. Thermal and shot-noise estimations alone suggest that the fault rate of an individual nanoscale device may be orders of magnitude higher than that of today's devices. This makes combinational logic susceptible to faults. Testing a circuit or device therefore requires a separate testing technique that runs automatically, and built-in self-test (BIST) serves this purpose. In BIST, test patterns are generated and applied to the circuit under test (CUT) by on-chip hardware, and minimizing hardware overhead is a major concern of BIST implementation. This paper presents a low hardware overhead test pattern generator (TPG) for scan-based BIST that can reduce switching activity in CUTs during BIST and also achieve very high fault coverage with reasonable lengths of test sequences. The proposed BIST TPG decreases transitions that occur at scan inputs during scan shift operations and hence reduces switching activity in the CUT. Test patterns generated by the LT-RTPG detect easy-to-detect faults, and test patterns generated by the 3-weight WRBIST detect faults that remain undetected after LT-RTPG patterns are applied. The proposed BIST TPG does not require modifying the mission logic, a step that can degrade performance.
Experimental results for ISCAS'89 benchmark circuits demonstrate that the proposed BIST can significantly reduce switching activity during BIST while achieving 100% fault coverage for all ISCAS'89 benchmark circuits. Larger reductions in switching activity are achieved in large circuits. Experimental results also show that the proposed BIST can be implemented with low area overhead.

**Keywords** - BIST, Bit-swapping LFSR, scan-chain-ordering, Hamming Distance, Fault Coverage, Low power testing, power dissipation during test application, random pattern testing.

# I. INTRODUCTION

In recent years, design for low power has become one of the greatest challenges in high-performance very large scale integration (VLSI) design. As a consequence, many techniques have been introduced to minimize the power consumption of new VLSI systems. However, most of these methods focus on the power consumption during normal mode operation, while test mode operation has not normally been a predominant concern. It has been found, however, that the power consumed during test mode operation is often much higher than during normal mode operation [1]. This is because most of the consumed power results from the switching activity in the nodes of the circuit under test (CUT), which is much higher during test mode than during normal mode operation [1]–[3]. Several techniques that have been developed to reduce the peak and average power dissipated during scan-based tests can be found in [4] and [5]. A direct technique to reduce power consumption is to run the test at a slower frequency than that used in normal mode. This technique, while easy to implement, significantly increases the test application time [6]. Furthermore, it fails to reduce peak-power consumption, since peak power is independent of clock frequency.
Another category of techniques used to reduce the power consumption in scan-based built-in self-tests (BISTs) is the use of scan-chain-ordering techniques [7]–[13]. These techniques aim to reduce the average-power consumption when scanning in test vectors and scanning out captured responses. Although these algorithms aim to reduce average-power consumption, they can also reduce the peak power that may occur in the CUT during the scanning cycles, but not the capture power that may result during the test cycle (i.e., between launch and capture).

The design of low-transition test-pattern generators (TPGs) is one of the most common and efficient techniques for low-power tests [14]–[20]. These algorithms modify the test vectors generated by the linear feedback shift register (LFSR) to get test vectors with a low number of transitions. The main drawback of these algorithms is that they aim only to reduce the average-power consumption while loading a new test vector, and they ignore the power consumption that results while scanning out the captured response or during the test cycle. Furthermore, some of these techniques may result in lower fault coverage and higher test-application time.

Other techniques to reduce average-power consumption during scan-based tests include scan segmentation into multiple scan chains [6], [21], test-scheduling techniques [22], [23], static-compaction techniques [24], and multiple scan chains with many scan enable inputs to activate one scan chain at a time [25]. The latter technique also reduces the peak power in the CUT. On the other hand, in addition to the techniques mentioned earlier, there are some new approaches that aim to reduce peak-power consumption during tests, particularly the capture power in the test cycle.
One of the common techniques for this purpose is to modify patterns using an X-filling technique that assigns values to the don't-care bits of a deterministic set of test vectors in such a way as to reduce the peak power in the test vectors that have a peak-power violation [26]–[29].

This paper presents a new TPG, called the bit-swapping linear feedback shift register (BS-LFSR), that is based on a simple bit-swapping technique applied to the output sequence of a conventional LFSR and is designed using a conventional LFSR and a 2 × 1 multiplexer. The proposed BS-LFSR reduces the average and instantaneous weighted switching activity (WSA) during test operation by reducing the number of transitions at the scan input of the CUT. The BS-LFSR is combined with a scan-chain-ordering algorithm that reduces the switching activity in both the test cycle (capture power) and the scanning cycles (scanning power).

# II. PROPOSED APPROACH TO DESIGN THE BS-LFSR

The proposed BS-LFSR for test-per-scan BISTs is based upon some new observations concerning the number of transitions produced at the output of an LFSR.

Definition: Two cells in an *n*-bit LFSR are considered to be adjacent if the output of one cell feeds the input of the second directly (i.e., without an intervening XOR gate).

Lemma 1: Each cell in a maximal-length n-stage LFSR (internal or external) will produce a number of transitions equal to $2^{n-1}$ after going through a sequence of $2^n$ clock cycles.

*Proof:* The sequence of 1s and 0s that is followed by one bit position of a maximal-length LFSR is commonly referred to as an *m*-sequence. Each bit within the LFSR will follow the same *m*-sequence with a one-time-step delay. The *m*-sequence generated by an LFSR of length n has a periodicity of $2^n - 1$. It is a well-known standard property of an *m*-sequence of length n that the total number of runs of consecutive occurrences of the same binary digit is $2^{n-1}$ [3], [30].
The beginning of each run is marked by a transition between 0 and 1; therefore, the total number of transitions for each stage of the LFSR is $2^{n-1}$. This lemma can also be proved by using the toggle property of the XOR gates used in the feedback of the LFSR [32].

Fig. 1. Swapping arrangement for an LFSR.

Fig. 2. External LFSR that implements the prime polynomial $x^n + x + 1$ and the proposed swapping arrangement.

Lemma 2: Consider a maximal-length n-stage internal or external LFSR (n > 2). We choose one of the cells and swap its value with its adjacent cell if the current value of a third cell in the LFSR is 0 (or 1), and leave the cells unswapped if the third cell has a value of 1 (or 0). Fig. 1 shows this arrangement for an external LFSR (the same is valid for an internal LFSR). In this arrangement, the outputs of the two cells will have their combined transition count reduced by $T_{\rm Saved} = 2^{n-2}$ transitions. Since the two cells originally produce $2 \times 2^{n-1}$ transitions, the resulting percentage saving is $T_{\rm Saved}\% = 25\%$ [32].

In Lemma 2, the total percentage of transition savings after swapping is 25% [31]. In the case where cell x (the cell that drives the selection line) is not directly linked to cell m or cell m+1 (the swapped cells) through an XOR gate, each of the cells has the same share of savings (i.e., 25%). Lemmas 3–10 show the special cases where the cell that drives the selection line is linked to one of the swapped cells through an XOR gate. In these configurations, a single cell can save 50% of the transitions that were originally produced by an LFSR cell. Lemma 3 and its proof are given; the other lemmas can be proved in the same way.

Lemma 3: For an external n-bit maximal-length LFSR that implements the prime polynomial $x^n + x + 1$ as shown in Fig. 2, if the first two cells ($c_1$ and $c_2$) are chosen for swapping and cell n as the selection line, then $o_2$ (the output of MUX2) will produce a total transition saving of $2^{n-2}$ compared to the number of transitions produced by each LFSR cell, while $o_1$ has no savings (i.e., the savings in transitions are concentrated in one multiplexer output, which means that $o_2$ will save 50% of the original transitions produced by each LFSR cell).

*Proof*: There are eight possible combinations for the initial state of the cells $c_1$, $c_2$, and $c_n$. If we then consider all possible values of the following state, we have two possible combinations (not eight, because the value of $c_2$ in the next state is determined by the value of $c_1$ in the present state; also, the value of $c_1$ in the next state is determined by "$c_1$ xor $c_n$" in the present state). Table I shows all possible and subsequent states.

TABLE I Possible and Subsequent States for Cells $c_1$, $c_2$, and $c_n$ (See Fig. 2)

The first nine columns refer to the LFSR outputs of cells m and m+1 (here $c_1$ and $c_2$); the last seven columns refer to the multiplexer outputs $o_1$ and $o_2$. In the table, $o_1 = c_2$ and $o_2 = c_1$ when $c_n = 0$, and the cells pass through unswapped when $c_n = 1$.

| $c_1$ | $c_2$ | $c_n$ | next $c_1$ | next $c_2$ | next $c_n$ | trans. $c_1$ | trans. $c_2$ | Σ | $o_1$ | $o_2$ | next $o_1$ | next $o_2$ | trans. $o_1$ | trans. $o_2$ | Σ |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 0 | 0 | 1 | 1 | 0 | 0 | 1 | 0 | 1 | 0 | 0 | 0 | 1 | 0 | 1 | 1 |
| 0 | 0 | 1 | 1 | 0 | 1 | 1 | 0 | 1 | 0 | 0 | 1 | 0 | 1 | 0 | 1 |
| 0 | 1 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 1 | 0 | 0 | 0 | 1 | 0 | 1 |
| 0 | 1 | 0 | 0 | 0 | 1 | 0 | 1 | 1 | 1 | 0 | 0 | 0 | 1 | 0 | 1 |
| 0 | 1 | 1 | 1 | 0 | 0 | 1 | 1 | 2 | 0 | 1 | 0 | 1 | 0 | 0 | 0 |
| 0 | 1 | 1 | 1 | 0 | 1 | 1 | 1 | 2 | 0 | 1 | 1 | 0 | 1 | 1 | 2 |
| 1 | 0 | 0 | 1 | 1 | 0 | 0 | 1 | 1 | 0 | 1 | 1 | 1 | 1 | 0 | 1 |
| 1 | 0 | 0 | 1 | 1 | 1 | 0 | 1 | 1 | 0 | 1 | 1 | 1 | 1 | 0 | 1 |
| 1 | 0 | 1 | 0 | 1 | 0 | 1 | 1 | 2 | 1 | 0 | 1 | 0 | 0 | 0 | 0 |
| 1 | 0 | 1 | 0 | 1 | 1 | 1 | 1 | 2 | 1 | 0 | 0 | 1 | 1 | 1 | 2 |
| 1 | 1 | 0 | 1 | 1 | 0 | 0 | 0 | 0 | 1 | 1 | 1 | 1 | 0 | 0 | 0 |
| 1 | 1 | 0 | 1 | 1 | 1 | 0 | 0 | 0 | 1 | 1 | 1 | 1 | 0 | 0 | 0 |
| 1 | 1 | 1 | 0 | 1 | 0 | 1 | 0 | 1 | 1 | 1 | 1 | 0 | 0 | 1 | 1 |
| 1 | 1 | 1 | 0 | 1 | 1 | 1 | 0 | 1 | 1 | 1 | 0 | 1 | 1 | 0 | 1 |
| Σ transitions | | | | | | 8 | 8 | 16 | | | | | 8 | 4 | 12 |

It is important to note that the overall savings of 25% are not equally distributed between the outputs of the multiplexers as in Lemma 2. This is because the value of $c_1$ in the present state affects both the value of $c_2$ and its own value in the next state ($c_2(\text{Next}) = c_1$ and $c_1(\text{Next}) = c_1$ xor $c_n$). To see the effect of each cell on the transition savings, Table I shows that $o_1$ saves one transition when moving from state (0,0,1) to (1,0,0), from (0,1,1) to (1,0,0), from (1,0,1) to (0,1,0), or from (1,1,1) to (0,1,0). At the same time, $o_1$ gains one transition when moving from (0,1,0) to (0,0,0), from (0,1,0) to (0,0,1), from (1,0,0) to (1,1,0), or from (1,0,0) to (1,1,1).
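The column totals of Table I and the 50% saving claimed in Lemma 3 can be checked by brute force. The sketch below is an illustration rather than the authors' implementation: it assumes the next-state relations $c_1(\text{Next}) = c_1$ xor $c_n$ and $c_2(\text{Next}) = c_1$ given in the proof, it assumes that the multiplexers swap $c_1$ and $c_2$ when $c_n = 0$ (the convention that reproduces the totals of Table I), and the function names are hypothetical.

```python
from itertools import product


def mux_outputs(c1, c2, cn):
    """Bit-swapping stage: swap c1 and c2 when the selection bit cn is 0."""
    return (c2, c1) if cn == 0 else (c1, c2)


def table_totals():
    """Sum transitions over the 16 (state, next state) pairs of Table I.

    Returns the totals for (c1, c2, o1, o2).
    """
    totals = [0, 0, 0, 0]
    for c1, c2, cn in product((0, 1), repeat=3):
        n1, n2 = c1 ^ cn, c1              # next values of c1 and c2
        o1, o2 = mux_outputs(c1, c2, cn)
        for ncn in (0, 1):                # next cn is unconstrained here
            p1, p2 = mux_outputs(n1, n2, ncn)
            totals[0] += c1 != n1
            totals[1] += c2 != n2
            totals[2] += o1 != p1
            totals[3] += o2 != p2
    return tuple(totals)


def cycle_transitions(n):
    """Run the external LFSR for x^n + x + 1 through one full period.

    Returns (period, transitions of c2, of o1, of o2), counted cyclically.
    Only meaningful when the polynomial is primitive (for example n = 3, 4).
    """
    start = (1,) + (0,) * (n - 1)
    state, period = start, 0
    t_c2 = t_o1 = t_o2 = 0
    while True:
        nxt = (state[0] ^ state[-1],) + state[:-1]   # shift with feedback
        o1, o2 = mux_outputs(state[0], state[1], state[-1])
        p1, p2 = mux_outputs(nxt[0], nxt[1], nxt[-1])
        t_c2 += state[1] != nxt[1]
        t_o1 += o1 != p1
        t_o2 += o2 != p2
        period += 1
        state = nxt
        if state == start:
            return period, t_c2, t_o1, t_o2
```

Under these assumptions `table_totals()` returns `(8, 8, 8, 4)`, matching the bottom row of Table I, and `cycle_transitions(4)` returns `(15, 8, 8, 4)`: the LFSR cell and $o_1$ each make $2^{n-1} = 8$ transitions per period, while $o_2$ makes only $2^{n-2} = 4$.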
Since $o_1$ gains a transition in four scenarios and saves a transition in four other scenarios, it has a neutral overall effect, because all the scenarios have the same probability. For $o_2$, one transition is saved when moving from (0,1,0) to (0,0,0), from (0,1,0) to (0,0,1), from (0,1,1) to (1,0,0), from (1,0,0) to (1,1,0), from (1,0,0) to (1,1,1), or from (1,0,1) to (0,1,0). At the same time, one additional transition is incurred when moving from state (0,0,1) to (1,0,0) or from (1,1,1) to (0,1,0). This gives $o_2$ a net saving of one transition in four scenarios (six savings minus two additions), where each initial state has a probability of 1/8 and each final state a probability of 1/2; hence, $P_{\rm save}$ is given by

$$P_{\text{save}} = 1/8 \times 1/2 + 1/8 \times 1/2 + 1/8 \times 1/2 + 1/8 \times 1/2 = 1/4. \qquad (1)$$

If the LFSR is allowed to move through a complete cycle of $2^n$ states, then Lemma 1 shows that the number of transitions expected to occur in the cell under consideration is $2^{n-1}$. Using the swapping approach, a saving of one transition occurs in 1/4 of the cases, giving a total saving of $1/4 \times 2^n = 2^{n-2}$. Dividing one figure by the other, we see that the fraction of transitions saved at $o_2$ is $2^{n-2} / 2^{n-1} = 50\%$.

In the special configurations shown in Table II (i.e., Lemmas 3–10), if the cell that saves 50% of the transitions is connected to feed the scan-chain input, then it saves 50% of the transitions inside the scan-chain cells, which directly reduces the average power and also the peak power that may result while scanning in a new test vector. Table III shows that there are 104 LFSRs (internal and external) whose sizes lie in the range of 3–168 stages that can be configured to satisfy one or more of the special cases in Table II to concentrate the transition savings in one multiplexer output.

TABLE II SPECIAL CASES WHERE ONE CELL SAVES 50% OF THE TRANSITIONS

| Lemma | LFSR Polynomial | LFSR Type | Swapped cell (1st) | Swapped cell (2nd) | Selection Line | MUX Out with 50% Save |
|---|---|---|---|---|---|---|
| Lemma 3 | $x^n + x + 1$ | External | $c_1$ | $c_2$ | $c_n$ | $o_2$ |
| Lemma 4 | $x^n + x + 1$ | Internal | $c_1$ | $c_n$ | $c_2$ | $o_2$ |
| Lemma 5 | $x^n + x^{n-1} + 1$ | External | $c_{n-1}$ | $c_n$ | $c_1$ | $o_1$ |
| Lemma 6 | $x^n + x^{n-1} + 1$ | Internal | $c_1$ | [illegible] | $c_{n-1}$ | $o_1$ |
| Lemma 7 | $x^n + x^2 + 1$ | External | $c_1$ | $c_2$ | [illegible] | $o_1$ |
| Lemma 8 | [illegible] | Internal | $c_{n-1}$ | $c_n$ | $c_{n-2}$ | $o_1$ |
| Lemma 9 | [illegible] | Internal | $c_1$ | $c_2$ | $c_{n-1}$ | $o_1$ |
| Lemma 10 | [illegible] | Internal | $c_{n-1}$ | $c_n$ | $c_{n-2}$ | $o_1$ |

TABLE III

| Range of LFSR stages | LFSR sizes that satisfy one or more of Lemmas 3 to 10 in Table II |
|---|---|
| 3-20 | 3, 4, 5, 6, 7, 8, 11, 12, 13, 14, 15, 16, 19 |
| 21-40 | 21, 22, 24, 26, 27, 29, 30, 32, 34, 35, 37, 38, 40 |
| 41-60 | 42, 43, 44, 45, 46, 48, 50, 51, 53, 54, 56, 59, 60 |
| 61-80 | 61, 62, 63, 64, 66, 67, 69, 70, 74, 75, 76, 77, 78, 80 |
| 81-100 | [illegible], 85, 86, [illegible], 90, 91, 92, 93, 96, 99 |
| 101-120 | 101, 102, 104, 107, 109, 110, 112, 114, 115, 116, 117 |
| 121-140 | 122, 123, 125, 126, 127, 128, 131, 133, 136, 138 |
| 141-160 | 141, 143, 144, 146, 147, 149, 152, 153, 154, 155, 156, 157, 158, 160 |
| 161-168 | 162, 163, 164, 165, 166, 168 |
| Total | 104 |

# III. IMPORTANT PROPERTIES OF THE BS-LFSR

There are some important features of the proposed BS-LFSR that make it equivalent to a conventional LFSR.
The most important properties of the BS-LFSR are the following.

- 1) The proposed BS-LFSR generates the same number of 1s and 0s at the output of the multiplexers after swapping two adjacent cells; hence, the probabilities of having a 0 or a 1 at a certain cell of the scan chain before applying the test vectors are equal. The proposed design therefore retains an important feature of any random TPG. Furthermore, the output of the multiplexer depends on three different cells of the LFSR, each of which contains a pseudorandom value, so the expected value at the output can also be considered pseudorandom.
- 2) If the BS-LFSR is used to generate test patterns for either test-per-clock BIST or the primary inputs of a scan-based sequential circuit (assuming that they are directly accessible), as shown in Fig. 3, then consider the case in which $c_1$ is swapped with $c_2$, $c_3$ with $c_4$, ..., and $c_{n-2}$ with $c_{n-1}$ according to the value of $c_n$, which is connected to the selection line of the multiplexers (see Fig. 3). In this case, we obtain the same exhaustive set of test vectors as would be generated by the conventional LFSR, but in a different order, and the overall transitions at the primary inputs of the CUT are reduced by 25% [32].

Fig. 3. BS-LFSR can be used to generate exhaustive patterns for test-per-clock BIST.

#### IV. CELL REORDERING ALGORITHM

Although the proposed BS-LFSR achieves good results in reducing the average power consumed during test and in minimizing the peak power that may arise while scanning in a new test vector, it cannot reduce the overall peak power, because some peak-power components occur while scanning out the captured response or while applying a test vector and capturing a response in the test cycle. To solve these problems, the proposed BS-LFSR is first combined with the cell-ordering algorithm presented in [11], which reduces the number of transitions in the scan chain while scanning out the captured response. This reduces the overall average power and also the peak power that may arise while scanning out a captured response. The problem of the capture power (peak power in the test cycle) is solved by a novel algorithm that reorders some cells in the scan chain so as to minimize the Hamming distance between the applied test vector and the captured response in the test cycle, hence reducing the test-cycle peak power (capture power).

In this scan-chain-ordering algorithm, some cells of the scan chain already ordered by the algorithm in [11] are reordered again in order to reduce the peak power that may result during the test cycle. This phase depends mainly on an important property of the BS-LFSR: if two cells are connected to each other, then the probability that they have the same value at any clock cycle is 0.75. (In a conventional LFSR, where the transition probability is 0.5, two adjacent cells have the same value in 50% of the clock cycles and different values in the other 50%; for a BS-LFSR that reduces the number of transitions of an LFSR by 50%, the transition probability is 0.25, and hence two adjacent cells have the same value in 75% of the clock cycles.) Thus, for two connected cells (cells j and k), if we apply a sufficient number of test vectors to the CUT, then the values of cells j and k are the same in 75% of the applied vectors. Now assume that cell x is a function of cells y and z. If the value that cell x has in the captured response is the same as its value in the applied test vector (i.e., no transition happens for this cell in the test cycle) in the majority of cases where cells y and z have the same value, then we connect cells y and z together on the scan chain, since they will have the same value in 75% of the cases.
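The connection rule just described can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the `"direct"`/`"inverter"` labels, the 0.5 majority threshold, and the toy data are assumptions made only for illustration. Given simulated test vectors and captured responses, the sketch measures how often cell x keeps its value when cells y and z agree (or disagree) and suggests how y and z might be linked.

```python
from itertools import product
from typing import List, Optional, Sequence


def no_transition_rate(vectors: Sequence[Sequence[int]],
                       responses: Sequence[Sequence[int]],
                       x: int, y: int, z: int, same: bool) -> float:
    """Fraction of selected cases in which cell x keeps its value in the test cycle.

    Cases are selected by whether cells y and z hold the same value (same=True)
    or different values (same=False) in the applied vector.
    """
    total = hits = 0
    for applied, captured in zip(vectors, responses):
        if (applied[y] == applied[z]) == same:
            total += 1
            hits += int(applied[x] == captured[x])
    return hits / total if total else 0.0


def choose_link(vectors: Sequence[Sequence[int]],
                responses: Sequence[Sequence[int]],
                x: int, y: int, z: int,
                majority: float = 0.5) -> Optional[str]:
    """Suggest how to connect y and z (the majority threshold is an assumption)."""
    same_rate = no_transition_rate(vectors, responses, x, y, z, same=True)
    diff_rate = no_transition_rate(vectors, responses, x, y, z, same=False)
    if same_rate > majority and same_rate >= diff_rate:
        return "direct"      # connect y and z next to each other
    if diff_rate > majority:
        return "inverter"    # connect y and z through an inverter
    return None              # no clear benefit from linking y and z


# Hypothetical toy circuit: cells are indexed x=0, y=1, z=2, and the captured
# value of x flips exactly when y and z differ in the applied vector.
toy_vectors: List[List[int]] = [list(bits) for bits in product((0, 1), repeat=3)]
toy_responses: List[List[int]] = [[v[0] ^ v[1] ^ v[2], v[1], v[2]] for v in toy_vectors]

print(choose_link(toy_vectors, toy_responses, x=0, y=1, z=2))  # prints "direct"
```

In a real flow the vectors and responses would come from simulating the CUT, and the rule would be applied only to the cells involved in peak-power violations.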
This reduces the possibility that cell x will undergo a transition in the test cycle. The steps in this algorithm are as follows.

- 1) Simulate the CUT for the test patterns generated by the BS-LFSR.
- 2) Identify the group of vectors and responses that violate the peak power.
- 3) In these vectors, identify the cells that most often change their values in the test cycle and cause the peak-power violation.
- 4) For each cell found in step 3), identify the cells that play the key role in determining the value of this cell in the test cycle.
- 5) If it is found that, when two cells have the same value in the applied test vector, the cell under consideration will most probably have no transition in the test cycle, then connect these cells together. If it is found that, when two cells have different values, the cell under consideration will most probably have no transition in the test cycle, then connect these cells together through an inverter.

TABLE IV TEST LENGTH NEEDED TO GET TARGET FAULT COVERAGE FOR LFSR AND BS-LFSR

| Circuit | n | m | PI | RF% | FC% | Test length: Det. | Test length: LFSR | Test length: BS-LFSR (no order) | Test length: BS-LFSR (with order) |
|---|---|---|---|---|---|---|---|---|---|
| S641 | 32 | 19 | 35 | 0 | 98.0 | 53 | 5120 | 4910 | 4970 |
| S838 | 32 | 32 | 35 | 0 | 86.5 | 90 | 8160 | 8460 | 7930 |
| S1196 | 30 | 18 | 14 | | 97.0 | 131 | 3750 | 3690 | 3370 |
| S1238 | 30 | 18 | 14 | 5.09 | 91.3 | 141 | 3890 | 3560 | 3610 |
| S5378 | 40 | 179 | 35 | 0.88 | 98.0 | 244 | 30110 | 33700 | 28900 |
| S9234 | 40 | 228 | 19 | 6.52 | 90.0 | 367 | 397800 | 401930 | 398170 |
| S13207 | 60 | 669 | 31 | 1.54 | 95.0 | 455 | 49660 | 47400 | 48110 |
| S35932 | 64 | 1728 | 35 | 10.19 | 89.8 | 63 | 18700 | 16640 | 16520 |
| S38417 | 64 | 1636 | 28 | 0.53 | 96.5 | 349 | 118580 | 125520 | 117090 |
| S38584 | 64 | 1452 | 12 | 4.15 | 94 | 632 | 43530 | 39660 | 40090 |

TABLE V EXPERIMENTAL RESULTS OF AVERAGE AND PEAK-POWER REDUCTION OBTAINED BY USING THE PROPOSED TECHNIQUES

| Circuit | TL | LFSR: FC% | LFSR: WSA<sub>avg</sub> | LFSR: WSA<sub>peak</sub> | BS-LFSR with cell ordering: FC% | BS-LFSR with cell ordering: WSA<sub>avg</sub> | BS-LFSR with cell ordering: WSA<sub>peak</sub> | % savings: WSA<sub>avg</sub> | % savings: WSA<sub>peak</sub> |
|---|---|---|---|---|---|---|---|---|---|
| S641 | 3000 | 97.84 | 97.78 | 153 | 97.54 | 42.20 | 84 | 57 | 45 |
| S838 | 20000 | 96.15 | 81.91 | 151 | 96.21 | 33.14 | 83 | 60 | 45 |
| S1196 | 2000 | 95.33 | 53.18 | 74 | 95.51 | 21.52 | 42 | 60 | 43 |
| S1238 | 3000 | 91.11 | 61.20 | 97 | 90.97 | 34.80 | 59 | 43 | 39 |
| S5378 | 40000 | 98.42 | 1143.24 | 1639 | 98.40 | 625.28 | 993 | 45 | 39 |
| S9234 | 100000 | 87.27 | 2817.45 | 3988 | 87.28 | 1108.93 | 2197 | 61 | 45 |
| S13207 | 100000 | 96.45 | 4611.67 | 7108 | 96.39 | 1897.33 | 4172 | 59 | 41 |
| S35932 | 200 | 87.88 | 7945.81 | 12592 | 87.89 | 2793.16 | 5723 | 65 | 55 |
| S38417 | 100000 | 95.73 | 10965.50 | 16380 | 95.68 | 5022.30 | 10017 | 54 | 39 |
| S38584 | 100000 | 94.46 | 11194.65 | 15974 | 94.48 | 5682.72 | 7851 | 49 | 51 |

It is important to note that this phase of ordering is performed only when necessary: as stated in step 2) of the algorithm, the group of test vectors that violates the peak power must be identified first. Hence, if no vector violates the peak power, then this phase is not performed. In the worst case, this phase is performed on a few subsets of the cells.
This is because, if this phase of ordering were applied to all cells of the scan chain, it would destroy the effect of the algorithm in [11] and substantially increase the computation time.

#### V. EXPERIMENTAL RESULTS

A group of experiments was performed on full-scan ISCAS'89 benchmark circuits. In the first set of experiments, the BS-LFSR is evaluated with respect to the length of the test sequence needed to achieve a certain fault coverage, with and without the scan-chain-ordering algorithm. Table IV shows the results for a set of ten benchmark circuits. The columns labeled n, m, and PI refer to the size of the LFSR, the number of flip-flops in the scan chain, and the number of primary inputs of the CUT, respectively. The column labeled RF indicates the percentage of redundant faults in the CUT, and fault coverage (FC) indicates the target fault coverage where redundant faults are included. The last four columns show the test length needed by a deterministic test (i.e., the optimal test vector set is stored in a ROM), a conventional LFSR, a BS-LFSR with no scan-chain ordering, and the BS-LFSR with scan-chain ordering, respectively. The results in Table IV show that the BS-LFSR needs a shorter test length than a conventional LFSR for many circuits, even without the scan-chain-ordering algorithm. They also show that using the scan-chain-ordering algorithm with the BS-LFSR can shorten the required test length further for some circuits (e.g., S1196, S5378, and S38417).

The second set of experiments evaluates the BS-LFSR together with the proposed scan-chain-ordering algorithm in reducing average and peak power. For each benchmark circuit, the same number of conventional LFSR and BS-LFSR patterns is applied to the full-scan configuration. Table V shows the results obtained for the same benchmark circuits as in Table IV. The column labeled test length (TL) refers to the number of test vectors applied to the CUT. The next three columns show the FC, the average WSA per clock cycle (WSA<sub>avg</sub>), and the maximum WSA in a clock cycle (WSA<sub>peak</sub>) for patterns applied using the conventional LFSR. The next three columns show the FC, WSA<sub>avg</sub>, and WSA<sub>peak</sub> for the BS-LFSR with the ordered scan chain. Finally, the last two columns show the savings in average and peak power obtained by using the BS-LFSR with the scan-chain-ordering algorithm.

TABLE VI COMPARISON WITH RESULTS OBTAINED IN [15]

| Circuit | Results in [15]: TL | Results in [15]: FC | Results in [15]: %WSA<sub>avg</sub> | Proposed method: TL | Proposed method: FC | Proposed method: %WSA<sub>avg</sub> |
|---|---|---|---|---|---|---|
| S641 | 4096 | 97.21 | 38 | 3000 | 97.54 | 57 |
| S838 | 4096 | 95.46 | 50 | 20000 | 96.21 | 60 |
| S1196 | 4096 | 95.59 | 17 | 2000 | 95.51 | 60 |
| S1238 | 4096 | 89.41 | 17 | 3000 | 90.97 | 43 |
| S5378 | 65536 | 96.54 | 43 | 40000 | 98.40 | 45 |
| S9234 | 524288 | 90.89 | 62 | 100000 | 87.28 | 61 |
| S13207 | 132072 | 93.66 | 45 | 100000 | 96.39 | 59 |
| S35932 | 128 | 87.84 | 56 | 200 | 87.89 | 65 |
| S38417 | 132072 | 94.99 | 56 | 100000 | 95.68 | 54 |
| S38584 | 132072 | 93.35 | 59 | 100000 | 94.48 | 49 |
| AVG | 100255 | 93.49 | 44 | 46820 | 94.04 | 55 |

TABLE VII COMPARISON OF PEAK-POWER REDUCTIONS WITH RESULTS IN [25]

| Circuit | Results in [25]: WSA<sub>peak</sub> savings % | Proposed method: WSA<sub>peak</sub> savings % |
|---|---|---|
| S5378 | 36.6 | 39 |
| S9234 | 38.9 | 45 |
| S13207 | 46.1 | 41 |
| S38417 | 40.1 | 39 |
| S38584 | 35.9 | 51 |
| AVG | 39.5 | 43.0 |

In order to provide a comparison with techniques published previously by other authors, Table VI compares the results obtained by the proposed technique with those obtained in [15]. Table VI compares the TL, FC, and average-power reduction (WSA<sub>avg</sub>).
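As a quick sanity check on the savings columns of Table V, the percentages can be recomputed from the WSA values. The sketch below assumes the usual definition, saving = 100 × (1 − reduced / baseline); the helper name `percent_saving` is illustrative and not taken from the paper, and only two rows of Table V are used.

```python
def percent_saving(baseline: float, reduced: float) -> float:
    """Relative reduction of `reduced` with respect to `baseline`, in percent."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return 100.0 * (1.0 - reduced / baseline)


# (WSA_avg, WSA_peak) for the conventional LFSR and for the BS-LFSR with
# cell ordering, copied from two rows of Table V.
table_v = {
    "S641": ((97.78, 153), (42.20, 84)),
    "S38584": ((11194.65, 15974), (5682.72, 7851)),
}

for circuit, ((avg_lfsr, peak_lfsr), (avg_bs, peak_bs)) in table_v.items():
    avg_saving = round(percent_saving(avg_lfsr, avg_bs))
    peak_saving = round(percent_saving(peak_lfsr, peak_bs))
    print(f"{circuit}: average {avg_saving}%, peak {peak_saving}%")
# S641: average 57%, peak 45%
# S38584: average 49%, peak 51%
```

The rounded values agree with the last two columns reported for these circuits.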
It is clear that the proposed method is better for most of the circuits, not only in average-power reduction but also in the test length needed to obtain good fault coverage. Finally, Table VII compares the results obtained by the proposed technique for peak-power reduction with those obtained in [25]. The table shows that the proposed method gives better results for most of the benchmark circuits.

## VI. CONCLUSION AND FUTURE WORK

This section concludes the paper and outlines the scope for future work. A low-transition TPG based on observations about transition counts in the output sequences of LFSRs has been presented. The proposed TPG is used to generate test vectors for test-per-scan BIST in order to reduce the switching activity while scanning test vectors into the scan chain. Furthermore, a novel algorithm for scan-chain ordering has been presented. When the BS-LFSR is used together with the proposed scan-chain-ordering algorithm, the average and peak powers are reduced. The effect of the proposed design on fault coverage, test application time, and hardware area overhead is negligible. Comparisons between the proposed design and previously published methods show that the proposed design achieves better results for most of the tested benchmark circuits.

A future enhancement of the FPGA implementation of the bit-swapping LFSR with scan-chain ordering is to add a block that identifies the faulty component, so that the system not only detects a fault but also shows where the fault has occurred.

#### REFERENCES

- [1] Y. Zorian, "A distributed BIST control scheme for complex VLSI devices," in *Proc. 11th IEEE VTS*, Apr. 1993, pp. 4–9.
- [2] A. Hertwig and H. J. Wunderlich, "Low power serial built-in self-test," in *Proc. IEEE Eur. Test Workshop*, May 1998, pp. 49–53.
- [3] P. H. Bardell, W. H. McAnney, and J. Savir, *Built-in Test for VLSI: Pseudorandom Techniques*. New York: Wiley, 1997.
- [4] P. Girard, "Survey of low-power testing of VLSI circuits," *IEEE Des. Test Comput.*, vol. 19, no. 3, pp. 80–90, May/Jun. 2002.
- [5] K. M. Butler, J. Saxena, T. Fryars, G. Hetherington, A. Jain, and J. Lewis, "Minimizing power consumption in scan testing: Pattern generation and DFT techniques," in *Proc. Int. Test Conf.*, 2004, pp. 355–364.
- [6] J. Saxena, K. Butler, and L. Whetsel, "An analysis of power reduction techniques in scan testing," in *Proc. Int. Test Conf.*, 2001, pp. 670–677.
- [7] V. Dabhholkar, S. Chakravarty, I. Pomeranz, and S. M. Reddy, "Techniques for minimizing power dissipation in scan and combinational circuits during test applications," *IEEE Trans. Comput.-Aided Design Integr. Circuits Syst.*, vol. 17, no. 12, pp. 1325–1333, Dec. 1998.
- [8] Y. Bonhomme, P. Girard, L. Guiller, C. Landrault, S. Pravossoudovitch, and V. Virazel, "Design of routing-constrained low power scan chains," in *Proc. Des. Autom. Test Eur. Conf. Exhib.*, Feb. 2004, pp. 62–67.
- [9] W. Tseng, "Scan chain ordering technique for switching activity reduction during scan test," *Proc. Inst. Elect. Eng., Comput. Digit. Tech.*, vol. 152, no. 5, pp. 609–617, Sep. 2005.
- [10] C. Giri, B. Kumar, and S. Chattopadhyay, "Scan flip-flop ordering with delay and power minimization during testing," in *Proc. Annu. IEEE INDICON*, Dec. 2005, pp. 467–471.
- [11] Y. Bonhomme, P. Girard, C. Laundrault, and S. Pravossoudovitch, "Power driven chaining of flip-flops in scan architectures," in *Proc. Int. Test Conf.*, Oct. 2002, pp. 796–803.
- [12] M. Bellos, D. Bakalis, and D. Nikolos, "Scan cell ordering for low power BIST," in *Proc. IEEE Comput. Soc. Annu. Symp. VLSI*, Feb. 2004, pp. 281–284.
- [13] K. V. A. Reddy and S. Chattopadahyay, "An efficient algorithm to reduce test power consumption by scan cell and scan vector reordering," in *Proc. IEEE 1st India Annu. Conf. INDICON*, Dec. 2004, pp. 373–376.
- [14] S. Wang, "A BIST TPG for low power dissipation and high fault coverage," *IEEE Trans. Very Large Scale Integr. (VLSI) Syst.*, vol. 15, no. 7, pp. 777–789, Jul. 2007.
- [15] S. Wang and S. Gupta, "LT-RTPG: A new test-per-scan BIST TPG for low switching activity," *IEEE Trans. Comput.-Aided Design Integr. Circuits Syst.*, vol. 25, no. 8, pp. 1565–1574, Aug. 2006.
- [16] S. Wang and S. K. Gupta, "DS-LFSR: A BIST TPG for low switching activity," *IEEE Trans. Comput.-Aided Design Integr. Circuits Syst.*, vol. 21, no. 7, pp. 842–851, Jul. 2002.
- [17] H. Ronghui, L. Xiaowei, and G. Yunzhan, "A low power BIST TPG design," in *Proc. 5th Int. Conf. ASIC*, Oct. 2003, vol. 2, pp. 1136–1139.
- [18] L. Jie, Y. Jun, L. Rui, and W. Chao, "A new BIST structure for low power testing," in *Proc. 5th Int. Conf. ASIC*, Oct. 2003, vol. 2, pp. 1183–1185.
- [19] M. Tehranipoor, M. Nourani, and N. Ahmed, "Low transition LFSR for BIST-based applications," in *Proc. 14th ATS*, Dec. 2005, pp. 138–143.
- [20] I. Pomeranz and S. M. Reddy, "Scan-BIST based on transition probabilities for circuits with single and multiple scan chains," *IEEE Trans. Comput.-Aided Design Integr. Circuits Syst.*, vol. 25, no. 3, pp. 591–596, Mar. 2006.
- [21] N. Nicolici and B. Al-Hashimi, "Multiple scan chains for power minimization during test application in sequential circuits," *IEEE Trans. Comput.*, vol. 51, no. 6, pp. 721–734, Jun. 2002.
- [22] V. Iyengar and K. Chakrabarty, "Precedence-based, preemptive, and power-constrained test scheduling for system-on-a-chip," in *Proc. IEEE VLSI Test Symp.*, 2001, pp. 368–374.
- [23] R. Chou, K. Saluja, and V. Agrawal, "Power constraint scheduling of tests," in *Proc. IEEE Int. Conf. VLSI Des.*, 1994, pp. 271–274.
- [24] R. Sankaralingam, R. Oruganti, and N. Touba, "Static compaction techniques to control scan vector power dissipation," in *Proc. IEEE VLSI Test Symp.*, 2000, pp. 35–42.
- [25] S. Wang and W. Wei, "A technique to reduce peak current and average power dissipation in scan designs by limited capture," in *Proc. Asia South Pacific Des. Autom. Conf.*, Jan. 2007, pp. 810–816.
- [26] Badereddine, P. Girard, Pravossoudovitch, C. Landrault, A. Virazel, and H. Wunderlich, "Minimizing peak power consumption during scan testing: modification pattern with X filling heuristics," in *Proc. Des. Test Integr. Syst. Nanoscale Technol.*, 2006, pp. 359–364.
- [27] R. Sankaralingam and N. Touba, "Controlling peak power during scan testing," in *Proc. 20th IEEE VLSI Test Symp.*, 2002, pp. 153–159.
- [28] S. Remersaro, X. Lin, S. M. Reddy, I. Pomeranz, and J. Rajski, "Low shift and capture power scan tests," in *Proc. Int. Conf. VLSI Des.*, 2007, pp. 793–798.
- [29] X. Wen, Y. Yamashita, S. Kajihara, L. Wang, K. Saluja, and K. Kinoshita, "On low-capture-power test generation for scan testing," in *Proc. 23rd IEEE VLSI Test Symp.*, 2005, pp. 265–270.
- [30] R. David, *Random Testing of Digital Circuits, Theory and Applications*. New York: Marcel Dekker, 1998.
- [31] A. Abu-Issa and S. Quigley, "LT-PRPG: Power minimization technique for test-per-scan BIST," in *Proc. IEEE Int. Conf. DTIS*, Mar. 2008, pp. 1–5.
- [32] A. Abu-Issa and S. Quigley, "Bit-swapping LFSR for low-power BIST," *Electron. Lett.*, vol. 44, no. 6, pp. 401–402, Mar. 2008.

# **Elevator Display Unit And Control System**

#### Raga Deepthi S.T.P<sup>1</sup> & SrinivasaMurthy L<sup>2</sup>

Department of Electronics and Communication, Prakasam Engineering College, Ongole<sup>1</sup>
Department of Electronics and Communications, SRM University, Chennai<sup>2</sup>
E-mail: deepa.stp@gmail.com<sup>1</sup>, lsrinivasamurthy@gmail.com<sup>2</sup>

Abstract - In this work we use LPC2400-series microcontrollers to develop an elevator multimedia display. The microcontrollers have a powerful processor with 4 kB of RAM, two Controller Area Network (CAN) channels, an LCD controller, and so on.
All of these features make the LPC2400 particularly suitable for industrial control and medical systems. Through the memory controller, we transfer flash-disk image data to NOR flash. The picture data is then sent to the LCD controller buffer, and the pictures appear on the LCD screen. At the same time, we monitor the status of the elevator control board, transmit information, and display the floor, the elevator status, and the temperature in a timely manner.

Keywords: CAN, On-Chip interconnect, ARM

## I. INTRODUCTION

Nowadays elevators are used in almost all shopping malls, complexes, apartments, offices, etc. In this paper, we aim not only at controlling the elevator's movement; we also propose a mechanism to monitor the status inside the elevator and the position of the elevator. Alongside these modules, we display the information on the LCD screen. The information we display is text, i.e., the floor information, the elevator status, the temperature level, etc. In addition, we can send pictures related to notices, important information, or advertisements. From a commercial point of view, this has considerable scope. The embedded display terminal is developed on an ARM microprocessor by applying 32-bit RISC microprocessor technology, embedded software technology, embedded GUI (Graphical User Interface) technology, CAN bus communication technology, information storage and management technology, etc. [2]. The design of the embedded hardware platform, whose core is the LPC2400 CPU, is completed.

In existing systems, we get no indication of the environmental conditions inside the elevator. Elevators are generally powered by electric motors that either drive traction cables or counterweight systems like a hoist, or pump hydraulic fluid to raise a cylindrical piston like a jack. The existing systems provide no such safety features.
There is no provision to monitor what happens inside the elevator, although this is extremely helpful in certain conditions: if something happens inside the elevator or it gets stuck, we need some mechanism to see it. There is also no provision to control the elevator movement from outside, which is extremely useful at breakdown times. Our proposed system overcomes these limitations.

The rest of the paper is organized as follows. Section II gives an overview of the ARM7 microcontroller. Section III explains CAN communication technology. Section IV describes the proposed system. Section V presents experimental results, and Section VI discusses conclusions.

# II. OVERVIEW OF ARM 7 MICRO CONTROLLER

The ARM is a 32-bit reduced instruction set computer (RISC) instruction set architecture (ISA) developed by ARM Limited. It was known as the Advanced RISC Machine, and before that as the Acorn RISC Machine. The ARM architecture is the most widely used 32-bit ISA in terms of numbers produced. ARM processors were originally conceived by Acorn Computers as processors for desktop personal computers, a market now dominated by the x86 family used by IBM PC compatible computers. However, the relative simplicity of ARM processors made them suitable for low-power applications [3]. This has made them dominant in the mobile and embedded electronics market as relatively low-cost, small microprocessors and microcontrollers.

The LPC2119/2129/2194/2292/2294 is based on a 16/32-bit ARM7TDMI-S CPU with real-time emulation and embedded trace support, together with 128/256 kilobytes (kB) of embedded high-speed flash memory [4]. A 128-bit wide internal memory interface and a unique accelerator architecture enable 32-bit code execution at the maximum clock rate. For applications with critical code size, the alternative 16-bit Thumb mode reduces code by more than 30% with minimal performance penalty.
With their compact 64- and 144-pin packages, low power consumption, various 32-bit timers, a combination of 4-channel 10-bit ADC and 2/4 advanced CAN channels or 8-channel 10-bit ADC and 2/4 advanced CAN channels (64- and 144-pin packages, respectively), and up to 9 external interrupt pins, these microcontrollers are particularly suitable for industrial control, medical systems, access control, and point-of-sale. The number of available GPIOs goes up to 46 in the 64-pin package. In the 144-pin packages, the number of available GPIOs ranges from 76 (with external memory in use) to 112 (single-chip application). Equipped with a wide range of serial communications interfaces, they are also very well suited for communication gateways, protocol converters, and embedded soft modems, as well as many other general-purpose applications.

## III. CAN COMMUNICATION TECHNOLOGY

The Controller Area Network (the CAN bus) is a serial communications bus for real-time control applications. It operates at data rates of up to 1 megabit per second and has excellent error detection and confinement capabilities [1]. CAN was originally developed by the German company Robert Bosch for use in cars, to provide a cost-effective communications bus for in-car electronics as an alternative to expensive, cumbersome, and unreliable wiring looms and connectors. The car industry continues to use CAN for an increasing number of applications, and because of its proven reliability and robustness, CAN is now also being used in many other control applications.

The LPC2129 is based on a 16/32-bit ARM7TDMI-S CPU with real-time emulation and embedded trace support, together with 64/128/256 kB of embedded high-speed flash memory [10]. A 128-bit wide memory interface and a unique accelerator architecture enable 32-bit code execution at the maximum clock rate. For applications with critical code size, the alternative 16-bit Thumb mode reduces code by more than 30% with minimal performance penalty.
With a compact 64-pin package, low power consumption, various 32-bit timers, a 4-channel 10-bit ADC, two advanced CAN channels, PWM channels, and 46 fast GPIO lines with up to nine external interrupt pins, these microcontrollers are particularly suitable for automotive and industrial control applications, as well as medical systems and fault-tolerant maintenance buses. With a wide range of additional serial communications interfaces, they are also suited for communication gateways and protocol converters, as well as many other general-purpose applications.

Fig 1: CAN Layers

This interface/protocol was designed to allow communication within noisy environments. The LPC2129 has two CAN controller modules. Each CAN controller has a register structure in which the 8-bit registers of those devices have been combined into 32-bit words to allow simultaneous access in the ARM environment. Here we use one CAN channel for transmitting information and the other CAN channel for receiving it. The information may be temperature, text, or a graphical image.

## IV. PROPOSED SYSTEM

The proposed system makes the elevator system more robust. In our proposed system, the elevator movement is controlled using a keypad. When a key is pressed, the ARM processor senses the key and makes the elevator control motor run and stop accordingly. A CCTV camera attached inside the elevator is used to monitor the interior of the elevator through a PC. The temperature inside the elevator is obtained using a temperature sensor, whose output is given to the ARM processor. Using a graphical LCD, two or three images can be displayed based on commands from specific keys. These images display advertisements, trademarks, logos, and any important notices.

In order to implement the design on the ARM board, we first developed and implemented the design on the basic AT89C51 microcontroller. In this design, a temperature sensor is interfaced with the microcontroller.
The temperature sensor used here is the LM35, and the ADC used is the 0808. The sensed temperature is analog, so to show the reading on the LCD we need an analog-to-digital converter, which is of the successive-approximation type. The interfacing of the LCD, ADC, and temperature sensor to the microcontroller is shown in the following snapshots.

Fig 2: Block Diagram of Proposed System

#### A. POWER SUPPLY

The power supply section delivers a constant, regulated output. A 0-12V/1 mA transformer is used for this purpose. The primary of this transformer is connected to the mains supply through an on/off switch and a fuse for protection against overload and short circuits. The secondary is connected to the diodes to convert 12V AC to 12V DC. This voltage is filtered by the capacitors and then regulated to +5V using the 7805 IC.

#### B. KEYPAD CONTROLLER

The keypad controller is provided with 6 keys, each of which specifies a position of the elevator. In this way we control the elevator. Each key is provided with a resistor to limit the current.

#### C. TEMPERATURE SENSOR

The temperature sensor used in the proposed system is the LM35. It is an integrated-circuit sensor that can be used to measure temperature with an electrical output proportional to the temperature (in °C). The advantages of the LM35 over other temperature sensors are that it measures temperature more accurately than a thermistor, that the sensor circuitry is sealed and not subject to oxidation, and that it generates a higher output voltage than thermocouples and may not require the output voltage to be amplified.

#### D. GRAPHICAL LCD

A liquid crystal display (LCD) is a flat panel display, electronic visual display, or video display that uses the light-modulating properties of liquid crystals (LCs). The LCD used here is a 128x64 LCD, which means 128 columns and 64 rows. In total there are 128 x 64 = 8192 pixels, i.e., 1024 bytes of display data. This LCD is divided equally into two halves.
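The display layout described in this subsection (a 128x64 panel split into two 64-column halves, each organized as 8 pages of 8 rows) can be sketched as an address calculation. This is only an illustration under stated assumptions: the names `locate_pixel`, `new_framebuffer`, and `set_pixel` are hypothetical, and the choice that bit 0 of a byte is the top row of its page is assumed, not taken from the paper or from a specific controller datasheet.

```python
from typing import List, NamedTuple

WIDTH, HEIGHT = 128, 64      # panel size in pixels
HALF_WIDTH = 64              # columns handled by each controller half
ROWS_PER_PAGE = 8            # one byte covers 8 vertical pixels


class PixelAddress(NamedTuple):
    half: int    # 0 = left half, 1 = right half
    page: int    # 0..7, each page is 8 rows tall
    column: int  # 0..63 within the half
    bit: int     # 0..7 within the page byte (assumed: bit 0 is the top row)


def locate_pixel(x: int, y: int) -> PixelAddress:
    """Map panel coordinates (x, y) to half, page, column and bit."""
    if not (0 <= x < WIDTH and 0 <= y < HEIGHT):
        raise ValueError("pixel outside the 128x64 panel")
    return PixelAddress(x // HALF_WIDTH, y // ROWS_PER_PAGE,
                        x % HALF_WIDTH, y % ROWS_PER_PAGE)


def new_framebuffer() -> List[List[bytearray]]:
    """2 halves x 8 pages x 64 column bytes = 1024 bytes in total."""
    return [[bytearray(HALF_WIDTH) for _ in range(HEIGHT // ROWS_PER_PAGE)]
            for _ in range(WIDTH // HALF_WIDTH)]


def set_pixel(fb: List[List[bytearray]], x: int, y: int, on: bool = True) -> None:
    """Turn one pixel on or off in the in-memory copy of the display data."""
    a = locate_pixel(x, y)
    if on:
        fb[a.half][a.page][a.column] |= 1 << a.bit
    else:
        fb[a.half][a.page][a.column] &= ~(1 << a.bit) & 0xFF


fb = new_framebuffer()
set_pixel(fb, 64, 9)          # first column of the right half, second page
print(locate_pixel(64, 9))    # PixelAddress(half=1, page=1, column=0, bit=1)
```

A driver would then write each page byte to the corresponding half through whatever command interface the actual controller provides; that part is hardware-specific and is not shown.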
Each half is controlled by a separate controller and consists of 8 pages. Each page consists of 8 rows and 64 columns. So two horizontal pages make 128 (64x2) columns, and 8 vertical pages make 64 rows (8x8). The 16x2 character LCDs have their own limitations; they can only display characters of certain dimensions. Graphical LCDs are therefore used to display customized characters and images. Graphical LCDs find use in many applications; they are used as display units in video games, mobile phones, lifts, etc.

#### E. DC MOTOR

An electric motor is a machine used to convert electrical energy into mechanical energy. In this paper the motor is used to replicate an elevator model. The motor output pins are connected to relays, which are used to trigger the motor.

#### F. ARM PROCESSOR

The ARM7TDMI-S is a general-purpose 32-bit microprocessor that offers high performance and very low power consumption [7]. The ARM architecture is based on Reduced Instruction Set Computer (RISC) principles, and the instruction set and related decode mechanism are much simpler than those of microprogrammed Complex Instruction Set Computers. This simplicity results in a high instruction throughput and impressive real-time interrupt response from a small and cost-effective processor core [8]. Pipeline techniques are employed so that all parts of the processing and memory systems can operate continuously. Typically, while one instruction is being executed, its successor is being decoded, and a third instruction is being fetched from memory.

The ARM7TDMI-S processor also employs a unique architectural strategy known as THUMB, which makes it ideally suited to high-volume applications with memory restrictions, or applications where code density is an issue [9]. The key idea behind THUMB is that of a super-reduced instruction set. Essentially, the ARM7TDMI-S processor has two instruction sets: the standard 32-bit ARM instruction set and a 16-bit THUMB instruction set.
The THUMB set's 16-bit instruction length allows it to approach twice the density of standard ARM code while retaining most of the ARM's performance advantage over a traditional 16-bit processor using 16-bit registers. This is possible because THUMB code operates on the same 32-bit register set as ARM code. THUMB code can shrink to about 65% of the size of the equivalent ARM code and provide 160% of the performance of an equivalent ARM processor connected to a 16-bit memory system.

#### G. CAN

The LPC2129 has two CAN controller modules. It can support data rates of up to 1 Mbit/s. Each CAN controller has a register structure in which the 8-bit registers of those devices have been combined into 32-bit words to allow simultaneous access in the ARM environment. A brief description of the CAN protocol is given in Section III.

#### (i) Why CAN

Many serial protocols are in use. Some of them are UART, SPI, and I<sup>2</sup>C, each with its own disadvantages. UART is a character-based protocol for serial communications. It is best suited to devices that send a message whose individual characters may be sent at an inconsistent rate.

The Serial Peripheral Interface bus, or SPI bus, is a synchronous serial data link standard named by Motorola that operates in full-duplex mode. Its disadvantages are that it requires more pins on IC packages than I<sup>2</sup>C, even in the "3-wire" variant; there is no hardware flow control by the slave; there is no hardware slave acknowledgment; it supports only one master device; and no error-checking protocol is defined.

I<sup>2</sup>C (Inter-Integrated Circuit; generically referred to as "two-wire interface") is a multi-master, serial, single-ended computer bus invented by Philips that is used to attach low-speed peripherals to a motherboard, embedded system, cell phone, or other electronics. Common I<sup>2</sup>C bus speeds are the 100 kbit/s standard mode and the 10 kbit/s low-speed mode, but arbitrarily low clock frequencies are also allowed.
The limitations includes I<sup>2</sup>C supports a limited range of speeds. Hosts supporting the multi-megabit speeds are rare, I2C is a shared bus, there is the potential for any device to have a fault and hang the entire bus. # V. EXPERIMENTAL RESULTS The CAN Protocol was successfully implemented in ARM7 Microcontroller to monitor and display the temperature inside the elevator. The design for the model is successfully simulated and verified in keil compiler[5][6]. The Hardware is designed in such a way to display the status of elevator such as floor information and destination floor. The Images are displayed in the LCD when the elevator moves from one floor to other floor. The Temperature information is displayed in LCD by using CAN Controller. The Practical Hardware is implemented as shown in Fig 3. Fig 3: Hardware Kit of Proposed System Fig 4: Implementation of Proposed System ## VI. CONCLUSION In this paper, we propose a design scheme of elevator multimedia lcd based on ARM chip. The data terminal based on ARM embedded system is actualized. Through the experiment, it is proved that the veracity of parameter transmit based on CAN bus is very high and the display module can easily plug and play. Therefore, this control method and communication module has more reliability and mobility if wireless communication is employed. As a field equipment bus, the CAN is more reliable and higher performance to price ratio than other bus. To sum up, because of the superior performance of ARM system and the reliability and real-time performance of the data transmitted on CAN bus, the CAN bus combined with ARM is appropriate to industry and it has wide applied foreground in process control, motor manufacture equipment and so on. #### REFERENCES - [1]. Yantang Wang, Yibin Li, Rui Song, "Design and implementation of CAN device driver under embedded ARM Linux operating system," Computer Engineering and Applications, 2007, 43(15), 79-82. - [2]. 
Yamaoka T, Tamura H, "Information display method and process considerations in the TRON/GUI," TRON Project International Symposium, 1993, No. 10, 41-44.
- [3] Xianchun Wang, "Research on Embedded Graphical System Based on ARM and Linux," Microcomputer Information, 2007, vol. 23, pp. 13-15.
- [4] Jonathan Corbet, Greg Kroah-Hartman, Alessandro Rubini, "Linux Device Drivers, 3rd Edition," O'Reilly, 2005.
- [5] Jasmin Blanchette, Mark Summerfield, "C++ GUI Programming with Qt 3," Prentice Hall PTR, 2004.
- [6] Donald Hearn, M. Pauline Baker, "C
- [7] ZhouLiGong. To explain profound theories in simple language about ARM7 based on LPC213x/214x [M]. Beijing Aerospace University Press, 2006.
- [8] Labrosse, Jean J. MicroC/OS-II the real-time kernel [M]. American. CMP Books, 1957.
- [9] Guangxi electric FengBao joint laboratory. The ARM principle and Embedded application based on LPC2400 processor and IAR Development environment [M]. Electronics Industry Publishing House, 2008.
- [10] PHILIPS company, LPC24XX User manual Rev. 02, 2008.

# **License Plate Extraction Using Bernsen Algorithm**

#### M.T. Ganesh Kumar & Mahalingamma Manur.

Digital Electronics Communication, E&C Dept., Dayanand Sagar College of Engineering, Bangalore, India

E-mail: manu.manur@gmail.com

Abstract - In this paper, license plate recognition (LPR) is proposed using an integration of the Bernsen algorithm and a support vector machine (SVM). The Bernsen algorithm is mainly used for binarization and, combined with a Gaussian filter, also performs shadow removal, whereas the SVM is used for character recognition. Our algorithm is robust to variations in illumination, view angle, position, size and color of the license plate while working in a complex environment.

**Keywords-** Bernsen algorithm, character recognition, feature extraction, license plate recognition (LPR), support vector machine (SVM).

#### I. INTRODUCTION

Automatic vehicle identification is an essential stage in intelligent traffic systems.
Nowadays vehicles play a very big role in transportation, and their use has been increasing in recent years because of population growth and human needs. Control of vehicles is therefore becoming a bigger problem that is much more difficult to solve. [1] Automatic vehicle identification systems are used for effective control. License plate recognition (LPR) is a form of automatic vehicle identification. It is an image processing technology used to identify vehicles by their license plates alone. Real-time LPR plays a major role in automatic monitoring of traffic rules and in law enforcement on public roads. Since every vehicle carries a unique license plate, no external cards, tags or transmitters are needed; only the license plate has to be recognizable. Different applications may call for rather different license plate recognition systems in terms of layout, hardware and technology, and even for the same application manufacturers provide LPR systems with similar functionality but quite different structure. [2]

License plate recognition (LPR) plays an important role in numerous applications such as unattended parking lots, security control of restricted areas and traffic safety enforcement. A typical LPR system consists of four parts: obtaining an image of the vehicle, license plate localization and segmentation, character segmentation and standardization, and character recognition. These are discussed in the following sections.

#### II. RELATED WORK

Recognition algorithms reported in previous research are generally composed of several processing steps, such as extraction of a license plate region, segmentation of characters from the plate and recognition of each character. Papers that follow this three-step framework are covered in this section according to their major contribution.
The major goal of this section is to provide a brief reference source for researchers involved in license plate identification and recognition, regardless of particular application areas (e.g., billing, traffic surveillance etc.). [2][3]

As far as extraction of the plate region is concerned, techniques based upon combinations of edge statistics and mathematical morphology have produced very good results. In these methods the gradient magnitude and its local variance in an image are computed. They are based on the property that the brightness change in the license-plate region is more remarkable and more frequent than elsewhere. Block-based processing is also supported. Regions with a high edge magnitude and high edge variance are then identified as possible license plate regions. Since this method does not depend on the edge of the license-plate boundary, it can be applied to an image with an unclear license-plate boundary and can be implemented simply and fast. A disadvantage is that edge-based methods alone can hardly be applied to complex images, since they are too sensitive to unwanted edges which may also show high edge magnitude or variance (e.g., the radiator region in the front view of the vehicle).

Fuzzy logic has been applied to the problem of locating license plates. The authors made some intuitive rules to describe the license plate, and gave membership functions for the fuzzy sets "bright", "dark", "bright and dark sequence", "texture" and "yellowness" to get the horizontal and vertical plate positions. [7][8] But these methods are sensitive to the license plate color and brightness and need longer processing time than the conventional color-based methods. Consequently, in spite of achieving better results, they still carry the disadvantages of the color-based schemes.

Gabor filters have been one of the major tools for texture analysis. This technique has the advantage of analyzing texture in an unlimited number of directions and scales.
A method for license plate location based on the Gabor transform has been presented. The results were encouraging (98% for LP detection) when applied to digital images acquired strictly at a fixed and specific angle. However, the method is computationally expensive and slow for large images. [5][6] For a 2D input image of size $N \times N$ and a 2D Gabor filter of size $W \times W$, the computational complexity of 2D Gabor filtering is of the order of $W^2N^2$, given that the image orientation is fixed at a specific angle. This method was therefore tested on small sample images, and it was reported that further work remains to be done to alleviate the limitations of 2D Gabor filtering. [10]

In this paper, our work focuses on a solution for image disturbance resulting from uneven illumination and various outdoor conditions such as shadow and exposure, for which traditional binarization methods generally struggle to produce good results. Additionally, we discuss feature extraction methods for Kannada, English and numeric characters and adopt a support vector machine (SVM) for classification. The novel contributions of this paper are as follows: 1) a novel binarization method, i.e., the shadow removal method, which is based on an improved Bernsen algorithm combined with the Gaussian filter; and 2) a character recognition algorithm, the SVM integration, where the character features are extracted from the elastic mesh and the entire address character string is taken as the object of study, as opposed to a single character.

# III. IMPLEMENTATION

Implementation of the LPR system involves four steps, namely preprocessing, localization, enhancement and segmentation. The structure of the LPR system is shown in Figure 1.

Figure 1: Structure of the LP Extraction system

- 1. Preprocessing: This step takes the input, which may be an image or a video, and converts it into frames.
Each frame is resized to 320\*240 or 160\*120, the color image is converted to gray scale, the image is smoothed using a Gaussian filter, and binarization is then performed using the Bernsen algorithm.
- **2.** Algorithm Used (Bernsen Algorithm): It is mainly used for shadow removal. Other binarization methods exist, chiefly Otsu and adaptive thresholding; Otsu works on a pixel basis and adaptive thresholding works on a block basis. The Bernsen method uses a user-provided contrast threshold. If the local contrast (max-min within the radius of the pixel) is above or equal to the contrast threshold, the threshold is set at the local mid-grey value (the mean of the minimum and maximum grey values in the local window). If the local contrast is below the contrast threshold, the neighborhood is considered to consist of only one class and the pixel is set to object or background depending on the value of the mid-grey.

<pre>
if (local_contrast &lt; globalContrast)
    pixel = (mid_gray &gt;= 128) ? object : background
else
    pixel = (pixel &gt;= mid_gray) ? object : background
</pre>

  local\_contrast: the difference between the maximum and minimum grey values within the local window around the pixel.

  globalContrast: the user-provided contrast threshold, a single value applied over the entire image.
- **3. Localization**: This is done using the CCA (Connected Component Analysis) algorithm. It works as follows:
  - Once region boundaries have been detected, it is often useful to extract regions which are not separated by a boundary.
  - Any set of pixels which is not separated by a boundary is called connected.
  - Each maximal region of connected pixels is called a connected component.
  - The set of connected components partitions an image into segments.
  - Image segmentation is a useful operation in many image processing applications. [9]

## **Connected Neighbors**

- Let $\partial s$ be a neighborhood system.
  - 4-point neighborhood system
  - 8-point neighborhood system
- Let $c(s)$ be the set of neighbors that are connected to the point $s$.
- For all $s$ and $r$, the set $c(s)$ must have the properties that
  - $c(s) \subset \partial s$
  - $r \in c(s) \Leftrightarrow s \in c(r)$
- Example: $c(s) = \{r \in \partial s : X_r = X_s\}$
- Example: $c(s) = \{r \in \partial s : |X_r - X_s| < \text{Threshold}\}$
- In general, computation of $c(s)$ might be very difficult, but we won't worry about that.

# IV. EXPERIMENTAL RESULTS

Simulation is done using MATLAB. The results are given below.

Step 1: The input image is taken for preprocessing.

Figure 2: License plate Recognition System

Step 2: The input image is converted to a grey image.

Figure 3: Gray scale Image

Step 3: The grey image is converted to a binary image using Bernsen's algorithm.

Figure 4: Binary Image

Step 4: Smoothing of the image using a Gaussian filter

Figure 5: Noise removed Based on Area

Step 5: Noise removed based on height using the CCA algorithm

Figure 6: Noise removed Based on Height

Step 6: Noise removed based on width

Figure 7: Noise removed Based on Height

Step 7: Identified license plate

Figure 8: Current License Plate

# CONCLUSION

In this paper we have implemented an LPR system using the Bernsen and SVM algorithms. The Bernsen algorithm is used for binarization and shadow removal, whereas the SVM algorithm is used for character recognition. The system is designed for identifying vehicles and their owners in a casino environment and for stolen vehicle identification in parking lots. The Intelligent License Plate Recognition (ILPR) platform, combined with the ITrak Incident Reporting and Risk Management System, can improve the safety and security of both public and private facilities.

# REFERENCES

- [1] S. Ozbay and E. Ercelebi, "Automatic vehicle identification by plate recognition," World Academy of Science, Engineering and Technology 9, 2005.
- [2] C. Anagnostopoulos, I. Anagnostopoulos, V. Loumos, and E. Kayafas, "A license plate-recognition algorithm for intelligent transportation system applications," *IEEE Trans. Intell. Transp. Syst.*, vol. 7, no. 3, pp. 377–392, Sep. 2006.
- [3] D. N. Zheng, Y. N. Zhao, and J. X.
Wang, "An efficient method of license plate location," *Pattern Recognit. Lett.*, vol. 26, no. 15, pp. 2431–2438, Nov. 2005.
- [4] J. B. Jiao, Q. X. Ye, and Q. M. Huang, "A configurable method for multistyle license plate recognition," *Pattern Recognit.*, vol. 42, no. 3, pp. 358–369, Mar. 2009.
- [5] H. Caner, H. S. Gecim, and A. Z. Alkar, "Efficient embedded neural-network-based license plate recognition system," *IEEE Trans. Veh. Technol.*, vol. 57, no. 5, pp. 2675–2683, Sep. 2008.
- [6] P. Comelli, P. Ferragina, M. N. Granieri, and F. Stabile, "Optical recognition of motor vehicle license plates," *IEEE Trans. Veh. Technol.*, vol. 44, no. 4, pp. 790–799, Nov. 1995.
- [7] S. L. Chang, L. S. Chen, Y. C. Chung, and S. W. Chen, "Automatic license plate recognition," *IEEE Trans. Intell. Transp. Syst.*, vol. 5, no. 1, pp. 42–52, Mar. 2004.
- [8] Z. G. Xu and H. L. Zhu, "An efficient method of locating vehicle license plate," in *Proc. 3rd ICNC*, 2007, pp. 180–183.
- [9] G. H. Ming, A. L. Harvey, and P. Danelutti, "Car number plate detection with edge image improvement," in *Proc. 4th Int. Symp. Signal Process Appl.*, 1996, vol. 2, pp. 597–600.
- [10] H. A. Hegt, R. J. Dela Haye, and N. A. Khan, "A high performance license plate recognition system," in *Proc. IEEE Int. Conf. Syst., Man, Cybern.*, 1998, vol. 5, pp. 4357–4362.

# Design And Simulation of Micro Based Built In Self Test For Fault Detection And Repair In Memories

#### D.Satheesh Chandra Kumar, G. Kiran Kumar

Department of Electronics (E.C.E), Audi Sankara College of Engineering, Gudur, Nellore, Andhra Pradesh A.P, India.

E-mail: satheesh412@gmail.com, kiran.ece@audisankara.com

Abstract - As on-chip embedded memory area increases and memory density grows, the problem of faults is growing exponentially. Newer test algorithms are being developed to detect these new faults. These new March algorithms have many more operations than the earlier March algorithms.
An architecture implementing these new algorithms is presented here. This is illustrated by implementing the newly defined March SS algorithm. Along with fault diagnosis, a word-oriented memory Built-in Self Repair methodology, which supports on-the-fly memory repair, is employed to repair the faulty locations indicated by the MBIST controller presented.

**Keywords-** DCT Built-In Self Test (BIST); Built-In Self Repair (BISR); Defect-Per Million (DPM); Memory Built-in Self Test (MBIST); Microcoded MBIST; Memory Built-In Self Repair (MBISR).

#### I. INTRODUCTION

According to the 2001 ITRS, today's systems on chip (SoCs) are moving from logic-dominant chips to memory-dominant chips in order to meet current and future application requirements. The dominating logic (about 64% in 1999) is changing to dominating memory (approaching 90% by 2011) [1], as shown in Figure 1.

Figure 1. The future of embedded memory

These shrinking technologies give rise to new defects, and new fault models have to be defined to detect and eliminate them. These new fault models are used to develop new high-coverage test and diagnostic algorithms. The greater the fault detection and localization coverage, the higher the repair efficiency and hence the higher the obtained yield. Memory repair is therefore necessary: since just detecting the faults is no longer sufficient for SoCs, both diagnosis and repair algorithms are required. Thus, the new trends in memory testing will be driven by the following items:

- Fault modeling: New fault models should be established in order to deal with the new defects introduced by current and future (deep-submicron) technologies.
- Test algorithm design: Optimal test/diagnosis algorithms to guarantee high defect coverage for the new memory technologies and reduce the DPM level.
- BIST: The only solution that allows at-speed testing for embedded memories.
- BISR: Combining BIST with efficient and low-cost repair schemes in order to improve the yield and system reliability as well.

March SS [5] and March RAW [3] are two examples of such newly developed test algorithms, which detect some recently developed static and dynamic fault models. A new microcoded BIST architecture capable of employing these new test algorithms is presented here. A word-oriented BISR array is used to repair the faulty memory locations indicated by the BIST controller. The interface of the repair array with the BIST controller and the memory under test is shown in Figure 2.

Figure 2. Principal Structure: MBIST and repair logic interface

#### II. MICROCODE MBIST CONTROLLER

The DPM screening of a large number of tests applied to a large number of memory chips showed that many well-known fault models, developed before the late 1990s, failed to explain the occurrence of complex faults. This implied that new memory technologies, with their high density of shrinking devices, lead to newer faults. This stimulated the introduction of new fault models, based on defect injection and SPICE simulation. Some of these newly defined fault models [2] are the Write Disturb Fault (WDF), the Transition Coupling Fault (Cft) and the Deceptive Read Disturb Coupling Fault (Cfdrd). Another class of faults, called dynamic faults, which require more than one operation to be performed sequentially in time in order to be sensitized, has also been defined. [3-4]

These new faults cannot be easily detected by established tests like March C-, rendering such tests inadequate for today's and future high-speed memories. Architectures that were developed to implement earlier tests like March C- may therefore not be able to implement these newer test algorithms easily. The reason is that most of the newly developed algorithms have up to six or seven (or even more) test operations per test element.
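To make the difference in element length concrete, the following minimal sketch writes March C- and March SS as plain data and counts the operations per element. The tuple encoding and the helper names are assumptions made for illustration (they are not the microcode format of this work); the element sequences follow the commonly published definitions of the two tests.

```python
# Each march element is (direction, [operations]); direction is "up",
# "down" or "any". This encoding is an illustrative assumption.
MARCH_C_MINUS = [
    ("any", ["w0"]),
    ("up", ["r0", "w1"]),
    ("up", ["r1", "w0"]),
    ("down", ["r0", "w1"]),
    ("down", ["r1", "w0"]),
    ("any", ["r0"]),
]

MARCH_SS = [
    ("any", ["w0"]),
    ("up", ["r0", "r0", "w0", "r0", "w1"]),
    ("up", ["r1", "r1", "w1", "r1", "w0"]),
    ("down", ["r0", "r0", "w0", "r0", "w1"]),
    ("down", ["r1", "r1", "w1", "r1", "w0"]),
    ("any", ["r0"]),
]


def max_ops_per_element(algorithm):
    """Largest number of operations found in any single march element."""
    return max(len(ops) for _, ops in algorithm)


def ops_per_cell(algorithm):
    """Total operations applied to each memory cell over the whole test."""
    return sum(len(ops) for _, ops in algorithm)


for name, algorithm in (("March C-", MARCH_C_MINUS), ("March SS", MARCH_SS)):
    print(name, max_ops_per_element(algorithm), ops_per_cell(algorithm))
```

A controller limited to two operations per element can hold every element of the first list but none of the four middle elements of the second.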
For example, test elements M1 through M4 of the March SS algorithm have five test operations per element. This is in contrast with some of the earlier algorithms like March B, MATS+ and March C-, which had only up to two operations per March element. Thus some of the recently developed architectures [6] that were specifically designed to implement these older algorithms can only implement up to two march operations per march element, rendering them incapable of easily implementing the new test algorithms.

The proposed architecture can execute algorithms with an unlimited number of operations per March element. Thus almost all of the recently developed March algorithms can be implemented and applied using this architecture. This is illustrated in the present work by implementing the March SS algorithm. The same hardware has also been used to implement other new March algorithms. This requires changing only the Instruction storage unit, or the instruction codes and their sequence inside the Instruction storage unit. The Instruction storage unit is used to store the predetermined test pattern.

#### A) Methodology

The block diagram of the BIST controller architecture, together with the fault diagnosis interface through the input MUX, is shown in Figure 3. The BIST Control Circuitry consists of the Clock Generator, Pulse Generator, Instruction Pointer, Microcode Instruction storage unit and Instruction Register. The Test Collar circuitry consists of the Address Generator, RW Control and Data Control.

Clock Generator synthesizes the local clock signals Clock2, Clock3, Clock4, Clock5 and Clock6 from the input clock (named Clock1), as shown in Figure 4. As can be seen from the figure, the derived clocks (Clock2 through Clock6) all run at one fourth the frequency of the input clock, Clock1. Clock4 and Clock5 are delayed versions of Clock3, whereas Clock6 is the inverse of Clock3. These local clock signals are used by the rest of the circuitry as shown in Figure 3.
Figure 4. Simulated waveform of Clock generator Module

**Pulse Generator** generates a 'Start Pulse' at the positive edge of the 'Start' signal, marking the start of the test cycle.

Instruction Pointer points to the next microword, that is, the next march operation to be applied to the memory under test (MUT). Depending on the test algorithm, it can i) point at the same address, ii) point to the next address, or iii) jump back to a previous address. The flowchart in Figure 5 describes the functioning of the Instruction Pointer.

Figure 5. Flowchart illustrating functional operation of Instruction Pointer

Here, 'Run complete' indicates that a particular march test operation has marched through the entire address space of the MUT in increasing or decreasing order, as dictated by the microcode.

*Instruction Register* holds the microword (containing the test operation to be applied) pointed at by the Instruction Pointer. The relevant bits of the microword are sent from the IR to the other blocks.

Address Generator points to the next memory address in the MUT, according to the test pattern sequence. It can address the memory in the forward as well as the backward direction.

RW Control generates the read or write control signal for the MUT, depending on the relevant microword bits.

Data Control generates the data to be written to, or expected to be read out from, the memory location pointed at by the Address Generator.

The Address Generator, RW Control and Data Control together constitute the Memory *Test Collar*.

*Input Multiplexer* directs the input to the memory by switching between the test algorithm input and the input given externally during normal mode. The control signal for this multiplexer is also given externally by the user. If it indicates test mode, the test data generated internally by the BIST controller is given to the memory as input from the Test Collar. In normal mode the memory responds to the external address, data and read/write signals.
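As a rough illustration of how the Test Collar blocks cooperate, the sketch below models the Address Generator, RW Control and Data Control as one Python generator and the Input Multiplexer as a simple selector. All names (`address_sequence`, `collar_signals`, `input_mux`) and the use of a single data bit in place of a data byte are assumptions for illustration; the design described in this work is coded in Verilog.

```python
def address_sequence(num_words, decreasing):
    """Address Generator: visit every address in increasing or decreasing order."""
    if decreasing:
        return range(num_words - 1, -1, -1)
    return range(num_words)


def collar_signals(num_words, decreasing, operations):
    """Yield (address, write_enable, data_bit) for one march element.

    `operations` is a list of (write_enable, data_bit) pairs. Every
    operation is applied to the current address before the Address
    Generator moves on to the next one.
    """
    for address in address_sequence(num_words, decreasing):
        for write_enable, data_bit in operations:  # RW Control, Data Control
            yield address, write_enable, data_bit


def input_mux(test_mode, collar_inputs, external_inputs):
    """Input Multiplexer: the Test Collar drives the memory only in test mode."""
    return collar_inputs if test_mode else external_inputs


# A two-operation element (read 0, then write 1) over a 3-word memory,
# addressed in decreasing order.
for signals in collar_signals(3, True, [(0, 0), (1, 1)]):
    print(signals)
```

Keeping the operations in the inner loop mirrors the rule that a march element finishes all of its operations on one location before the address changes.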
Fault Diagnosis module works during the test mode to give the fault waveform, which consists of a positive pulse whenever the value read out of the memory does not match the expected value given by the Test Collar. In addition, it gives diagnostic information such as the faulty memory location address and the expected (correct) data value. This diagnostic information is used for programming the repair redundancy array, as explained in the following section.

#### B) Microcode Instruction specification.

The microcode is a binary code that consists of a fixed number of bits, each bit specifying a particular data or operation value. As there is no standard for developing a microcode MBIST instruction [7], the microcode instruction fields can be structured by the designer depending on the test pattern algorithm to be used. The microcode instruction developed in this work is coded to denote one operation in a single microword. Thus a five-operation March element is made up of five microcode words. The format of the 7-bit microcode MBIST instruction word is shown in Fig. 6. Its fields are explained as follows:

Bit #1 (=1) indicates a valid microcode instruction; otherwise, it indicates the end of test for the BIST Controller. Bits #2, #3 and #4 specify the first operation, in-between operation and last operation of a multi-operation March element, interpreted as shown in Figure 6.

| #1 | #2 | #3 | #4 | #5 | #6 | #7 |
|-------|----|----|----|-----|-----|------|
| Valid | Fo | Io | Lo | I/D | R/W | Data |

| Fo | Io | Lo | Description |
|----|----|----|----------------------------------------------------|
| 0 | 0 | 0 | A single operation element |
| 1 | 0 | 0 | First operation of a multi-operation element |
| 0 | 1 | 0 | In-between operation of a multi-operation element |
| 0 | 0 | 1 | Last operation of a multi-operation element |

Figure 6.
Format of Microcode Instruction word

Bit #5 (=1) indicates that the memory under test (MUT) is to be addressed in decreasing order; otherwise it is accessed in increasing order. Bit #6 (=1) indicates that the test pattern data is to be written into the MUT; otherwise it is read from the MUT. Bit #7 (=1) signifies that a byte of all ones is to be generated (written to the MUT or expected to be read out from the MUT); otherwise a byte of all zeroes is generated.

The instruction word is designed so that it can accommodate any existing or future March algorithm. The contents of the Instruction storage unit for the March SS algorithm are shown in Table 1. The first march element M0 is a single-operation element, which writes zero to all memory cells in any order, whereas the second march element M1 is a multi-operation element, which consists of five operations: i) R0, ii) R0, iii) W0, iv) R0 and v) W1. The MUT is addressed in increasing order, and each of these five operations is performed on a memory location before moving on to the next location.

Table 1. Contents of the Instruction storage unit for the March SS algorithm

| | #1 Valid | #2 Fo | #3 Io | #4 Lo | #5 I/D (0/1) | #6 R/W (0/1) | #7 Data (0/1) |
|------------|---|---|---|---|---|---|---|
| M0: ⇕ (W0) | 1 | 0 | 0 | 0 | 0 | 1 | 0 |
| M1: ⇑ (R0 | 1 | 1 | 0 | 0 | 0 | 0 | 0 |
| R0 | 1 | 0 | 1 | 0 | 0 | 0 | 0 |
| W0 | 1 | 0 | 1 | 0 | 0 | 1 | 0 |
| R0 | 1 | 0 | 1 | 0 | 0 | 0 | 0 |
| W1) | 1 | 0 | 0 | 1 | 0 | 1 | 1 |
| M2: ⇑ (R1 | 1 | 1 | 0 | 0 | 0 | 0 | 1 |
| R1 | 1 | 0 | 1 | 0 | 0 | 0 | 1 |
| W1 | 1 | 0 | 1 | 0 | 0 | 1 | 1 |
| R1 | 1 | 0 | 1 | 0 | 0 | 0 | 1 |
| W0) | 1 | 0 | 0 | 1 | 0 | 1 | 0 |
| M3: ⇓ (R0 | 1 | 1 | 0 | 0 | 1 | 0 | 0 |
| R0 | 1 | 0 | 1 | 0 | 1 | 0 | 0 |
| W0 | 1 | 0 | 1 | 0 | 1 | 1 | 0 |
| R0 | 1 | 0 | 1 | 0 | 1 | 0 | 0 |
| W1) | 1 | 0 | 0 | 1 | 1 | 1 | 1 |
| M4: ⇓ (R1 | 1 | 1 | 0 | 0 | 1 | 0 | 1 |
| R1 | 1 | 0 | 1 | 0 | 1 | 0 | 1 |
| W1 | 1 | 0 | 1 | 0 | 1 | 1 | 1 |
| R1 | 1 | 0 | 1 | 0 | 1 | 0 | 1 |
| W0) | 1 | 0 | 0 | 1 | 1 | 1 | 0 |
| M5: ⇕ (R0) | 1 | 0 | 0 | 0 | 1 | 0 | 0 |
| (end of test) | 0 | X | X | X | X | X | X |

# III. WORD REDUNDANCY MBISR

The BISR mechanism used here [17] employs an array of redundant words placed in parallel with the memory. These redundant words are used in place of faulty words in memory. For successful interfacing with already existing BIST solutions, as shown in Fig. 2, the following interface signals are taken from the MBIST logic:

- 1) A fault pulse indicating a faulty location address
- 2) Fault address
- 3) Expected data or correct data that is compared with the results of the memory under test

The MBISR logic used here can function in two modes.

#### A) Mode 1: Test & Repair Mode

In this mode the input multiplexer connects the test collar input, as generated by the BIST controller circuitry, to the memory under test. A redundancy word is shown in Figure 7. The fault pulse acts as an activation signal for programming the array. The redundancy word is divided into three fields. The FA (fault asserted) flag indicates that a fault has been detected. The IE and OE signals respectively act as control signals for writing into and reading from the data field of the redundant word. The complete logic for programming the redundancy array is shown in Figure 8.

Figure 7. Redundancy Word Line

Figure 8.
Flowchart describing programming of redundancy array

#### B) Mode 2: Normal Mode

During normal mode each incoming address is compared with the address field of the programmed redundant words. If there is a match, the data field of the redundant word is used along with the faulty memory location for reading and writing data. In case of a read (R/W=0), the output multiplexer of the Redundant Array Logic then ensures that, on a match, the redundant word data field is selected over the data read out of the faulty location. This can be understood from the redundancy word detail shown in Figure 7. Figure 9 shows the Repair Module, including the redundancy array and output multiplexer, and its interfacing with the existing BIST module.

Figure 9. Redundancy Array Logic

### IV. RESULTS

Mentor Graphics ModelSim has been used to verify the functionality and timing constraints of the Verilog-coded BIST module, the repair redundancy array and their interface. The full architecture containing all these modules has been successfully synthesized using Xilinx ISE 9.2i. The Design Synthesis Report shows that a total of 163 slices, 198 slice flip-flops, 188 4-input LUTs and 30 bonded IOBs have been used in the synthesis.

The simulation waveform of a fault-free SRAM is shown in Figure 10. The top module here is 'bist', which comprises the glue logic and interfacing of the BIST Controller (including the test collar), MUT, fault diagnosis module and repair array. As the START signal goes high, indicating the start of test, the first March element M0 of the March SS algorithm is executed. Since this is a write operation, no values are read out from the memory to be compared with the expected values, and hence the output FAULT waveform of the comparator shows high impedance for some initial clock cycles. As read operations start at the beginning of execution of the M1 element, the values from the MUT are read out and compared with the expected values.
The FAULT waveform shows a 'low' level throughout the test for a fault-free SRAM.

The SRAM model is also put into a defective state by inserting faults. The simulated waveform is shown in Figure 11. The inserted faults are a Deceptive Read Disturb Fault (DRDF) at location 11, a Write Disturb Fault (WDF) at location 13, a Deceptive Read Disturb Coupling Fault (CFdrd) at location 9 (victim) due to location 10 (aggressor), and a Write Disturb Coupling Fault (CFwd) at location 14 (victim) due to location 15 (aggressor) [9]. The fault detect waveform shows 12 pulses due to the above faults in the given four locations, as the test elements march through the MUT to uncover these defects. The above faults cannot be detected by the March C- algorithm but are easily detected by the March SS algorithm, which has been implemented by the architecture presented in this work. Figure 12 shows the simulated waveform of the fault diagnosis module, magnified at the seventh pulse to indicate how signals like 'fault pulse', 'faulty location address' and 'correct data' are generated by this module for successful interfacing with the Redundancy Array logic.

Figure 10. Simulated waveform of fault-free SRAM

Figure 11. Simulated waveform of faulty SRAM

Figure 12. Simulated waveform of Fault Detection module magnified at the 7th Fault Pulse

#### **CONCLUSION**

The simulation results have shown that the microcoded MBIST architecture is able to implement new test algorithms. Implementing a single test operation in one microword ensures that any future test algorithm, with any number of test operations per test element, can be implemented using the current BIST architecture. Moreover, the approach is flexible: any new march algorithm other than March SS can also be implemented using the same BIST hardware by changing the instructions in the microcode storage unit, without the need to redesign the entire circuitry.
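The following minimal software sketch illustrates this flexibility: the same interpreter runs either test simply by being handed a different list of microwords. It is a behavioural model under stated assumptions, not the Verilog design. The 7-character string encoding, the `Memory` class, the single-cell Write Disturb Fault model (a non-transition write flips the cell) and the power-up contents of all ones are hypothetical choices made for illustration, and the March C- microcode is written here by analogy. The March SS words follow the bit values of Table 1.

```python
# Microwords are 7-character strings: Valid, Fo, Io, Lo, I/D, R/W, Data.
MARCH_SS = [
    "1000010",                                              # M0: any  (w0)
    "1100000", "1010000", "1010010", "1010000", "1001011",  # M1: up   (r0,r0,w0,r0,w1)
    "1100001", "1010001", "1010011", "1010001", "1001010",  # M2: up   (r1,r1,w1,r1,w0)
    "1100100", "1010100", "1010110", "1010100", "1001111",  # M3: down (r0,r0,w0,r0,w1)
    "1100101", "1010101", "1010111", "1010101", "1001110",  # M4: down (r1,r1,w1,r1,w0)
    "1000100",                                              # M5: any  (r0)
    "0000000",                                              # Valid = 0: end of test
]
MARCH_C_MINUS = [  # assumed encoding of March C- in the same format
    "1000010",
    "1100000", "1001011",
    "1100001", "1001010",
    "1100100", "1001111",
    "1100101", "1001110",
    "1000100",
    "0000000",
]


class Memory:
    """Bit-wide memory model with an optional Write Disturb Fault (WDF)."""

    def __init__(self, size, wdf_address=None):
        self.cells = [1] * size          # assumed power-up contents
        self.wdf_address = wdf_address

    def read(self, address):
        return self.cells[address]

    def write(self, address, bit):
        if address == self.wdf_address and self.cells[address] == bit:
            self.cells[address] = 1 - bit   # non-transition write flips the cell
        else:
            self.cells[address] = bit


def run_mbist(microcode, memory):
    """Execute the microcode; return (faulty address, expected bit) records."""
    faults = []
    size = len(memory.cells)
    ip = 0                                            # instruction pointer
    while microcode[ip][0] == "1":                    # bit #1: valid word
        start = ip
        while microcode[ip][1:4] in ("100", "010"):   # Fo or Io: element continues
            ip += 1
        element = microcode[start:ip + 1]             # ends at Lo or single-op word
        ip += 1
        decreasing = element[0][4] == "1"             # bit #5: I/D
        addresses = range(size - 1, -1, -1) if decreasing else range(size)
        for address in addresses:
            for word in element:                      # all operations on one address
                data = int(word[6])                   # bit #7: data
                if word[5] == "1":                    # bit #6: write
                    memory.write(address, data)
                elif memory.read(address) != data:    # read and compare
                    faults.append((address, data))
    return faults


ss_faults = run_mbist(MARCH_SS, Memory(16, wdf_address=13))
c_minus_faults = run_mbist(MARCH_C_MINUS, Memory(16, wdf_address=13))
print("March SS:", ss_faults)
print("March C-:", c_minus_faults)
```

In this toy model the March SS microcode reports mismatches at address 13 while the March C- microcode reports none, because March C- never issues a non-transition write once the memory has been initialised. The outcome depends on the assumed power-up contents and says nothing about the pulse counts in the simulated waveforms.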
A detailed power, time and area overhead analysis of this architecture is underway, and efforts are being made to develop a power-optimized BIST architecture for embedded memories.

The word redundancy scheme uses spare words in place of spare rows and columns. This repair mechanism avoids the lengthy redundancy calculations suggested by some other authors [18], [19], as it stores faulty location addresses immediately, supporting on-the-fly fault repair, and it can be interfaced easily with existing MBIST logic.

## REFERENCES

- [1] International SEMATECH, "International Technology Roadmap for Semiconductors (ITRS): Edition 2001"
- [2] S. Hamdioui, G.N. Gaydadjiev, A.J. van de Goor, "State-of-art and Future Trends in Testing Embedded Memories", *International Workshop on Memory Technology, Design and Testing (MTDT'04)*, 2004.
- [3] S. Hamdioui, Z. Al-Ars, A.J. van de Goor, "Testing Static and Dynamic Faults in Random Access Memories", In Proc. of IEEE VLSI Test Symposium, pp. 395-400, 2002.
- [4] S. Hamdioui, et al., "Importance of Dynamic Faults for New SRAM Technologies", In IEEE Proc. of European Test Workshop, pp. 29-34, 2003.
- [5] S. Hamdioui, A.J. van de Goor and M. Rodgers, "March SS: A Test for All Static Simple RAM Faults", In Proc. of IEEE International Workshop on Memory Technology, Design, and Testing, pp. 95-100, Bendor, France, 2002.
- [6] N. Z. Haron, S.A.M. Junos, A.S.A. Aziz, "Modelling and Simulation of Microcode Built-In Self test Architecture for Embedded Memories", In Proc. of IEEE International Symposium on Communications and Information Technologies, pp. 136-139, 2007.
- [7] R. Dean Adams, "High Performance Memory Testing: Design Principles, Fault Modeling and Self-Test", Springer US, 2003.
- [8] "Xilinx ISE 6 Software Manuals and help PDF Collection", http://toolbox.xilinx.com/docsan/xilinx7/books/manuals.pdf
- [9] A.J. van de Goor and Z. Al-Ars, "Functional Fault Models: A Formal Notation and Taxonomy", In Proc.
of IEEE VLSI Test Symposium, pp. 281-289, 2000.
- [10] K. Zarrineh and S.J. Upadhyaya, "On Programmable Memory Built-In Self Test Architectures", *Design, Automation and Test in Europe Conference and Exhibition 1999*, Proceedings, 1999, pp. 708-713.
- [11] Sungju Park et al., "Microcode-Based Memory BIST Implementing Modified March Algorithms", Journal of the Korean Physical Society, Vol. 40, No. 4, April 2002, pp. 749-753.
- [12] A.J. van de Goor, "Using March tests to test SRAMs", *Design & Test of Computers, IEEE*, Volume 10, Issue 1, March 1993, pp. 8-14.
- [13] R. Dekker, F. Beenker and L. Thijssen, "Fault Modeling and Test Algorithm Development for Static Random Access Memories", Proc. IEEE Int. Test Conference, Washington D.C., 1988, pp. 343-352.
- [14] R. Dekker, F. Beenker, L. Thijssen, "A realistic fault model and test algorithm for static random access memories", *IEEE Transactions on CAD*, Vol. 9(6), pp. 567-572, June 1990.
- [15] B. F. Cockburn, "Tutorial on Semiconductor Memory Testing", *Journal of Electronic Testing: Theory and Applications*, 5, pp. 321-336, 1994, Kluwer Academic Publishers, Boston.
- [16] A.J. van de Goor, "Testing Semiconductor Memories, Theory and Practice", ComTex Publishing, Gouda, Netherlands.
- [17] V. Schober, S. Paul, and O. Picot, "Memory built-in self-repair using redundant words," in Proc. Int. Test Conf. (ITC), Baltimore, Oct. 2001, pp. 995-1001.
- [18] C.-T. Huang, C.-F. Wu, J.-F. Li, and C.-W. Wu, "Built-in redundancy analysis for memory yield improvement," IEEE Trans. on Reliability, vol. 52, no. 4, pp. 386-399, Dec. 2003.
- [19] J.-F. Li, J.-C. Yeh, R.-F. Huang, and C.-W. Wu, "A built-in self-repair design for RAMs with 2-D redundancies," IEEE Trans. on VLSI Systems, vol. 13, no. 6, pp. 742-745, June 2005.

# VHDL Implementation of Genetic Algorithm for 2-bit Adder

## <sup>1</sup>Vedavathi.A, <sup>2</sup>Meena.K.V & <sup>3</sup>Gayatri.Malhotra

<sup>1&2</sup>The Oxford College of Engineering, Visvesvaraya Technological University, Bangalore, India.
<sup>3</sup>CSG, ISRO Satellite Centre, Indian Space Research Organisation, Bangalore, India

Email: starvedhachitty@gmail.com, kvmeena@gmail.com & gayatri@isac.gov.in

Abstract - Future planetary and deep space exploration demands that space vehicles have robust system architectures and be reconfigurable in unpredictable environments. The evolutionary design of electronic circuits, or Evolvable Hardware (EHW), is a discipline that allows the user to obtain the desired circuit design automatically. The circuit configuration is under the control of evolutionary algorithms, the most commonly used of which is the Genetic Algorithm. The paper discusses Cartesian Genetic Programming for evolving gate-level designs and proposes an evolvable unit for a 2-bit adder based on the Genetic Algorithm.

Keywords - Cartesian Genetic Programming, Evolvable hardware, Genetic Algorithm, 2-bit Adder, Reconfigurable FPGA, Virtual Reconfigurable Circuit.

#### I. INTRODUCTION

Digital reconfigurable circuits implemented using FPGAs suit many different areas of application. Generally, the reconfiguration sequence is determined at design time. Adapting the system to a new environment requires a totally new hardware configuration that was not considered at design time. Evolvable Hardware is the field associated with the dynamic adaptation of hardware using bio-inspired techniques such as the Genetic Algorithm (GA). GA is a stochastic search method that operates on a population of potential solutions and applies the principle of survival of the fittest to produce better approximations to a solution.

Evolvable Hardware (EHW) is broadly useful in two areas: (1) automatic generation of new solutions and (2) implementation of autonomous adaptive devices. For case (1), the evolutionary algorithm (EA) is used in the design phase; this is the approach applied in this paper.
For case (2), the EA is responsible for continual adaptation of a device in a changing environment or for automatic functional recovery [3] of a device after damage.

GA has long been applied to combinational logic design. The methods proposed so far are able to produce the correct functional output for combinational logic, and some of them also emphasize the optimization of gate usage [1], [2]. Reconfigurable computing aims to utilize hardware resources effectively and to maximize system efficiency; combining it with the evolvable hardware concept yields what is called evolvable computing. The objective of this paper is to propose a technique for performing evolvable computing at the level of HDL. As a case study, a 2-bit adder circuit for FPGA is designed at the level of VHDL. The structure of the digital circuit is encoded into a one-dimensional genotype, as in Cartesian Genetic Programming (CGP), and represented by a finite string of bits. The types of gates used are Wire, AND, OR, XOR, NOT and their combinations.

The paper is organized as follows. Section 2 covers FPGAs for evolvable hardware and GA. Section 3 covers the Virtual Reconfigurable Circuit and its implementation using CGP. Section 4 describes the proposed evolvable unit for the 2-bit adder. Section 5 presents experimental results. Section 6 contains the discussion and conclusion.

#### II. FPGA FOR EVOLVABLE HARDWARE

Implementing an evolvable system on an FPGA is cost-effective and flexible. Currently Xilinx is the most popular platform for the implementation of evolvable systems, and there are various approaches to using Xilinx FPGAs for reconfiguration [4], [5]. FPGA-based EHW can be built using one of two options:

(1) The FPGA serves in the fitness calculation only. The EA, executed on a personal computer, sends configuration bits to the FPGA in order to obtain fitness values.

(2) The entire system is built on the FPGA.
In the case of EHW, the chromosomes are transformed into configuration bit streams, and the configuration bit streams are uploaded into the FPGA. Each configuration bit in the chromosome defines some architectural feature of the reconfigurable hardware. A bit might give the state of a switch that connects circuit components, and every candidate configuration must be checked for fitness, which measures how closely it matches a target response [6], [1]. There are two different ways of checking fitness; which method is used depends on whether the evolution is done extrinsically or intrinsically:

- (1) Extrinsic evolution, in which a software circuit simulator is used to evaluate each circuit configuration.
- (2) Intrinsic evolution, in which circuits are sought using a hardware accelerator. Every chromosome is downloaded, and physical testing measures fitness using hardware.

In most commercial systems, features such as flexibility, scalability and ease of implementation play a vital role in the selection of an appropriate design. Flexibility defines whether the system can evolve different kinds of problems. Scalability defines the capability of a designed system to scale from the evolution of small tasks to larger problems without significant modification of its structure. Finally, ease of implementation defines how easy it is to implement the desired solution. These three parameters are given high priority in the development of an EHW implementation [9], [4]. Fig. 1 illustrates the types of intrinsic EHW implementation.

Fig. 1 Types of implementation of intrinsic EHW.

The proposed FPGA designs feature not only significant changes in chip structure, but also limitations on the accessible information about chip performance.
Hence FPGA designers suggest a solution for implementing a run-time configurable intrinsic EHW system: the FPGA-Based Run-Time Configuration System [4].

#### FPGA-Based Run-Time Configuration System

The proposed FPGA-based Run-Time Configuration System is divided into two parts: the Evolutionary Strategy (ES) and the Reconfigurable Hardware [4].

Fig. 2 FPGA-Based Runtime Configuration System Architecture

The reconfigurable hardware is the target to be evolved. It executes the desired functionality, and the fitness function calculation is also partially implemented in this module [4]. The ES block comprises the implementation of the evolutionary strategy itself, the fitness value evaluation, the chromosome back-up and the management of the communication protocol used to configure the evolved target (the reconfigurable hardware). This system allows the smallest possible number of gates to be used and gives very high speed during reconfiguration as well as during the evolutionary process. The complexity of the communication between the ES and the reconfigurable hardware lies in the protocols needed to reconfigure the target from the ES; in other words, it is necessary to understand the bit stream.

#### GENETIC ALGORITHM

A genetic algorithm (GA) is a search heuristic that mimics the process of natural evolution. Genetic algorithms belong to the larger class of evolutionary algorithms (EA), which work with three operators, namely reproduction, mutation and crossover. Fig. 3 shows the components of an evolutionary algorithm. An evolutionary algorithm (EA) is a computer algorithm based on the principles of natural evolution and self-adaptation.

Fig. 3 Components of Evolutionary Algorithm (EA)

The major components of an evolutionary algorithm are representation, variation, evolution (based on fitness), selection, population, and termination [15].

Representation: the data structure that encodes all the problem parameters needed to describe a solution.

Variation: a random process that creates a new solution from an existing solution by changing some or all of its parameters.

Population: the set of chromosomes available at a given generation.

The fitness of a chromosome is defined as the percentage of correct output bits over every input combination of the complete specification.

In a genetic algorithm, a population of strings (called chromosomes or the genotype of the genome), which encode candidate solutions (called individuals, creatures, or phenotypes) to an optimization problem, evolves toward better solutions. Solutions are typically represented in binary as strings of 0s and 1s, but other encodings are also possible. Commonly, the algorithm terminates when either a maximum number of generations has been produced or a satisfactory fitness level has been reached for the population. The fitness function is defined over the genetic representation and measures the *quality* of the represented solution. The fitness function is always problem dependent.

Initialization: Initially many individual solutions are randomly generated to form an initial population. The population size depends on the nature of the problem, but typically contains several hundreds or thousands of possible solutions.

Selection: During each successive generation, a proportion of the existing population is selected to breed a new generation. Individual solutions are selected through a fitness-based process, where fitter solutions (as measured by a fitness function) are typically more likely to be selected.
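The fitness definition and the fitness-based selection step described above can be sketched in Python as follows. This is only a minimal illustration under stated assumptions, not the VHDL implementation of this paper: the names `bit_fitness` and `select_parents` are hypothetical, and fitness-proportionate (roulette-wheel) sampling is just one possible selection scheme.

```python
import random


def bit_fitness(produced, expected):
    """Percentage of output bits that match the specification.

    produced, expected: lists of equal-length bit tuples,
    one tuple per input combination (hypothetical data layout).
    """
    total = sum(len(bits) for bits in expected)
    correct = sum(
        p == e
        for p_bits, e_bits in zip(produced, expected)
        for p, e in zip(p_bits, e_bits)
    )
    return 100.0 * correct / total


def select_parents(population, fitnesses, k, rng=random):
    """Draw k parents with probability proportional to fitness."""
    if sum(fitnesses) == 0:
        # No individual has any fitness yet: sample uniformly instead.
        return [rng.choice(population) for _ in range(k)]
    return rng.choices(population, weights=fitnesses, k=k)


if __name__ == "__main__":
    expected = [(0, 1), (1, 0)]
    produced = [(0, 1), (1, 1)]
    print(bit_fitness(produced, expected))  # 3 of 4 bits match
    print(select_parents(["A", "B", "C"], [10.0, 75.0, 100.0], k=4))
```

Fitter chromosomes are drawn more often, but weaker ones still have a non-zero chance, which helps to preserve diversity in the population.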
Reproduction: The next step is to generate a second-generation population of solutions from those selected, through the genetic operators crossover (also called recombination) and/or mutation.

Termination: This generational process is repeated until a termination condition has been reached.

Fig. 4 Flow chart of the working principle of GA.

#### III. VIRTUAL RECONFIGURABLE CIRCUIT AND ITS IMPLEMENTATION USING CGP

A VRC is a reconfiguration layer developed on top of an FPGA in order to obtain fast reconfiguration and application-specific programmable elements. The use of a VRC has allowed us to introduce a novel approach to the design of a complete evolvable system in a single FPGA [6], [12]. When the VRC is uploaded into the FPGA, the configuration bit stream creates the following units in the FPGA: an array of programmable elements (PEs), a programmable interconnection network, and a configuration port. In most cases the VRC takes the form of a regular two-dimensional array of programmable elements. A very efficient and successful approach called Cartesian Genetic Programming (CGP) has been developed for evolutionary design. The main advantage of CGP is that it uses a representation similar to real reconfigurable hardware. For dynamic adaptation, EHW uses bio-inspired techniques such as the Genetic Algorithm (GA). GA is a stochastic search method that operates on a population of potential solutions and applies the principle of survival of the fittest to produce better approximations to a solution. HSClone GA, Roulette GA and Compact GA are the different genetic algorithms used for implementation.

Fig. 5 Genetic Processing Unit (GPU) Evolutionary Execution Phase

#### Genetic Programming

In CGP, a reconfigurable circuit is modeled as an array of $n_c$ (columns) x $n_r$ (rows) programmable nodes [12]. The number of circuit inputs $n_i$ and outputs $n_o$ are fixed.
A node's inputs can be connected to the outputs of elements in the preceding column or to some of the circuit inputs. A node has up to $n_n$ inputs and a single output. Every node can be programmed to implement one of the $n_f$ functions defined in the set F [2], [4]. Nodes in the same column are not allowed to be connected to each other, and a node may be left unconnected. The circuit output can be taken from any node output. Feedback is not allowed, and thus only combinational circuits can be designed. Fig. 6 shows a model of a reconfigurable circuit and the corresponding configuration bit stream uploaded to establish the circuit connections. The numbers 0, 1, 2 represent inputs and the numbers 12, 10, 6 represent outputs.

Fig. 6 Example of a circuit and its configuration in Cartesian genetic programming with parameters $F=\{F_0,F_1\}$, $n_c=3$, $n_r=2$, $n_i=3$, $n_o=2$, $n_f=2$. The configuration information, in the form (input 1, input 2, function F), is 1,2,1, 2,3,0, 3,5,7, 0,0,4, 3,4,6, 0,6,1, 6,8. The last two integers indicate the outputs of the circuit.

The length of the chromosome [1], measured in genes, is

$$\Lambda = n_r \cdot n_c (n_n + 1) + n_o \tag{1}$$

where $n_r$ is the number of rows, $n_c$ is the number of columns, $n_o$ is the number of outputs, and $n_n$ is the number of inputs of a node.

#### 3.1 Size of the design circuit

Consider that CGP, as defined in the section above, is used to evolve combinational circuits. Let a set H contain all logical functions of the form given by the mapping $\{0,1\}^{n_i} \rightarrow \{0,1\}^{n_o}$. Almost all Boolean functions require at least $2^{n_i} n_i^{-1}$ gates for their implementation; this is known as Shannon's effect [13], [1]. The size of H is

$$|H| = 2^{n_o 2^{n_i}} \tag{2}$$

Assume that the outputs are fixed to the last column of PEs and are not modified by evolution.
In this case the levels-back parameter (L-back) is 1, and the number of chromosomes that form different configurations of the reconfigurable circuit is given by

$$|C| = n_f^{n_c n_r}\,(n_i + n_r)^{n_n n_r (n_c-1)}\, n_i^{n_n n_r} \tag{3}$$

where $|C|$ is in fact the size of the search space [14]. Consider Fig. 6 above, where each PE input has 5 connection options (i.e. 3 inputs and 2 functions, $F_0$ and $F_1$); hence 3 bits are needed to select the configuration of a single input of a PE, and 9 bits are needed to configure a single PE. The length of the chromosome is $\Lambda = n_r n_c (n_n + 1) + n_o = 2 \times 3 \times 3 + 2 = 20$, and the size of the search space, i.e. the number of physically different circuits, is up to $|C| = 10^{11}$, while the number of different logical behaviors is $|H| = 2^{16}$. The choice of PEs roughly corresponds to Shannon's effect, which indicates that about 6 PEs are needed to implement any circuit of 3 inputs if no gates are shared. The implementation of the VRC is based on multiplexers. The configuration memory is connected to the multiplexers that control the routing and the selection of the function in each PE.

#### IV. THE PROPOSED EVOLVABLE UNIT FOR 2-BIT ADDER

Evolvable hardware techniques have the potential to significantly increase the functionality of deployed hardware systems for space missions, as they enable self-configurability [15]. EHW uses simulated evolution to search for new hardware configurations. An evolutionary algorithm (EA) is a computer algorithm based on the principles of natural evolution and self-adaptation [6]. The most commonly used evolutionary algorithm is the Genetic Algorithm (GA). GAs have been applied to EHW because their binary representation matches perfectly with the configuration bits used in FPGAs. Fig. 7 shows the proposed 2-bit adder model of a reconfigurable circuit and the corresponding configuration bit stream uploaded to establish the circuit connections.
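As a quick numerical check of the sizes quoted in this section, the sketch below evaluates the chromosome length, read here as $\Lambda = n_r n_c (n_n + 1) + n_o$, and the exponent of $|H| = 2^{n_o 2^{n_i}}$. The function names are illustrative only and are not part of the paper's VHDL design; two-input nodes ($n_n = 2$) are assumed throughout.

```python
def chromosome_length(n_r, n_c, n_n, n_o):
    """Genes per chromosome: n_n connection genes plus one function
    gene for each of the n_r * n_c nodes, plus one gene per output."""
    return n_r * n_c * (n_n + 1) + n_o


def log2_function_count(n_i, n_o):
    """Exponent e such that |H| = 2**e, i.e. the number of distinct
    mappings {0,1}^n_i -> {0,1}^n_o expressed as a power of two."""
    return n_o * 2 ** n_i


if __name__ == "__main__":
    # 3 x 2 example array (Fig. 6): n_c=3, n_r=2, n_i=3, n_o=2
    print(chromosome_length(n_r=2, n_c=3, n_n=2, n_o=2))  # 20
    print(log2_function_count(n_i=3, n_o=2))              # 16
    # 4 x 4 adder array (Fig. 7): n_c=4, n_r=4, n_i=4, n_o=3
    print(chromosome_length(n_r=4, n_c=4, n_n=2, n_o=3))  # 51
```

Both chromosome lengths agree with the values of $\Lambda$ stated in the text for the two arrays.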
The configuration of every PE is represented in the chromosome as (input1, input2, function), which defines the connection of the two inputs and the function realized in the PE. Every node, i.e. PE, is programmed to implement one of the $n_f$ functions defined in the set F. a0, a1, b0 and b1 are the inputs, and the numbers 12, 10, 6 represent the outputs: 12 and 6 represent the sum outputs and 10 represents the carry output. Each of the input numbers is represented in binary. Crossover and mutation operations are performed on these 4-bit binary numbers. The length of the chromosome, measured in genes, is $\Lambda = 51$.

Let a set H contain all logical functions of the form given by the mapping $\{0,1\}^{n_i} \rightarrow \{0,1\}^{n_o}$ [13], where $|H| = 2^{n_o 2^{n_i}}$. The number of gates required for the implementation of a particular circuit is at least $2^{n_i} n_i^{-1}$, which is 4 for $n_i = 4$. The 2-bit adder is implemented using 13 gates, as shown in Fig. 7. PEs (nodes) are connected either to some of the primary inputs or to PEs in the previous column. There are 16 options, i.e. 4 bits are needed to select a single input of a PE. Because a PE has two inputs, and another 3 bits are needed to select the function (5 functional options), 11 bits (4 input1 bits + 4 input2 bits + 3 function bits) are needed to configure a single PE.

Fig. 7 2-bit adder circuit and its configuration in Cartesian Genetic Programming (CGP) with parameters $F=\{F_1,F_2,F_3,F_4,F_5\}$, $n_c=4$, $n_r=4$, $n_i=4$, $n_o=3$, $n_f=5$. The configuration information is: a0,b0,1, a1,b1,4, a0,b1,3, a1,b1,6, 1,8,2, 1,b1,1, a1,b1,2, a0,b0,8, 3,3,5, 4,a1,2, 7,5,2, 3,11,3, 9,8,3, 12,10,6. The last integers determine the connection of the outputs.
Gates 13, 14, 15 are not utilized.

Table I Logic function expressed in terms of Functions

| Logic Function | Functions (F) |
|----------------|---------------|
| XOR GATE | F1 |
| AND GATE | F2 |
| OR GATE | F3 |
| NOT GATE | F4 |
| WIRE | F5 |

#### 4.1 Algorithm for Random number generation

The initial population is obtained using a pseudo-random number generation technique:

1. Initially, 44-bit binary data is given as input.
2. The MSB and LSB are then XORed on every rising clock edge.
3. Store the result obtained in some location and repeat step 2 for all possibilities.

#### 4.2 Algorithm for fitness calculation

1. Store the user outputs for the different combinations of the 2-bit adder inputs.
2. For each input combination, store the evolved outputs.
3. Compare the evolved outputs with the user outputs for each combination of inputs.
4. If they are equal, increment the counter and assign the fitness value.
5. After the comparison, get the next input combination. Check whether all 16 input combinations (0 to 15) have been covered.
6. If yes, stop; otherwise go to step 3.

#### 4.3 Simple generational genetic algorithm procedure

1. Choose the initial population of individuals.
2. Evaluate the fitness of each individual in that population.
3. Repeat on this generation until termination (sufficient fitness achieved):
4. Select the best-fit individuals for reproduction.
5. Breed new individuals through crossover and mutation operations to give birth to offspring.
6. Evaluate the individual fitness of the new individuals.
7. Replace the least-fit population with the new individuals.

#### V. EXPERIMENTAL RESULTS

Fig. 8 Random number generation

Fig. 9 Fitness calculation

#### VI. DISCUSSION AND CONCLUSION

This paper has presented the design of a 2-bit adder and FPGA-based EHW development; the proposed evolution unit set-up can be applied to future space applications. The proposed 2-bit adder is designed using 16 PEs. The configuration of every PE is represented in the chromosome as input1, input2, and function.
11 bits are needed to configure a single PE. The length of the chromosome is $\Lambda = 51$ and the size of the search space is up to $|C| = 10^{11}$. Evolvable hardware techniques have the potential to significantly increase the functionality of deployed hardware systems for space missions, as they enable self-reconfigurability. Future planetary and deep space exploration demands that space vehicles have robust system architectures and be dynamically reconfigurable to explore unpredictable environments. For dynamic adaptation, EHW uses bio-inspired techniques such as the Genetic Algorithm (GA). GA is a stochastic search method that operates on a population of potential solutions and applies the principle of survival of the fittest to produce better approximations to a solution. HSClone GA, Roulette GA and Compact GA are the different genetic algorithms used for implementation. Implementing an evolvable system on an FPGA is cost-effective and flexible. Currently Xilinx is the most popular platform for the implementation of evolvable systems, and there are various approaches to using Xilinx FPGAs for reconfiguration. The use of the proposed intrinsic run-time configuration allows one to overcome, to a certain extent, the scalability problems and to reduce the time required to re-design the system every time.

#### REFERENCES

- [1] K.H. Chong, I.B. Aris, M.A. Sinan, "Digital Circuit Structure Design via Evolutionary Algorithm Method", Journal of Applied Sciences 7(3): 380-385, 2007. ISSN 1812-5654.
- [2] Lukas Sekanina, "Extrinsic and Intrinsic Evolution of Multifunctional Combinational Modules", IEEE Congress on Evolutionary Computation, 2006; Adrian Stoica, "Towards Evolvable Hardware Chips: Experiments with a Programmable Transistor Array".
- [3] Adrian Stoica, Ricardo Zebulum, and Didier Keymeulen, "Progress and Challenge in Building Evolvable Devices", California Institute of Technology.
- [4] Cyrille Lambert, Tatiana Kalganova, and Emanuele Stomeo, "FPGA-Based System for Evolvable Hardware", Proceedings of World Academy of Science, Engineering and Technology, Volume 12, March 2006, ISSN 1307-688.
- [5] Zdenek Vasicek and Lukas Sekanina, "An Evolvable Hardware System in Xilinx Virtex-II Pro FPGA", Innovative Computation and Application, Vol. 1, No. 1, 2007.
- [6] L. Sekanina, Stepan Friedl, "An Evolvable Combinational Unit for FPGAs - Draft", Computing and Informatics, Vol. 23, 2004, 461-486, V 2005-July-7.
- [7] Yang Zhang, Stephen L. Smith, Andy M. Tyrrell, "Digital Circuit Design using Intrinsic Evolvable Hardware", Proceedings of the 2004 NASA/DoD Conference on Evolvable Hardware (EH'04), 0-7695-2145-2/04, 2004 IEEE.
- [8] L. Sekanina, "Towards Evolvable IP Cores for FPGAs", In: Proc. of the 2003 NASA/DoD Conference on Evolvable Hardware, Los Alamitos, US, ICSP, pp. 145-154, ISBN 0-7695-1977-6, 2003.
- [9] Carlos A. Coello Coello, Alan D. Christiansen and Arturo Hernández Aguirre, "Automated Design of Combinational Logic Circuits using Genetic Algorithms".
- [10] L. Sekanina, S. Friedl, "On Routine Implementation of Virtual Evolvable Devices using COMBO6", In: Proc. of the 2004 NASA/DoD Conference on Evolvable Hardware, Los Alamitos, US, ICSP, pp. 63-70, ISBN 0-7695-2145-2, 2004.
- [11] Steve Guccione, Delon Levi and Prasanna Sundararajan, "JBits: Java based interface for reconfigurable computing", Xilinx Inc., 2100 Logic Drive, San Jose, CA 95124 (USA).
- [12] J. Miller, P. Thomson, "Cartesian Genetic Programming", In Proc. of the 3<sup>rd</sup> European Conference on Genetic Programming, LNCS 1802, Springer Verlag, Berlin, 2000, pp. 121-132.
- [13] J. Gruska, "Foundation of computing combinational circuits using evolvable hardware", Thomson Publishing Computer Press, 1997.
- [14] L. Sekanina, R. Ruzicka, "Design of the Specific Fast Reconfigurable Chip using Common FPGA", In: Proc.
of the 3<sup>rd</sup> IEEE Design and Diagnostics of Electronic Circuits and Systems, DDECS'00, Polygrafia SAF, Bratislava, Slovakia, 2000, pp. 161-168.
- [15] G. Malhotra, "Evolvable Hardware and its relevant for Future Space Mission", DOS-ISRO, 10-12 Dec 2008.
- [16] D. B. Verneker, G. Malhotra, V. Colaco, "Reconfigurable FPGA using Genetic Algorithm", International Conference and Workshop on Emerging Technology (ICWET 2010).
- [17] Xin Yao and Tetsuya Higuchi, "Promises and Challenges of Evolvable Hardware", IEEE Transactions on Systems, Man, and Cybernetics, Part C: Applications and Reviews, Vol. 29, No. 1, Feb 1999.

# Small Size Printed Antenna Array for LTE/WWAN with LTE MIMO Operation for Mobile Communication

## T.Thomas, Y.V.B.Reddy & Dr. K.Veeraswamy

QIS College of Engg. & Tech., Ongole, India

E-mail: t.thomas.455@gmail.com, yvbreddy06@yahoo.com & kilarivs@yahoo.com

Abstract - A low-profile antenna array comprising a main antenna for LTE/WWAN operation and an auxiliary antenna that combines with the main antenna for LTE MIMO operation in the mobile handset is presented. The proposed low-profile antenna array supports LTE-700, GSM-850, GSM-900, GSM-1800, GSM-1900, UMTS-1900, UMTS-2100, LTE-2300, and LTE-2500. Coupled-feed loop and monopole type antennas are embedded to obtain wide lower and upper bands. A chip inductor is used in the design to minimize the physical length required for the desired resonant modes. Together, the two antennas provide the constructional requirements for LTE MIMO operation. The radiating strips are shorted to the ground plane to achieve the required bandwidth. Folded radiating strips are also employed to keep the profile low, so that the proposed antenna arrangement is well suited to modern mobiles, in which the area occupied by each element is a very important design metric.

**Keywords** - coupled feed loop antenna; chip inductive element; shorting radiating strip to ground; folded radiating strip.

#### I. INTRODUCTION

LTE technology [2] is set to lead the choice of mobile communication technologies for the future requirements of mobile users, such as higher bandwidth and higher data rates for multimedia applications. LTE uses smart antennas to improve the bandwidth, i.e., the Multiple Input Multiple Output (MIMO) technique, in which multiple antennas are used to collect the faded signal. For mobile phones, the basic criterion to be satisfied is a low profile; that is, a wide frequency band should be achieved with a small antenna. At small sizes, microstrip patch antennas give small bandwidths. To get a higher bandwidth, the antenna must be sufficiently long, which occupies a large area in the mobile phone. By embedding inductive elements in the design, the physical length of the antenna can be reduced. The inductor may be implemented as a chip inductor [17] or as a distributed inductive strip of equivalent length with the proper inductance. In addition, instead of directly feeding the radiating strip, coupled feeding is advisable. Feeding the radiating strip through a T-section feeding strip is advantageous, since it increases the bandwidth. Shorting the radiating strip to ground also affects the bandwidth, as does the ground plane area [27].

The positions of the two antennas are also selected carefully, as the SAR of the antenna array depends greatly on the position of the antennas [14]-[16]. Locating the two antennas at the bottom-edge corners of the system board is advantageous compared with placing them at other possible positions on the system board. Keeping the antennas at the bottom edge of the system board makes it possible to accommodate many other necessary peripherals. Between the two antennas there is a gap of 10 mm in which a USB [13] connector can be placed, as it is one of the many necessary components of modern mobile phones.

#### II. PROPOSED ANTENNA ARRAY

The proposed antenna array contains two antennas, namely the main antenna and the auxiliary antenna. Both contain coupled-feed loop and monopole antennas to provide the required wide frequency bands: one in the lower part of the frequency scale to cover LTE-700, GSM-850 and GSM-900, and the other in the higher part of the frequency scale to cover GSM-1800, GSM-1900, UMTS-1900, UMTS-2100, LTE-2300, and LTE-2500 [8]-[9].

Figure 1 shows the structural details of the proposed antenna. As shown in the figure, the proposed antenna arrangement consists of the main antenna, the auxiliary antenna, the system board, the ground plane and the feed lines. Figures 2 and 3 show the constructional details of the proposed antenna array. A 120 × 60 mm² system board is used for the design. An FR4 substrate with $\varepsilon_r = 3$ and 0.8 mm thickness is considered for the system board. On the front side of the system board the two antennas are fabricated, and on the back side a ground sheet is formed over an area of 100 × 60 mm². The main and auxiliary antennas are arranged on 30 × 20 mm² and 20 × 20 mm² no-ground portions of the system board, respectively. To include the body effect of the mobile phone casing in the results, a plastic casing with $\varepsilon_r = 3$ and a conductivity of 0.02 S/m is introduced in the simulation process. A printed monopole is embedded in the design to obtain a wide lower band. The widened end part of the monopole is responsible for the lower frequency band.

Figure 1. Perspective view and front view of the proposed antenna array.

The resonant mode generated by the monopole depends directly on the physical length of the monopole. Forming a strip of the required length on the system board, where only limited space is available, is not advisable. To overcome this design challenge, an inductive element is used to decrease the physical length required to generate the desired resonant mode.

Figure 2. Auxiliary Antenna
The inductive element can be implemented in the design either by a strip of equivalent length or by a chip inductor.

Figure 3. Main Antenna

In this antenna arrangement, the chip inductor design is used. The value of the chip inductor is selected so that it compensates for the capacitive effect caused by decreasing the physical length. Using a chip inductor makes it easy to accommodate a monopole that is shorter than the required length yet still capable of generating the desired resonant mode [18]-[23]. Moreover, the widened end portion of the monopole is folded to obtain a low profile, which is desirable in modern mobile phones. This portion of the monopole radiates the energy. The width of the folded radiating strip must be selected properly, as it shifts the obtained frequency bands. The printed coupled-feed loop antenna is responsible for generating both the lower and the upper bands. In this design, instead of directly feeding the loop, a T-section coupling strip is used. This type of feeding enhances the width of the lower and upper bands. A spacing of 0.3 mm is used to feed the loop with the T-section coupling strip. To excite the antennas, a feed strip is used, whose length and width play an important role in impedance matching to generate the appropriate resonant modes. To enhance the bandwidth, the antenna is shorted to the ground plane, which makes use of the ground plane as a radiator. The simulation results confirm the importance of the ground plane as a radiating element in generating the higher bandwidths. The proposed antenna arrangement eliminates the protruded ground [1]. As the protruded ground is removed, the middle portion of the upper band gives a lower $S_{11}$ value [1].

#### III. RESULTS

Figure 4 shows the S11 value over the frequency scale. From the graph, it is observed that the proposed antenna array supports the required lower and upper frequency bands to cover the specified bands. From Figure 4, the S11 value for the 698-998 MHz and 1710-2690 MHz bands is less than -6 dB, which is the industrial constraint.
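For reference, the -6 dB criterion can be translated into a reflection coefficient and a voltage standing wave ratio. The conversion below is a standard relation rather than a result of this paper, and it assumes that $S_{11}$ is quoted in dB as $20\log_{10}|\Gamma|$:

$$|\Gamma| = 10^{S_{11}/20} = 10^{-6/20} \approx 0.50, \qquad \mathrm{VSWR} = \frac{1+|\Gamma|}{1-|\Gamma|} \approx 3.0, \qquad |\Gamma|^2 \approx 0.25$$

Under that assumption, an $S_{11}$ of -6 dB corresponds to a VSWR of roughly 3:1, with about a quarter of the incident power reflected at the antenna port.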
Figure 4 indicates lower and upper supported bands with bandwidths of 0.35 GHz and 1.2 GHz respectively, which are wider than the lower and upper bandwidths obtained in [1]. The effect of the feed line is evident from the graphs shown in Figure 5. As the length of the feed line increases, the lower band may not be obtained; as the length of the feed line decreases, the required lower band may be obtained. Feed line lengths of 50 mm and 30 mm are used in the simulation.

Figure 4: S11 Magnitude

Effect of feed line length on the S11 value: When the feed line length is 50 mm, the impedance matching is poor, hence the lower and upper bands are not wide enough to cover the required bands.

Effect of the radiating strip of the main antenna and auxiliary antenna on the S11 value: When the folded radiating strip length is 14 mm, it is not able to provide the required wide bandwidths. When the folded radiating strip is made 19 mm long, it is long enough to provide the lower and upper bandwidths required to cover all frequency bands.

Figure 5: Feed Line Length and Radiating Strip Length Effects on S11 Magnitude

Figure 6 shows the radiation characteristics of the proposed antenna array. It is evident from Figures 7 and 8 that the antenna array exhibits omnidirectional radiation characteristics.

Figure 6: Radiation characteristics of the proposed antenna

#### **CONCLUSION**

This paper presents a method to implement a set of two antennas, namely the main and auxiliary antennas, in a small area to meet the design constraints of modern mobile phones, in which the area for each functional element is limited. Removal of the protruded ground between the main and auxiliary antennas improves the S11 value in the middle portion of the upper frequency band, which makes the design suitable for the UMTS-2100 band. The folded radiating strip gives a low profile to the total structure, which is a desirable feature for mobile phones.
In addition, the use of a chip inductor in the design reduces the physical length required to generate the resonant modes. This inductive strip is also responsible for the widened bandwidth in the lower portion of the frequency scale.

Figure 7: Radiation characteristics of the proposed antenna for Phi = 90 deg.

Figure 8: Radiation characteristics for Theta = 90 deg.

#### REFERENCES

- [1] Ting-Wei Kang, Kin-Lu Wong, and Ming-Fang Tu, "Internal Handset Antenna Array for LTE/WWAN and LTE MIMO Operations," Proceedings of the 5th European Conference on Antennas and Propagation (EUCAP).
- [2] Naveen Kumar and Asit Kadayan, "Long Term Evolution (LTE) – Specifications," The Indian Journal of Telecommunications, vol. 60, issue 2, Aug. 2011, ISSN 0497-1388, pp. 43-48.
- [3] http://www.radioelectronics.com/info/cellulartelecomms/lte-longterm-evolution/lte-frequency-spectrum.php
- [4] http://en.wikipedia.org/wiki/UMTS\_frequency\_ban
- [5] http://en.wikipedia.org/wiki/GSM\_frequency\_bands
- [6] http://en.wikipedia.org/wiki/3GPP\_Long\_Term\_Evolution, Wikipedia, the free encyclopedia: 3GPP Long Term Evolution.
- [7] T. W. Kang and K. L. Wong, "Internal printed loop/monopole combo antenna for LTE/GSM/UMTS operation in the laptop computer," *Microwave Opt. Technol. Lett.*, Vol. 52, pp. 1673-1678, Jul. 2010.
- [8] C. T. Lee and K. L. Wong, "Planar monopole with a coupling feed and an inductive shorting strip for LTE/GSM/UMTS operation in the handset," *IEEE Trans. Antennas Propagat.*, Vol. 58, pp. 2479-2483, Jul. 2010.
- [9] K. L. Wong and W. Y. Li, "Small-size coupled fed printed PIFA for internal eight-band LTE/GSM/UMTS handset antenna," *Microwave Opt. Technol. Lett.*, Vol. 52, pp. 2123-2128, Sep. 2010.
- [10] Multiple-input multiple-output, Wikipedia, http://en.wikipedia.org/wiki/MIMO.
- [11] K. L. Wong, C. H. Chang, B. Chen and S. Yang, "Three-antenna MIMO system for WLAN operation in a PDA phone," *Microwave Opt. Technol. Lett.*, Vol. 48, pp. 1238-1242, Jul. 2006.
- [12] V. Plicanic, B.
K. Lau, A. Derneryd and Z. Ying, "Actual diversity performance of a multiband diversity antenna with hand and head effects," *IEEE Trans. Antennas Propagat.*, Vol. 57, pp. 1547-1556, May 2009.
- [13] http://en.wikipedia.org/wiki/Universal\_Serial\_Bus, Wikipedia, the free encyclopedia: Universal Serial Bus.
- [14] American National Standards Institute (ANSI), "Safety levels with respect to human exposure to radio-frequency electromagnetic field, 3 kHz to 300 GHz," ANSI/IEEE standard C95.1, Apr. 1999.
- [15] C. H. Chang and K. L. Wong, "Printed λ/8-PIFA for penta-band WWAN operation in the handset," *IEEE Trans. Antennas Propagat.*, Vol. 57, pp. 1373-1381, Mar. 2009.
- [16] C. H. Li, E. Ofli, N. Chavannes and N. Kuster, "Effects of hand phantom on handset antenna performance," *IEEE Trans. Antennas Propagat.*, Vol. 57, pp. 2763-2770, Jul. 2009.
- [17] K. L. Wong, M. F. Tu, C. Y. Wu and W. Y. Li, "On-board 7-band WWAN/LTE antenna with small size and compact integration with nearby ground plane in the handset," *Microwave Opt. Technol. Lett.*, Vol. 52, pp. 2847-2853, Dec. 2010.
- [18] Y. W. Chi and K. L. Wong, "Quarter-wavelength printed loop antenna with an internal printed matching circuit for GSM/DCS/PCS/UMTS operation in the handset," *IEEE Trans. Antennas Propagat.*, Vol. 57, pp. 2541-2547, Sep. 2009.
- [19] K. L. Wong and W. Y. Chen, "Small-size printed loop antenna for penta-band thin-profile handset application," *Microwave Opt. Technol. Lett.*, Vol. 51, pp. 1512-1517, Jun. 2009.
- [20] J. Carr, Antenna Toolkit, 2nd edition, pp. 111-112, Newnes, Oxford, U.K., 2001.
- [21] T. W. Kang and K. L. Wong, "Chip-inductor-embedded small-size printed strip monopole for WWAN operation in the handset," *Microwave Opt. Technol. Lett.*, Vol. 51, pp. 966-971, Apr. 2009.
- [22] C. H. Chang and K. L. Wong, "Small-size printed monopole with a printed distributed inductor for penta-band WWAN handset application," *Microwave Opt. Technol. Lett.*, Vol. 51, pp. 2903-2908, Dec.
2009.
- [23] K. L. Wong and S. C. Chen, "Printed single-strip monopole using a chip inductor for penta-band WWAN operation in the handset," *IEEE Antennas Wireless Propagat. Lett.*, Vol. 58, pp. 1011-1014, Mar. 2010.
- [24] S. Blanch, J. Romeu and I. Corbella, "Exact representation of antenna system diversity performance from input parameter description," *Electron. Lett.*, Vol. 39, pp. 705-707, May 2003.
- [25] F. M. Caimi and M. Montgomery, "Dual feed, single element antenna for WiMAX MIMO application," *International Journal of Antennas and Propagation*, Vol. 2008, Article ID 219838.
- [26] M. Karaboikis, C. Soras, G. Tsachtsiris and V. Makios, "Compact dual-printed inverted-F antenna diversity systems for portable wireless devices," *IEEE Antennas Wireless Propagat. Lett.*, Vol. 3, pp. 9-14, Dec. 2004.
- [27] P. Vainikainen, J. Ollikainen, O. Kivekas and I. Kelander, "Resonator-based analysis of the combination of mobile handset antenna and chassis," *IEEE Trans. Antennas Propagat.*, Vol. 50, pp. 1433-1444, Oct. 2002.

# Levelset Function Evolution Using Edge Filter Techniques To Image Segmentation

# Sai Jyothsna Meda & V Jai Kumar

Electronics and Communication Engineering, QIS College of Engg. & Tech. Ongole, India

E-mail: medajosna.sai@gmail.com, Vjaikumar.vjk@gmail.com

Abstract - Level set methods can represent contours of complex topology and are able to handle topological changes, such as splitting and merging, in a natural and efficient way. In conventional level set methods, the level set function (LSF) typically develops irregularities during its evolution, which cause numerical errors and eventually destroy the stability of the level set evolution. To overcome this difficulty, a numerical remedy, commonly known as reinitialization, was introduced to restore the regularity of the LSF and maintain stable level set evolution. Reinitialization is performed by periodically stopping the evolution and reshaping the degraded LSF as a signed distance function.
However, the practice of reinitialization raises serious problems. This paper proposes a variational level set formulation with a distance regularization term and an external energy term that drives the motion of the zero level contours toward desired locations. In particular, we provide a double-well potential for the distance regularization term. To demonstrate the effectiveness of the distance regularized LSF evolution formulation, we apply it to an edge-based active contour model for image segmentation. We apply Prewitt edge filter techniques to further improve the computational efficiency and accuracy.

Keywords- reinitialization, level set method, image segmentation.

#### I. INTRODUCTION

The goal of segmentation is to simplify and/or change the representation of an image into something that is more meaningful and easier to analyze. When applied to a stack of images, as is typical in medical imaging, the contours resulting from image segmentation can be used to create 3D reconstructions with the help of interpolation algorithms such as marching cubes. Some of the practical applications of image segmentation are [1]:

- Iris recognition
- Fingerprint recognition
- Traffic control systems
- Brake light detection

Several general-purpose algorithms and techniques have been developed for image segmentation. They are as follows:

- A. Partial differential equation-based methods
- B. Parametric methods
- C. Level set methods

Level sets are a robust and flexible mathematical framework for the representation of evolving curves and surfaces. At the heart of the level set approach is the PDE governing the evolution of the level set function describing a particular surface [3]. The level set function is represented by $\varphi$. If the curve C moves in the normal direction with a speed $\upsilon$, then the level set function $\varphi$ satisfies the *level set equation*.
[4] In this paper, we present a new variational formulation for geometric active contours that forces the level set function to be close to a signed distance function, and therefore completely eliminates the need for the costly re-initialization procedure. Our variational formulation consists of an internal energy term that penalizes the deviation of the level set function from a signed distance function, and an external energy term that drives the motion of the zero level set toward the desired image features, such as object boundaries. The resulting evolution of the level set function is the gradient flow that minimizes the overall energy functional.

The proposed variational level set formulation has three main advantages over the traditional level set formulations. First, a significantly larger time step can be used for numerically solving the evolution partial differential equation, which speeds up the curve evolution. Second, the level set function can be initialized with general functions that are more efficient to construct and easier to use in practice than the widely used signed distance function. Third, the level set evolution in our formulation can be easily implemented by a simple finite difference scheme and is computationally more efficient. The proposed algorithm has been applied to both simulated and real images with promising results [5].

#### II. LEVEL SET METHOD

In recent years, a large body of work on geometric active contours, i.e., active contours implemented via level set methods, has been proposed to address a wide range of image segmentation problems in image processing and computer vision [5]. Level set methods were first introduced by Osher and Sethian [6] for capturing moving fronts. The existing active contour models can be broadly classified as either parametric active contour models or geometric active contour models, according to their representation and implementation. Geometric active contours were independently introduced by Caselles et al.
[7] and Malladi et al. [9], respectively. These models are based on curve evolution theory [10] and the level set method. The basic idea is to represent contours as the zero level set of an implicit function defined in a higher dimension, usually referred to as the level set function, and to evolve the level set function according to a partial differential equation (PDE).

In the level set formulation of moving fronts (or active contours), the fronts, denoted by C, are represented by the zero level set $C(t) = \{(x,y) \mid \varphi(t,x,y) = 0\}$ of a level set function $\varphi(t,x,y)$. The evolution equation of the level set function $\varphi$ can be written in the following general form:

$$\frac{\partial \varphi}{\partial t} + F |\nabla \varphi| = 0 \tag{1}$$

which is called the *level set equation* [11]. The function F is called the speed function. For image segmentation, the function F depends on the image data and the level set function $\varphi$. In traditional level set methods, the level set function $\varphi$ can develop shocks and very sharp and/or flat shapes during the evolution, which makes further computation highly inaccurate. To avoid these problems, a common numerical scheme is to initialize the function $\varphi$ as a signed distance function before the evolution, and then "reshape" (or "re-initialize") the function $\varphi$ to be a signed distance function periodically during the evolution.

Re-initialization has been extensively used as a numerical remedy in traditional level set methods [5–7]. The standard re-initialization method is to solve the following *reinitialization equation*:

$$\frac{\partial \varphi}{\partial t} = sign(\varphi_0)(1 - |\nabla \varphi|) \tag{2}$$

where $\varphi_0$ is the function to be re-initialized, and $sign(\varphi)$ is the sign function. There is a copious literature on re-initialization methods [13, 14], and most of them are variants of the above PDE-based method.
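As a concrete illustration of the level set equation (1), the following minimal sketch evolves a one-dimensional level set function with a constant positive speed F using a first-order upwind difference. The 1D setting, the grid, the time step, and all function names are assumptions chosen for illustration; this is not the scheme used in the experiments of this paper, and it is valid only for F > 0.

```python
import math


def upwind_step(phi, speed, dt, h):
    """One explicit step of phi_t + F*|phi_x| = 0 for a constant F > 0.

    Uses the first-order upwind approximation
    |phi_x| ~ sqrt(max(D-, 0)^2 + min(D+, 0)^2),
    with edge values replicated at the two boundaries.
    """
    n = len(phi)
    new_phi = []
    for i in range(n):
        left = phi[i - 1] if i > 0 else phi[i]
        right = phi[i + 1] if i < n - 1 else phi[i]
        d_minus = (phi[i] - left) / h
        d_plus = (right - phi[i]) / h
        grad = math.sqrt(max(d_minus, 0.0) ** 2 + min(d_plus, 0.0) ** 2)
        new_phi.append(phi[i] - dt * speed * grad)
    return new_phi


def zero_crossings(x, phi):
    """Linearly interpolated positions where phi changes sign."""
    points = []
    for i in range(len(phi) - 1):
        a, b = phi[i], phi[i + 1]
        if (a < 0 <= b) or (a >= 0 > b):
            points.append(x[i] + (x[i + 1] - x[i]) * a / (a - b))
    return points


# Signed distance to the interval [4, 6]: negative inside, positive outside.
h = 0.1
x = [i * h for i in range(101)]
phi = [abs(xi - 5.0) - 1.0 for xi in x]

speed, dt = 1.0, 0.05  # speed * dt / h = 0.5 satisfies the CFL condition
for _ in range(20):    # evolve to t = 1
    phi = upwind_step(phi, speed, dt, h)

front = zero_crossings(x, phi)
print(front)
```

With a unit speed the two zero crossings move outward by about one unit, from 4 and 6 to roughly 3 and 7, while the function flattens around the original minimum. Such flat regions are one example of the irregularities that motivate re-initialization.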
Unfortunately, if $\varphi_0$ is not smooth or is much steeper on one side of the interface than the other, the zero level set of the resulting function $\varphi$ can be moved incorrectly from that of the original function [4]. So far, re-initialization has been extensively used as a numerical remedy for maintaining stable curve evolution and ensuring desirable results. From a practical viewpoint, the re-initialization process can be quite complicated and expensive, and can have subtle side effects. The following sections describe a variational level set formulation, distance regularized level set function evolution (DRLSFE), which can be easily implemented by a simple finite difference scheme and overcomes the drawbacks associated with re-initialization.

#### III. DRLSFE

As discussed before, it is crucial to keep the evolving level set function as an approximate signed distance function during the evolution, especially in a neighborhood around the zero level set. It is well known that a signed distance function must satisfy the property $|\nabla \varphi| = 1$. Conversely, any function $\varphi$ satisfying $|\nabla \varphi| = 1$ is the signed distance function plus a constant [16]. Naturally, we propose the following integral

$$P(\varphi) = \int_{\Omega} \frac{1}{2} (|\nabla \varphi| - 1)^2 \, dx \, dy \tag{3}$$

as a metric to characterize how close a function $\varphi$ is to a signed distance function in $\Omega \subset R^2$. This metric plays a key role in our variational level set formulation. With the above defined functional $P(\varphi)$, we propose the following variational formulation

$$E(\varphi) = \mu R_p(\varphi) + E_{ext}(\varphi) \tag{4}$$

where $R_p(\varphi)$ is the level set regularization term defined in the following, $\mu>0$ is a constant, and $E_{ext}(\varphi)$ is the external energy, which depends upon the data of interest and drives the motion of the zero level curve of $\varphi$.
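The metric in (3) is straightforward to approximate on a pixel grid. The sketch below is a minimal illustration that assumes central differences on interior grid points and a uniform spacing `h`; the function name `distance_penalty` and the test functions are hypothetical and chosen only to show how the penalty behaves.

```python
import math


def distance_penalty(phi, h=1.0):
    """Discrete approximation of P(phi) = integral of 0.5 * (|grad phi| - 1)^2.

    phi is a list of rows; gradients use central differences, so only
    interior grid points contribute to the sum.
    """
    rows, cols = len(phi), len(phi[0])
    total = 0.0
    for i in range(1, rows - 1):
        for j in range(1, cols - 1):
            dphi_dy = (phi[i + 1][j] - phi[i - 1][j]) / (2.0 * h)
            dphi_dx = (phi[i][j + 1] - phi[i][j - 1]) / (2.0 * h)
            grad_mag = math.hypot(dphi_dx, dphi_dy)
            total += 0.5 * (grad_mag - 1.0) ** 2 * h * h
    return total


n = 11
# phi = x is a signed distance function to the line x = 0, so |grad phi| = 1.
plane = [[float(j) for j in range(n)] for _ in range(n)]
# phi = 2x has the same zero level set but |grad phi| = 2.
steep = [[2.0 * j for j in range(n)] for _ in range(n)]

print(distance_penalty(plane))  # no penalty for a signed distance function
print(distance_penalty(steep))  # 0.5 per interior point, 81 interior points
```

Both functions share the same zero level set, yet only the second is penalized, which is the behavior the internal energy relies on.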
The derived level set evolution for energy minimization has an undesirable side effect on the LSF in some circumstances. To avoid this side effect, we introduce a new potential function which aims to maintain the signed distance property $|\nabla \varphi| = 1$ only in a vicinity of the zero level set, while keeping the LSF constant, with $|\nabla \varphi| = 0$, at locations far away from the zero level set. To maintain such a profile of the LSF, the potential function must have minimum points at $|\nabla \varphi| = 1$ and $|\nabla \varphi| = 0$. Such a potential is a double-well potential, as it has two minimum points (wells).

In this paper, we denote by $\frac{\partial E}{\partial \varphi}$ the Gâteaux derivative (or first variation) of the functional E. The evolution equation

$$\frac{\partial \varphi}{\partial t} = -\frac{\partial E}{\partial \varphi} \tag{5}$$

is the gradient flow [15] that minimizes the functional E.

In image segmentation, active contours are dynamic curves that move toward the object boundaries. To achieve this goal, we explicitly define an external energy that can move the zero level curves toward the object boundaries. Let I be an image, and let g be the edge indicator function defined by

$$g = \frac{1}{1 + |\nabla G_{\sigma} * I|^2},$$

where $G_{\sigma}$ is the Gaussian kernel with standard deviation $\sigma$. We define an external energy for a function $\varphi(x,y)$ as

$$E_{g,\lambda,\alpha}(\varphi) = \lambda L_g(\varphi) + \alpha A_g(\varphi) \tag{6}$$

where $\lambda > 0$ and $\alpha$ are constants, and the terms $L_g(\varphi)$ and $A_g(\varphi)$ are defined by

$$L_g(\varphi) = \int_{\Omega} g \, \delta(\varphi) |\nabla \varphi| \, dx \, dy \tag{7}$$

and

$$A_g(\varphi) = \int_{\Omega} g H(-\varphi) \, dx \, dy \tag{8}$$

respectively, where $\delta$ is the univariate Dirac function, and H is the Heaviside function.
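The edge indicator $g$ defined above can be sketched in a few lines of plain Python. The example below is illustrative only: the kernel radius, the replicated-border handling, the central-difference gradient, and the synthetic step-edge image are all assumptions, not details taken from the experiments in this paper.

```python
import math


def gaussian_weights(sigma, radius):
    """Normalized 1D Gaussian weights for offsets -radius..radius."""
    raw = [math.exp(-(k * k) / (2.0 * sigma * sigma))
           for k in range(-radius, radius + 1)]
    total = sum(raw)
    return [w / total for w in raw]


def clamp(index, upper):
    """Clamp an index to [0, upper] (replicates the border pixels)."""
    return max(0, min(index, upper))


def gaussian_smooth(image, sigma=1.0, radius=2):
    """Convolve the image with a truncated 2D Gaussian kernel G_sigma."""
    rows, cols = len(image), len(image[0])
    w = gaussian_weights(sigma, radius)
    out = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            acc = 0.0
            for a in range(-radius, radius + 1):
                for b in range(-radius, radius + 1):
                    pixel = image[clamp(i + a, rows - 1)][clamp(j + b, cols - 1)]
                    acc += w[a + radius] * w[b + radius] * pixel
            out[i][j] = acc
    return out


def edge_indicator(image, sigma=1.0, radius=2):
    """g = 1 / (1 + |grad(G_sigma * I)|^2), evaluated at every pixel."""
    smooth = gaussian_smooth(image, sigma, radius)
    rows, cols = len(image), len(image[0])
    g = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            gy = (smooth[clamp(i + 1, rows - 1)][j]
                  - smooth[clamp(i - 1, rows - 1)][j]) / 2.0
            gx = (smooth[i][clamp(j + 1, cols - 1)]
                  - smooth[i][clamp(j - 1, cols - 1)]) / 2.0
            g[i][j] = 1.0 / (1.0 + gx * gx + gy * gy)
    return g


# Synthetic test image: dark left half, bright right half (vertical step edge).
image = [[0.0] * 6 + [100.0] * 6 for _ in range(9)]
g = edge_indicator(image)
print(g[4][0], g[4][5])
```

In flat regions the gradient vanishes and $g$ equals 1, while next to the step edge $g$ drops close to 0. This is what slows the zero level curve down at object boundaries in (7) and (8).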
Now, we define the following total energy functional

$$E(\varphi) = \mu P(\varphi) + E_{g,\lambda,\alpha}(\varphi) \tag{9}$$

The external energy $E_{g,\lambda,\alpha}(\varphi)$ drives the zero level set toward the object boundaries, while the internal energy $\mu P(\varphi)$ penalizes the deviation of $\varphi$ from a signed distance function during its evolution. Geometrically, the energy $L_g(\varphi)$ in (7) computes the length of the zero level curve of $\varphi$, weighted by the edge indicator $g$. The energy functional $A_g(\varphi)$ in (8) is introduced to speed up curve evolution. The derivative of the functional E in (9) can be written as

$$\frac{\partial E}{\partial \varphi} = -\mu \left[\Delta \varphi - div\left(\frac{\nabla \varphi}{|\nabla \varphi|}\right)\right] - \lambda \left[\delta(\varphi) \, div\left(g\frac{\nabla \varphi}{|\nabla \varphi|}\right)\right] - \alpha g \delta(\varphi)$$

where $\Delta$ is the Laplacian operator. The functional *E* is minimized by the following gradient flow:

$$\frac{\partial \varphi}{\partial t} = \mu \left[\Delta \varphi - div\left(\frac{\nabla \varphi}{|\nabla \varphi|}\right)\right] + \lambda \left[\delta(\varphi) \, div\left(g \frac{\nabla \varphi}{|\nabla \varphi|}\right)\right] + \alpha g \delta(\varphi) \tag{10}$$

This gradient flow is the evolution equation of the level set function in the DRLSFE method. The second and third terms on the right-hand side of (10) correspond to the gradient flows of the energy functionals $\lambda L_g(\varphi)$ and $\alpha A_g(\varphi)$, respectively, and are responsible for driving the zero level curve towards the object boundaries.

# IV. LEVEL SET EVOLUTION WITH EDGE FILTER

To evaluate the DRLSFE method and to reduce the computational cost of the level set method [4], edge filter techniques are applied to the DRLSFE. The steps to apply the edge filter techniques to DRLSFE are as follows.

**Step 1) Initialization.** Initialize an LSF $\varphi$ to a function $\varphi_0$.
Then, construct the initial narrowband $B^0_r = \bigcup_{(i,j) \in Z^0} N^{(r)}_{i,j}$, where $Z^0$ is the set of zero crossing points of $\varphi^0$. Here $\varphi_{i,j}$ denotes an LSF defined on a grid. A grid point (i,j) at which the LSF changes sign between neighboring grid points is called a zero crossing point. The set of all the zero crossing points of the LSF is denoted by Z.

**Step 2) Applying the Prewitt filter on the narrowband.** Apply the Prewitt filter to the narrowband. Other filters can also be applied to the input image: Sobel, average, unsharp, disk, and Laplacian. The Prewitt filter reduces the computational cost of the level set method and increases the accuracy.

**Step 3) Evolution of the LSF.** Update $\varphi_{i,j}^{k+1}$ on the narrowband $Bf_r^k$ according to $\varphi_{i,j}^{k+1} = \varphi_{i,j}^k + \tau L(\varphi_{i,j}^k)$, $k = 0, 1, 2, \ldots, m$, where $\tau$ is the time step and $L(\varphi_{i,j}^k)$ approximates the right-hand side of (10). This is the iteration process used in the numerical implementation of DRLSFE.

**Step 4) Update the narrowband.** Determine the set of all the zero crossing pixels of $\varphi_{i,j}^{k+1}$ on $Bf_r^k$, denoted by $Zf^{k+1}$. Then, update the narrowband by setting $Bf_r^{k+1} = \bigcup_{(i,j) \in Zf^{k+1}} N_{i,j}^{(r)}$.

**Step 5) Assign values to new pixels on the narrowband.** For every point (i,j) in $Bf_r^{k+1}$ but not in $Bf_r^k$, set $\varphi_{i,j}^{k+1}$ to $h$ if $\varphi_{i,j}^k > 0$, or else set $\varphi_{i,j}^{k+1}$ to $-h$, where h is a constant, which can be set to r+1 as a default value.

**Step 6) Determine the termination of iteration.** If either the zero crossing points stop varying for $m$ consecutive iterations or $k$ exceeds a prescribed maximum number of iterations, then stop the iteration; otherwise, go to Step 2.

To evaluate the DRLSFE method, all the edge filter techniques were applied and verified on each input image. In all these cases the detected edges follow the exact contour of the object; the results are shown in the following section.

#### V.
EXPERIMENTAL RESULTS

The edge filter techniques are applied to the DRLSFE method and compared as follows. Figure 1 is the input image and Figure 2 is the initial level set function. The outputs for the respective filters after 110 iterations are shown in the following figures: Fig 2a) Gaussian filter, Fig 2b) Prewitt filter, Fig 2c) Sobel filter, Fig 2d) Average filter, Fig 2e) Unsharp filter, Fig 2f) Laplacian filter, Fig 2g) LoG filter. According to this analysis, the Average filter captures the edge information more accurately than the Gaussian filter.

Figure 1. Original input image and initial LSF of the input image.

Fig 2a. 'Gaussian' filter: final level set contour after 110 iterations; still not able to detect the edges exactly.

Fig 2f and Fig 2g.

#### VI. CONCLUSION

In this paper, we have applied more edge filter techniques to the existing DRLSFE method. Among all these cases, applying the Average filter makes the detected edge follow the exact contour of the object, rather than missing some edge information as in the previous method. The present variational level set formulation completely eliminates the need for reinitialization. The proposed level set method can be easily implemented using a simple finite difference scheme and is computationally more efficient. We demonstrate the performance of the proposed algorithm using both simulated and real images, and in particular its robustness to the presence of weak boundaries and strong noise.

#### ACKNOWLEDGMENT

The authors would like to thank Miss CH. Hima Bindu for discussions on the edge techniques of the proposed algorithm, and Mr. P. Joga Rao for his contribution in developing the review of DRLSFE.

#### REFERENCES

- [1] C. Carson, S. Belongie, H. Greenspan, and J. Malik, "Blobworld: Image segmentation using expectation-maximization and its application to image querying," IEEE Trans. Pattern Anal. Mach. Intell., 24(8):1026–1038, 2002.
- [2] V. Caselles, R. Kimmel, and G. Sapiro, "Geodesic active contours," Int. J. Comput. Vis., vol. 22, no.
1, pp. 61–79, Feb. 1997.
- [3] D. Adalsteinsson and J. Sethian, "The fast construction of extension velocities in level set methods," J. Comput. Phys., vol. 148, no. 1, pp. 2–22, Jan. 1999.
- [4] C. Li, C. Xu, C. Gui, and M. D. Fox, "Distance regularized level set evolution and its application to image segmentation."
- [5] C. Li, C. Xu, C. Gui, and M. D. Fox, "Level set evolution without re-initialization: A new variational formulation," in Proc. IEEE Conf. Comput. Vis. Pattern Recognit., 2005, vol. 1, pp. 430–436.
- [6] S. Osher and J. Sethian, "Fronts propagating with curvature-dependent speed: Algorithms based on Hamilton-Jacobi formulations," J. Comput. Phys., vol. 79, no. 1, pp. 12–49, Nov. 1988.
- [7] V. Caselles, F. Catte, T. Coll, and F. Dibos, "A geometric model for active contours in image processing", Numer. Math., vol. 66, pp. 1-31, 1993.
- [8] V. Caselles, R. Kimmel, and G. Sapiro, "Geodesic active contours", Int'l J. Comp. Vis., vol. 22, pp. 61-79, 1997.
- [9] R. Malladi, J. A. Sethian, and B. C. Vemuri, "Shape modeling with front propagation: a level set approach", IEEE Trans. Patt. Anal. Mach. Intell., vol. 17, pp. 158-175, 1995.
- [10] B. B. Kimia, A. Tannenbaum, and S. Zucker, "Shapes, shocks, and deformations I: the components of two dimensional shape and the reaction-diffusion space", Int'l J. Comp. Vis., vol. 15, pp. 189-224, 1995.
- [11] T. Chan and L. Vese, "Active contours without edges", IEEE Trans. Imag. Proc., vol. 10, pp. 266-277, 2001.
- [12] B. Vemuri and Y. Chen, "Joint image registration and segmentation", Geometric Level Set Methods in Imaging, Vision, and Graphics, Springer, pp. 251-269, 2003.
- [13] D. Peng, B. Merriman, S. Osher, H. Zhao, and M. Kang, "A PDE-based fast local level set method", J. Comp. Phys., vol. 155, pp. 410-438, 1999.
- [14] M. Sussman and E.
Fatemi, "An efficient, interface-preserving level set redistancing algorithm and its application to interfacial incompressible fluid flow", SIAM J. Sci. Comp., vol. 20, pp. 1165-1191, 1999.
- [15] L. Evans, Partial Differential Equations, Providence: American Mathematical Society, 1998.
- [16] V. I. Arnold, Geometrical Methods in the Theory of Ordinary Differential Equations, New York: Springer-Verlag, 1983.
- [17] M. Young, The Technical Writer's Handbook. Mill Valley, CA: University Science, 1989.
- [18] D. Kornack and P. Rakic, "Cell Proliferation without Neurogenesis in Adult Primate Neocortex," Science, vol. 294, Dec. 2001, pp. 2127-2130, doi:10.1126/science.1065467.
- [19] H. Goto, Y. Hasegawa, and M. Tanaka, "Efficient Scheduling Focusing on the Duality of MPL Representatives," Proc. IEEE Symp. Computational Intelligence in Scheduling (SCIS 07), IEEE Press, Dec. 2007, pp. 57-64, doi:10.1109/SCIS.2007.357670.

### S - Band Low Noise Amplifier (LNA) Design

#### A.Sreenivasan & Bhavana.G.

Dayanand Sagar College of Engineering, Bangalore, India

E-mail: bhavanahassan@gmail.com

Abstract - In this paper, the design of a Low Noise Amplifier (LNA) is discussed. It is designed to operate in the S band, over a frequency range of 2 to 5 GHz. An LNA is an amplifier used to amplify weak signals. It is placed at the front end of the receiver, immediately after the antenna, and plays a significant role in the receiver. It should be designed with a low noise figure and a large gain so that the overall noise figure of the receiver becomes very low. AT 41470 is the device used. The reflection coefficient of the device should be matched to $50\Omega$ so that maximum power transmission takes place. This is done by using a matching circuit at the input and output sides. Since the amplifier operates at a very high frequency, micro strip lines are used instead of lumped components.
The need for S parameters and a matching circuit is also discussed. Simulation is done using ADS software.

Keywords- LNA; AT 41470; S Parameter; Matching circuit; ADS software

#### I. INTRODUCTION

An LNA is an electronic amplifier used to amplify the weak signals received when a signal is uplinked from the ground station. LNAs are usually placed at the front end of a receiver system, immediately following the antenna. A band pass filter may be required in front of the LNA if there are many adjacent interfering bands leaking through the antenna, but this filter generally degrades the noise performance of the system. The purpose of an LNA is to boost the desired signal power while adding as little noise and distortion as possible, so that the signal can be retrieved in the later stages of the system. As per the Friis formula, the overall noise figure (NF) of the receiver's front end is dominated by the first few stages (or even the first stage only). With an LNA, the effect of noise from the subsequent stages of the receive chain is reduced by the gain of the LNA, while the noise of the LNA itself is directly injected into the received signal. Thus, an LNA must amplify the signal while adding as little noise as possible. A good LNA has a noise figure below 2 dB and a gain between 25 and 30 dB [1].

**Noise:** Noise in electrical systems is defined as random fluctuations in voltage and current. It can be generated internally by components employed in the system, or externally by electrical radiation from other systems or by induced mechanical vibrations [2].

**Noise Figure (NF):** Noise figure is the noise factor expressed in decibels (dB). It is an important figure of merit used to characterize the performance of not only a single component but also the entire system. Noise factor is defined as the input signal to noise ratio divided by the output signal to noise ratio [2].
Figure 1: Basic Block Diagram of Receiver

The overall noise figure of the system is given by the Friis formula [1],

$$F_{total} = F_1 + \frac{F_2 - 1}{G_1} + \frac{F_3 - 1}{G_1 G_2} + \cdots$$

The noise figure of the first stage, $F_1$, dominates the overall noise performance if the gain of the first stage, $G_1$, is sufficiently high. The LNA should have a minimum noise figure, since it is the first stage of the receiver. If the LNA is designed with minimum noise figure and maximum gain, then the overall noise figure of the system can be kept low. However, a low noise figure and a large gain cannot be achieved simultaneously in a single stage. For this reason, a two-stage LNA is designed. The first stage is designed for minimum noise figure, at the cost of lower gain. The second stage is designed for maximum gain, at the cost of a higher noise figure than the first stage. When the two stages are cascaded, the overall noise figure is low and the gain is high [3].

**S Parameters:** The S-parameter (scattering parameter) expresses device characteristics using the degree of scattering when an AC signal is considered as a wave. The word "scattering" is a general term that refers to reflection back to the source and transmission in other directions. S parameters are a very convenient way of characterizing RF devices. Total current and voltage are very hard to measure in RF devices. Some older parameters (Z, Y and H) require physical shorts or opens, and physical devices could actually self-destruct when connected to a short or open. S-parameters are important in microwave design because they are easier to measure and to work with at high frequencies than other kinds of two-port parameters [4].

S-parameters are a useful method for representing a circuit as a "black box". A "black box" or network may have any number of ports. S-parameters are measured by sending a single-frequency signal into the network or "black box" and detecting what waves exit from each port.
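The Friis formula quoted earlier in this section can be checked numerically. The sketch below is a minimal illustration; the stage values (1.5 dB / 12 dB and 3 dB / 15 dB) are hypothetical numbers chosen only to show the effect of stage ordering and are not the simulated results of this design.

```python
import math


def db_to_linear(value_db):
    """Convert a power ratio from dB to a linear ratio."""
    return 10 ** (value_db / 10.0)


def linear_to_db(ratio):
    """Convert a linear power ratio to dB."""
    return 10.0 * math.log10(ratio)


def cascaded_noise_figure_db(stages):
    """Overall noise figure (dB) of cascaded stages using the Friis formula.

    stages is a list of (noise_figure_db, gain_db) tuples, first stage first.
    """
    total_factor = 0.0
    gain_before = 1.0  # product of the linear gains of all preceding stages
    for index, (nf_db, gain_db) in enumerate(stages):
        factor = db_to_linear(nf_db)
        if index == 0:
            total_factor = factor
        else:
            total_factor += (factor - 1.0) / gain_before
        gain_before *= db_to_linear(gain_db)
    return linear_to_db(total_factor)


# Hypothetical stage data: (noise figure in dB, gain in dB).
low_noise_stage = (1.5, 12.0)
high_gain_stage = (3.0, 15.0)

low_noise_first = cascaded_noise_figure_db([low_noise_stage, high_gain_stage])
high_gain_first = cascaded_noise_figure_db([high_gain_stage, low_noise_stage])
print(round(low_noise_first, 2), round(high_gain_first, 2))
```

Placing the low-noise stage first gives an overall noise figure of about 1.69 dB, only slightly above that of the first stage alone, whereas reversing the order pushes it above 3 dB. This is the reason the low-noise stage is placed first in a two-stage LNA.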
Power, voltage, and current are considered as waves travelling in both directions. For a wave incident on port 1, part of the signal is reflected back from the port and part of it passes through the port. The signal graph given below interprets the S parameters in terms of voltage.

Figure 2: Two Port Network

The parameters are defined as [5]:

$$S_{11} = \frac{b_1}{a_1} = \frac{V_{\text{reflected at port 1}}}{V_{\text{towards port 1}}}, \quad \text{when } a_2 = 0$$

$$S_{12} = \frac{b_1}{a_2} = \frac{V_{\text{out of port 1}}}{V_{\text{towards port 2}}}, \quad \text{when } a_1 = 0$$

$$S_{21} = \frac{b_2}{a_1} = \frac{V_{\text{out of port 2}}}{V_{\text{towards port 1}}}, \quad \text{when } a_2 = 0$$

$$S_{22} = \frac{b_2}{a_2} = \frac{V_{\text{reflected at port 2}}}{V_{\text{towards port 2}}}, \quad \text{when } a_1 = 0$$

**Matching:** Matching is the act of making the source and load impedances matched to achieve the desired amount of power reflected and power transferred. Matching is required if the circuit is to yield optimum gain and return loss. Poorly matched devices can cause a large amount of reflected power, poor noise performance, and low gain. For an LNA, power reflected because of an improper input match can travel back to the antenna and be re-radiated. A poor input match can also reduce the gain of the LNA and cause the system to have non-optimum noise performance. Simple matching networks can be designed with the help of the Smith chart, but more complicated ones often require the use of a computer and some type of network synthesis software. The standard input and output impedances of most microwave instruments are matched to $50\Omega$ [5]. The matching circuit for the device is shown in Figure 3.

Figure 3: Typical LNA Topology

Steps to be followed for the matching procedure on the Smith chart:

- 1. Locate the mismatch.
- 2. Find the wavelength at mismatch.
- 3. Move down the transmission line toward the generator until it intersects the unity circle.
- 4. Find the number of wavelengths moved down the transmission line.
- 5.
Add the capacitance. - 6. Determine the capacitance by using the formula: #### II. LNA Design ADS software is used to design low noise amplifiers.AT 41470 is the device used. From the datasheet and as per the requirements given below are the LNA specifications. | Frequency | 2GHz - 5GHZ | |----------------------------|---------------------| | Carrier Tracking Range | ± 125 kHz | | Carrier acquisition range | -135 dBm to -70 dBm | | Noise Figure | <2 db | | Gain | As high as possible | | Source and Load | | | Impedances to be Matched | 50Ω | | Relative Permittivity | 10.2 | | Thickness of the substrate | 17μm | | Height of the substrate | 1.27mm | | Device Used | AT 41470 | Optimum Reflection Coefficient ( $\Gamma_{opt}$ ) for the device (from the datasheet) is $0.21 \perp 160$ [7]. When plotted on smith chart $\Gamma_{opt}$ is not matched to $50\Omega$ , therefore matching circuit should be added at both input and output side. Since LNA operates at S band (GHz), lumped inductors and capacitor becomes difficult. One of the most common matching techniques is to use stubs to provide shunt capacitance or inductance. Shorted stub provides an inductance, whereas an open circuit stub provides a capacitance [5]. ADS software is used to bring the $\Gamma_{opt}$ point on the unity circle of the smith chart by tuning the length of the micro strip lines (stub lines). The device (AT41470) is represented by two port network. For this, first the S parameters file of the device should be attached to the two port network, then matching circuit should be added, then the length of the strip lines should tuned at both input and output side such that $\Gamma_{\rm opt}$ is moved towards the unity circle on the smith chart. By doing this, maximum power gets transmitted; only small portion will be reflected. First, two single stage LNA's are designed, one with less noise figure while the other with large gain. 
Later these two LNAs are combined to form the two-stage LNA and then tuned slightly so that the overall noise figure is low and the gain is high. After this, the biasing circuit for the device is added.

#### Biasing circuit design:

From the datasheet [7],

$$V_{CC} = 15\ \text{V},\quad V_{CE} = 8\ \text{V},\quad I_C = 10\ \text{mA},\quad I_{CBO} = 0.2\ \mu\text{A},\quad V_{BE} = 1\ \text{V},\quad h_{FE} = 150$$

$$I_B = \frac{I_C}{h_{FE}} = 66.7\ \mu\text{A}$$

Assuming $V_{BB} = 10\,\%$ of $V_{CC} = 1.5\ \text{V}$,

$$R_B = \frac{V_{BB} - V_{BE}}{I_B} = 7.5\ \text{k}\Omega$$

Assuming that

$$I_{BB} = 5 I_B = 333.3\ \mu\text{A}$$

$$R_2 = \frac{V_{BB}}{I_{BB}} = 4.5\ \text{k}\Omega$$

$$R_1 = \frac{V_{CE} - V_{BB}}{I_B + I_{BB}} = 16.25\ \text{k}\Omega$$

$$R_C = \frac{V_{CC} - V_{CE}}{I_C + I_{BB} + I_B} \approx 673\ \Omega$$

The biasing circuit for the two stages is shown in Figure 4.

Figure 4: Biasing Circuit

The two-stage LNA with matching and biasing circuits is shown in Figure 5.

Figure 5: LNA circuit

In the circuit, the upper part forms the matching circuit for the LNA. The lower part forms the biasing circuit for the device.

ADS software: Advanced Design System (ADS) is an electronic design automation software system produced by Agilent EEsof EDA. Agilent ADS supports every step of the design process, including schematic capture, layout, frequency-domain and time-domain circuit simulation, and electromagnetic simulation, allowing the engineer to fully characterize and optimize an RF design without changing tools [8].

**Procedure:** The following steps are used to create the schematic diagram in ADS.

- Create a new project, then open a new schematic window.
- In the schematic window, first add the two-port network which represents the device.
- Get MLIN from the Tlines-Microstrip category; it forms the inductor for the matching circuit.
- Then add MLOC, which represents the open-circuit stub.
- Then calculate the width of the stub lines by using LineCalc.
- MLIN and MLOC are connected by using the MTEE element.
- Similarly, add the other required components.
- Tune the length of the stub lines until the desired result is achieved [9].

#### III. EXPERIMENTAL RESULTS

After simulation, a gain of 25 dB and a noise figure of 1.984 dB are obtained; the corresponding graphs are shown in Figure 6.

Figure 6: Simulated Results

#### IV. PERFORMANCE ANALYSIS

From the graph, it can be seen that the gain is around 25 dB and the return loss is -13 dB at the working frequency. Return loss specifies how much power is reflected, and as seen it is very low. At the image frequency the gain should be very low, otherwise the information will be distorted. From the Friis formula, the receiver noise figure is

$$F_{total} = F_1 + \frac{F_2 - 1}{G_1} + \frac{F_3 - 1}{G_1 G_2} + \cdots$$

The noise figure of the two-stage LNA is 1.9 dB. According to the Friis formula, the gain and noise figure of the first stage dominate. Since the gain appears in the denominator, the noise contribution of each later stage decreases as the gain of the preceding stages increases. From the formula, it can be seen that only the noise figure of the first stage dominates. As can be seen in the graph, the noise figure is very low. From Figure 1, the LNA output is given to the first mixer, whose noise figure is about 7 dB. Since the noise contribution of the further stages decreases, it becomes negligible. Therefore, by considering the first two stages,

Overall noise figure = 1.97 dB

It can be seen that the overall receiver noise figure is below 2 dB. This is possible only if the noise figure of the LNA is very low.

#### V. CONCLUSION

The LNA developed for S-band frequencies gives the required gain and noise figure. The overall noise figure of the receiver is also low, and maximum power transmission takes place. Further, this LNA design can be modified for X-band frequencies.

#### REFERENCES

- [1] Prof. S. Long, Design of Low Noise Amplifiers, 2007.
- [2] A. Dao, Integrated LNA and Mixer Basics, National Semiconductor, Application Note 884.
- [3] Spencer, R.R., "Noise in Electronic Devices, Circuits, and Systems", University of California, Davis, 1991.
- [4] Franz Sischka, Basics of S-Parameters (part 1), characterization handbook, 1SBASIC1.doc.
- [5] Allan W. Scott, Understanding Microwaves, John Wiley & Sons, Inc., 1993.
- [6] Gonzalez, Guillermo, Microwave Transistor Amplifiers, Prentice-Hall, Inc., 1984.
- [7] Datasheet: AT41470 Up to 6 GHz Low Noise Silicon Bipolar Transistor, Hewlett-Packard, http://www.home.agilent.com/.
- [8] ADS Code Book for Microwave Amplifiers, Agilent Technologies.
- [9] Cynthia Furse, Tutorial for Automated Design System (ADS).

## Efficient Detection of Attacks Using Clusters in Supervisor-based Network Intrusion Detection System

#### Deepak Kshirsagar, Anuradha Saini & Neelam Malik

Department of Computer Engineering and Information Technology, College of Engineering, Pune (COEP), Shivajinagar, Pune, Maharashtra, India.

E-Mail: {kdeepak83, sainianu08, neelammalik77} @gmail.com

Abstract- A network intrusion detection system (NIDS) is a device or software application that monitors network and/or system activities for malicious activities or policy violations and produces reports to a management station. It is an important tool for information security. Most existing commercial NIDS products are signature-based or classifier-based, with the strong disadvantage of not being adaptive. Our paper proposes an adaptive and efficient NIDS using the clustering approach of data mining. Network traffic has a definite behaviour that can be precisely captured using data mining approaches. Intrusion detection is performed in networks by comparing a set of baselines of the system with the present behaviour of the system [3]. Thus, a basic assumption is that the normal and abnormal behaviours of the system can be characterized, which brings out the difference between "normal" and "attack" traffic.
Current research concentrates on single-engine detection systems, whereas our proposed system is constructed from a number of Supervisors, which are totally different in both training and detection processes. Using a clustering algorithm, each type of packet is clustered under the corresponding Supervisor formed after clustering. Each Supervisor captures a specific type of network behaviour, and hence the system is strong at detecting different types of attacks and is also able to detect new types of attacks. The results show that using network traffic patterns as unswerving supervisors is more efficient than traditional signature-based NIDS.

Keywords-clusters; supervisor; NIDS; detection; attack.

#### I. INTRODUCTION

Intrusion detection is an important technology in the business sector as well as an active area of research. It is an important tool for information security. Security of network systems is becoming increasingly important as more and more sensitive information is stored and manipulated online. Network Intrusion Detection Systems (NIDS) have thus become a critical technology to help protect these systems, acting as another wall of protection. Most of the existing commercial NIDS are signature-based and not adaptive. Learned models can also be generated more quickly and in a more automated way than manually encoded models, which require difficult analysis of audit data by domain experts. The major problem with the signature-based approach is that these IDSs fail to generalize to detect new attacks or attacks without known signatures. The unsupervised anomaly detection approach addresses this problem by making use of data clustering algorithms, which make no assumption about the labels or classes of the patterns. Unlike the other two methods, these approaches can detect newly emerging threats. The advantage of supervised methods for anomaly detection is that the labelling procedure of the training data helps in detecting attacks directly, by applying the rules to already classified and labelled data without scanning and validating the rules against every packet, even those that are not related to these rules. Hence the approach is fast enough for detecting newer and older attacks in continuously flowing real-time network traffic. The proposed NIDS combines the efficiency of both signature-based and anomaly-based detection and is constructed from different types of supervisor [1]. There are three types of supervisor, formed by clustering according to the type of packet. The normal behaviour of a network can be profiled, and anomalous traffic can easily be detected with the help of the network profile. In addition, the system can adapt to changes in the network automatically through the adaptive learning of the supervisors.

This paper is organized into five sections. Section 2 contains a discussion of related work, while the proposed supervisor-based NIDS is introduced in Section 3. Results are reported in Section 4, and the final section concludes the paper.

#### II. RELATED WORK

Earlier NIDSs mostly used signature-based approaches and were therefore able to find only old attacks, not new types of attack as they emerge. An Intrusion Detection System (IDS) needs to be updated in response to new attack methods or upgraded computing environments. Since many current IDSs are constructed by manual encoding of expert security knowledge, changes to IDSs are expensive and slow. In this paper, we describe a clustering technology, which is a data mining approach, for adaptively building intrusion detection (ID) models. The central idea is that the anomaly-based clusters detect new attacks by utilizing auditing programs to extract an extensive set of features that describe each network connection or host session, and by applying a clustering method to learn rules that accurately capture the behaviour of intrusions and normal activities.
Therefore, many researchers have proposed and implemented different intrusion detection models based on data mining techniques to tackle this problem. In this section, a brief review of current work is given.

According to [1], the basic element of an IDS is the audit log that captures the system activity. The following layers of operation can be easily identified:

1) Operating System: The logs in this layer contain information from the kernel and other operating system components and help determine whether an attacker is trying to compromise the OS.

2) Network: At the network layer, communication data is analyzed to determine whether an attacker is trying to access one's network. Examples of IDSs that operate on this layer include NADIR (Network Audit Director and Intrusion Reporter) [2].

3) Application: Application-level IDSs examine the operations executed in an application to ascertain whether the application is being manipulated to produce behaviour that is prohibited. An example is Janus [3].

Over the past ten years, intrusion detection and other security technologies such as cryptography, authentication, and firewalls have increasingly gained in importance [4]. However, intrusion detection is not yet a perfect technology [5]. This has given data mining the opportunity to make several important contributions to the field of intrusion detection. The authors of [6] have proposed an approach in which an adaptive IDS defines a set of data dependency rule sets, based on changing access roles and maintained in a repository, to identify malicious transactions. In [7], the author discusses a framework for continuously adapting the intrusion detection system for a computer environment as it is upgraded. NIDS need to be accurate, adaptive, and extensible. [8] presents the features of signature-based NIDS in addition to the current state of the art of data-mining-based NIDS approaches.
There are three types of issues in NIDS: accuracy, efficiency, and usability. The work in [9] tries to improve all of them by using a classifier as the data mining approach. The authors of [10] have described an incremental technique that uses association rules and classification techniques to detect attacks. It does not use the entire data set to mine rules. The rules are categorized according to the time of the day and the day of the week. An incremental on-line algorithm is used to detect rules that receive strong support during a sliding window of pre-determined size. The authors of [11] apply one of the efficient data mining algorithms, called naïve Bayes, to anomaly-based network intrusion detection. Experimental results on the KDD Cup '99 data set show that this approach is highly effective in detecting network intrusions. NIDS need to be accurate, adaptive, and extensible. The authors of [12,13] have developed an IDS, with an overview of two general data mining algorithms that have been implemented: association rules [14,15] and frequent episodes [16]. With new types of attacks appearing continually, flexible and adaptive systems are needed, and there are still problems to be resolved in order to provide better security against new types of attack. [17] designed an effective anomaly-based IDS by combining techniques of data mining and expert systems. Combining methods may give better coverage and make the detection more effective. [18] incorporates detection models for new intrusions or specific components of a network system into an existing IDS through a meta-learning (or co-operative learning) process, which produces a meta detection model that combines evidence from multiple models. Mining Audit Data for Automated Models for Intrusion Detection is described in [19]. [20] introduces a new type of clustering-based algorithm for unsupervised anomaly NIDS, which trains on unlabelled data sets in order to detect new intrusions. In [21], signature discovery in NIDS is supported using a data-mining-based approach. Furthermore, [22] discusses outlier detection algorithms used in data mining systems.

In this paper, an adaptive NIDS based on various clustering techniques is proposed. Unlike most current research, in which only one engine is used for detection of various attacks, the proposed system is constructed from a number of supervisors, which are totally different in both training and detection processes, and it retains the advantage of the signature-based method. Our system utilizes the advantages of both methods to detect new and old attacks.

#### III. PROPOSED DESIGN OF SUPERVISOR-BASED NIDS

The proposed NIDS is composed of three modules: the Trainer, the Clustering-Based Detection Engine (Supervisor-Based Agent), and the Rule Database.

#### A. System architecture

Data mining approaches can be used to implement clustering techniques for the formation of the respective Supervisors. Packets of each type are clustered under the corresponding Supervisor that is formed. This model is intended to improve efficiency. The clustering algorithm is an explorative data mining technique, and its function is as described by the name itself. The algorithm places packets with similar network behaviour into the same cluster, while dissimilar ones fall into separate clusters. It then assigns each observation to a cluster based upon the observation's proximity to the cluster's partitioning requirement.

- A partitioning method classifies the data into k groups, which together satisfy the requirements of a partition:
  - each group must contain at least one packet;
  - each packet must belong to exactly one group.
- This implies k <= n.
- k is given by the user (a parameter that is hard to determine at the beginning).

Thus, given the desired number of clusters k, a dataset of n points, and a distance-based measurement function, the task is to find a partition of the dataset that minimizes the value of the measurement function. Figure 1 shows the overall system architecture.
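The partitioning requirements above can be illustrated with a minimal k-means sketch. Table II of this paper names k-means as the clustering algorithm, but the code below is only an illustration under stated assumptions, not the authors' implementation: the function name `kmeans`, the two-feature packet vectors (packet length, TTL), the Euclidean distance measure, and the choice of the first k points as initial centroids are all hypothetical. The sketch also does not enforce the "at least one packet per group" requirement; it simply keeps the previous centroid if a group becomes empty.

```python
import math


def kmeans(points, k, max_iter=100):
    """Partition `points` (equal-length tuples) into k groups.

    Returns (labels, centroids), where labels[i] is the single group
    that points[i] belongs to.
    """
    n = len(points)
    if not 1 <= k <= n:
        raise ValueError("k must satisfy 1 <= k <= n")

    # Assumption: seed the centroids with the first k points so that
    # the illustration is deterministic.
    centroids = [tuple(p) for p in points[:k]]
    labels = [0] * n

    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centroid,
        # so every point belongs to exactly one group.
        new_labels = [
            min(range(k), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]

        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, new_labels) if lab == j]
            if members:  # an empty group keeps its previous centroid
                centroids[j] = tuple(
                    sum(coord) / len(members) for coord in zip(*members)
                )

        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels

    return labels, centroids


# Hypothetical feature vectors: (packet length in bytes, TTL).
packets = [(60, 64), (1500, 128), (62, 64), (1480, 128), (58, 63), (1490, 127)]
labels, centroids = kmeans(packets, k=2)
print(labels)
print(centroids)
```

In practice the features would normally be scaled before clustering, since a wide-ranging feature such as packet length otherwise dominates the Euclidean distance.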
Figure 1. Architecture of Supervisor-based NIDS

#### 1) Features Extraction

The required features are extracted from the captured packets and classified by corresponding functions for each kind of statistic; hence, there is high flexibility. Currently, the system supports the following extracted packet features.

1. Source port
2. Destination port
3. Time To Live (TTL)
4. Window Size
5. Packet Length
6. Number of packets in a packet
7. Threshold count of 'set' specific flag bit
8. Number of connections attempted to open in a packet, etc.

The more the packet features are extracted, the more system becomes.

#### 2) Supervisor Based Detection Engine

It consists of the Feature Distributor, the respective feature Supervisors, and the Decision Maker that produces the alarm.

- Feature Distributor: Clustering of packets under the respective Supervisors is done by analysing packet features, based on the selected packet feature.
- Supervisor: Based on the selected packet features, specific Supervisors are designed. Each Supervisor performs calculations using a particular threshold value to find how much a candidate cluster varies from normal. For example, based on the packet protocol type, the respective packets are clustered under the corresponding Supervisors.
- Decision Maker: If the distance is larger than a threshold, the cluster is regarded as an intrusion; otherwise it is regarded as normal.

Figure 2. Architecture of the Clustering-Based Detection Engine

The Trainer updates the clustering-based detection engine with the newly added rules. A Feature Distributor assigns the necessary feature vectors to the Trainer. Each training rule is built with a corresponding data mining approach and updates the corresponding clusters.

#### 3) Trainer

The Trainer profiles the normal behaviour of the network and reforms the corresponding clusters in accordance with the respective rules. The Supervisor-Based detection engine is based only on normal network traffic behaviour and deviations from it.
The adaptive ability of this model to the environment is expected to be high, but so is its false alarm rate, since legitimate packets may be flagged as attacks. Figure 3 shows the structure of a Trainer.

Figure 3. Architecture of the Trainer

#### B. Clustering Algorithm in Anomaly Detection

Clustering-Based Detection Engine. The network traffic behaviour pattern is recorded by taking network packets as input vectors, and the normal network traffic behaviour is collected in isolated clusters during training. When abnormal network behaviour is encountered while clustering, its traffic pattern is compared to the respective thresholds of the trained clusters. If the deviation from the normal threshold is too large compared to the normal clusters, the traffic is suspected to be abnormal behaviour and an attack alarm is raised to signal a potential attack; otherwise the traffic is treated as normal.

1) Feature selection. Only the required feature sets for the different clusters are shown in Table I. The features selected are specific to quantity-based attacks, such as TCP SYN flood and denial-of-service attacks, as listed below.
TABLE I. FEATURE SETS FOR CLUSTERED SUPERVISORS

| Supervisor TCP | Supervisor UDP | Supervisor ARP |
|-------------------|-------------------|-------------------|
| Number of unique ports accessed | Number of unique ports accessed | Number of unique ports accessed |
| Mean packet size | Mean packet size | Mean packet size |
| Source port | Source port | Source port |
| Destination port | Destination port | Destination port |
| Window size | Window size | Window size |
| Time To Live | Time To Live | Time To Live |

2) Computation of the threshold feature vector of each cluster. In our proposed Supervisor-Based NIDS, the threshold unit is taken as the mean of each selected feature set, which is eventually used to detect abnormal variation from the normal traffic pattern.

3) Training phase: forming and updating of clusters. The newly arrived packets are trained and divided into clusters depending upon the previous traffic pattern records.

4) Decision Maker. The decision is based on how far a candidate cluster varies from the threshold unit of normal traffic. If the variation is larger than the threshold, the cluster is regarded as an intrusion; otherwise it is regarded as normal.

#### IV. EXPERIMENTS

The proposed NIDS is examined to test and analyse its efficiency, and various types of attack are tested to evaluate the strengths and limitations of each Supervisor.

Figure 4: Efficiency of Detection

#### A. Experiment Parameters

Incoming network packets are captured to record the normal network behaviour pattern. In the Supervisor-based detection engine, the clustering approach was adopted. The threshold value is initialised to 0 and the maximum loop count is 100. The combinations of supervisors are represented in Table II.

TABLE II. CORRESPONDING SUPERVISORS WITH THE ASSIGNED RULE

| Supervisor / Rule | Description |
|------------|---------------------------------|
| TCP Supervisor | Cluster of TCP packets using k-means algorithm |
| UDP Supervisor | Cluster of UDP packets using k-means algorithm |
| ARP Supervisor | Cluster of ARP packets using k-means algorithm |
| Rule 1 | Recorded pattern by TCP supervisor for attack traffic |
| Rule 2 | Recorded pattern by UDP supervisor for attack traffic |

Table II shows the supervisors formed by a sample clustering algorithm with 'protocol type' as the selected feature. Each Supervisor analyses the incoming real-time packets for abnormal deviation from the defined threshold unit of its cluster. If a significant deviation is observed, the respective traffic pattern is used to devise a rule for the specific attack, and the rules formed are then added to the Rule Database. The Rule Database is then consulted by the supervisor-based detection engine to find intrusions in the incoming network traffic. Hence, updating the Rule Database reduces the time needed to detect newly prevailing attacks and improves the efficiency of detecting zero type attacks, i.e. new attacks. Our proposed architecture integrates the strengths of both rule-based and clustering-based supervisors, thereby improving the efficiency of detection of old as well as new attacks, with reduced time overheads.

#### B. Results

Since rule-based and signature-based NIDS can detect only known types of attack, they are incapable of detecting new types of attack. In today's world new attacks are developing at a fast pace, thereby posing new threats. Each clustering-based supervisor performing anomaly detection examines a specific kind of traffic; hence there is a faster detection rate in each supervisor and a lower false alarm rate from certain supervisors.
At the decision maker, the detection rate can be improved further by applying the rules. The benchmark traditional rule-based NIDS proves to have a lower detection rate for attack types without a signature, while the proposed NIDS, based on normal traffic, has high potential for capturing "new" attacks by combining the clustering approach with supervisors performing anomaly-based detection.

TABLE III. COMPARISON OF CORRESPONDING SUPERVISORS WITH THE ASSIGNED RULES

| Attack | TCP Supervisor | UDP Supervisor | ARP Supervisor | Rule 1 | Rule 2 |
|--------------|------|------|-----|-------|-------|
| TCP LAND | 98.4 | 0.0 | 0.0 | 98.25 | 0.0 |
| DoS | 99.1 | 0.0 | 0.0 | 99 | 0.0 |
| UDP flood | | 98.6 | 0.0 | 0.0 | 98.3 |
| TCP SYN | 99.6 | 0.0 | 0.0 | 95.5 | 0.0 |

The efficiency is analysed using detection rates and false alarm rates. Attacks are first tested with signature-based detection, and the remaining unsuspected packets are then passed to the clustering-based supervisors. Hence, taking the average over both the clustering-based detection supervisors and the signature-based detection supervisors, the rate of accurately finding attacks increases.

Figure 5: Comparison of Detection Rates of Traditional NIDS with proposed Clustering-Based NIDS.

Figure 6: Comparison of Detection Rates of Traditional NIDS with proposed Clustering-Based NIDS.

#### V. CONCLUSION

Earlier NIDS products were signature-based, with the major disadvantage of failing to detect newly created attacks. Then there were classifier-based approaches, with another set of problems: the overhead of classification and a lack of adaptivity. Our paper proposes an adaptive and efficient NIDS using the clustering approach of data mining.
Network traffic has a definite behaviour that can be precisely captured using data mining approaches. Intrusion detection is performed in networks by comparing a set of baselines of the system with the present behaviour of the system [3]. Our proposed system is constructed from a number of Supervisors, which are totally different in both training and detection processes; it is strong at detecting different types of attacks and is also able to detect new types of attacks. The results show that using network traffic patterns as unswerving supervisors is more efficient than traditional signature-based NIDS. For future development, the following directions are proposed: (i) to develop more supervisors which are strong in other aspects, (ii) to have the system set the thresholds with no human interference, and (iii) to introduce an incremental updating mechanism for the detection supervisors.

#### REFERENCES

- [1] Tyrone Grandison and Evimaria Terzi, "Intrusion Detection Technology," IBM Almaden Research Center, 650 Harry Road, San Jose, CA 95120, {tyroneg, eterzi}@us.ibm.com, September 7, 2007.
- [2] J. Hochberg, J. Jackson, C. Stallings, J.F. McClary, D. Dubois, and J. Ford. NADIR: an automated system for detecting network intrusion and misuse. Computer Security, 12(3):235-248, May 1993.
- [3] Ian Goldberg, David Wagner, Randi Thomas, and Eric Brewer. A secure environment for untrusted helper applications (confining the wily hacker). In Sixth USENIX Security Symposium, 1996.
- [4] Allen, J., Christie, A., Fithen, W., McHugh, J., Pickel, J., and Stoner, E. (2000). State of the Practice of Intrusion Detection Technologies. Technical report, Carnegie Mellon University.
- [5] Lippmann, R. P., Fried, D. J., Graf, I., Haines, J. W., Kendall, K. R., McClung, D., Weber, D., Webster, S. E., Wyschogrod, D., Cunningham, R. K., and Zissman, M. A. (2000). Evaluating Intrusion Detection System.
- [6] Sujaa Rani Mohan, E.K.
Park, Yijie Han, "An Adaptive Intrusion Detection System using a Data Mining Approach," University of Missouri, Kansas City.
- [7] W. Lee, S.J. Stolfo, K.W. Mok, "Data mining approaches for intrusion detection," Proceedings of the 7th USENIX Security Symposium, 1998.
- [8] A Survey Rasha G. Mohammed Helali Srinivasan, Jagjit Singh, Vivek Kumar, "Data Mining Based Network Intrusion Detection System," IJCSI International Journal of Computer Science Issues, Vol. 8, Issue 4, No 2, July 2001.
- [9] Wenke Lee, Salvatore J. Stolfo, Philip K. Chan, Eleazar Eskin, Wei Fan, Matthew Miller, Shlomo Hershkop and Junxin Zhang, "Real Time Data Mining-based Intrusion Detection," Computer Science Department, North Carolina State University, Raleigh, NC 27695.
- [10] D. Barbara, S. Jajodia, and N. Wu, Mining unexpected rules in network audit trails, Personal communications, 2000.
- [11] Mrutyunjaya Panda and Manas Ranjan Patra, "Network Intrusion Detection Using Naïve Bayes," IJCSNS International Journal of Computer Science and Network Security, Vol. 7, No. 12, December 2007.
- [12] G.W. Lee and S. J. Stolfo, "Data Mining Approaches for Intrusion Detection," In Proceedings of the Seventh USENIX Security Symposium (SECURITY '98), San Antonio, TX, January 1998.
- [13] W. Lee, S. Stolfo and K. Mok, "Mining Audit Data to Build Intrusion Detection Models," In Proceedings of the Fourth International Conference on Knowledge Discovery and Data Mining (KDD '98), New York, NY, August 1998.
- [14] R. Agrawal, T. Imielinski and A. Swami, "Mining Associations between Sets of Items in Massive Databases," In Proceedings of the ACM-SIGMOD Int'l Conference on Management of Data, Washington D.C., pp. 207-216, May 1993.
- [15] R. Agrawal and R. Srikant, "Fast Algorithms for Mining Association Rules," In Proceedings of the 20th Int'l Conference on Very Large Databases (VLDB94), Santiago, Chile, September 1994.
- [16] H. Mannila, H. Toivonen and A. I.
Verkamo, "Discovery of Frequent Episodes in Event Sequences," Data Mining and Knowledge Discovery 1(3), pp. 259-289, 1997.
- [17] A.S. Sodiya (University of Agriculture, Abeokuta, Ogun State, Nigeria), H.O.D. Longe (University of Agriculture, Abeokuta, Ogun State, Nigeria), A.T. Akinwale, "A new two-tiered strategy to intrusion detection".
- [18] Wenke Lee, Salvatore J. Stolfo, Kui W. Mok, "A Data Mining Framework for Building Intrusion Detection Models," Computer Science Department, Columbia University, 500 West 120th Street, New York, NY 10027.
- [19] W. Lee and S. J. Stolfo, "A framework for constructing features and models for network intrusion detection systems," ACM Transactions on Information and System Security (TISSEC), Vol. 3, Issue 4, pp. 227-261, November 2000.
- [21] S. Staniford-Chen, S. Cheung, R. Crawford, M. Dilger, J. Frank, J. Hoagland, K. Levitt, C. Wee, R. Yip, and D. Zerkle, "GrIDS: A graph based intrusion detection system for large networks," In Proceedings of the 19th National Information Systems Security Conference, 1996.
- [22] H. Han, X. L. Lu, J. Lu, C. Bo, and R. L. Yong, "Data mining aided signature discovery in network-based network intrusion detection system," ACM SIGOPS Operating Systems Review, Vol. 36, No. 4, pp. 7-13, October 2002.
- [23] M. I. Petrovskiy, "Outlier Detection Algorithms in Data Mining Systems," Source Programming and Computing Software, Vol. 29, Issue 4, pp. 228-237, J

## **Application of Wireless Sensor Networks In Farming**

#### Priyanandhan. R

HCL Technologies, Chennai

E-Mail: priyanrece@gmail.com

Abstract - This paper proposes an idea for effective monitoring and control of crops using Wireless Sensor Networks. With the help of wireless sensor nodes, we transmit information from the sensor nodes to the gateway node using the 6LoWPAN protocol, which carries compressed Internet Protocol version 6 (IPv6) packets. From the gateway node, the information is sent to the integrated system via 3G.
The information is processed in the integrated system and can be viewed on an LCD screen. There is a console to operate the motor and spray fertilizers.

**KEYWORDS-** Wireless Sensor Networks (WSN), Internet Protocol Version 6 (IPv6), 3G Networks, Sensor Nodes and Gateway Nodes.

#### I. INTRODUCTION

Agriculture is the backbone of any nation, and for a country like India, agriculture is the prime occupation. In this modern age where technology has revolutionised the world, farmers must also be equipped with state-of-the-art devices. Farmers walk for a few kilometres in hostile conditions to their farms to switch on their motors; instead, they could switch them on at the press of a button from their homes. Fertilizers can be sprinkled on the farmland, again at the press of a button. A temperature sensor is placed in the farm and monitors the temperature of the soil. If the temperature rises above the threshold temperature, an alarm sounds and a red light is shown. The farmer can then decide to switch on the motor. If the temperature falls below the threshold value, the farmer is again notified by an alarm and a red light. A pH sensor is placed in the farm so that the farmer gets information about the soil pH, which appears on the LCD screen. By monitoring critical parameters such as temperature and soil pH, the farmer goes to the field only when real manual work is needed and can get the best crop output per hectare of land.

#### II. WIRELESS SENSOR NETWORKS

Wireless Sensor Networks are used because measurement parameters such as temperature and soil pH have to be monitored and controlled. Wiring poses safety hazards, and labour costs are high. Also, accurate monitoring can be achieved by using a WSN. Thus, Wireless Sensor Network technology would be the best alternative for distributed data collection and monitoring on agricultural farms.
A key feature of Wireless Sensor Networks is that the sensor nodes that collect data are very small, so many sensors can be mounted. The base station consists of a processor in the form of an integrated device that collects and processes the data and displays it on an LCD screen. A WSN is cheap and easy to maintain, fast, and easily relocatable. The objective of our system is to monitor the necessary variables, such as temperature and soil pH, and to ensure that they stay within the required range. The sensors should be easy to relocate, and a base station is needed to process the collected data.

#### III. WIRELESS COMMUNICATION

Wireless communication is required between the sensor nodes and the base station, so batteries are used to power the nodes. The 6LoWPAN protocol can be used because it consumes little power and only a small amount of data is transmitted. 6LoWPAN enables the transmission of compressed Internet Protocol version 6 (IPv6) packets over IEEE 802.15.4 networks.

#### IV. DESIGN AND DEVELOPMENT

The aim of the design is to increase the quality and productivity of plant growth. The farmer can set reference values for temperature and soil pH. These values can be adjusted in software when the need arises, such as when a different type of crop is grown. The measurement points should also be mobile. Crop rotation and changes in climatic conditions can be accommodated easily by changing the reference values. The system is also fast, easy and cheap to install.

Fig.1 Overall Design Setup

The development phase involves wireless sensor nodes, a base station and actuators. The measured parameter values are converted from analog to digital at the node. The data is then compressed into packets and transmitted over IEEE 802.15.4 networks directly to the gateway node, which is connected to the integrated system.
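The conversion and packing step described above can be illustrated with a small sketch. It is not the authors' firmware: the 2.5 V reference voltage, the 5-byte payload layout and all function names are assumptions made for illustration, and the sketch covers only the application payload, not 6LoWPAN header compression. The 12-bit resolution matches the A/D converter of the MSP430F1611 named in Section VII.

```python
import struct

ADC_BITS = 12            # the MSP430F1611 provides a 12-bit A/D converter
V_REF = 2.5              # assumed reference voltage in volts (hypothetical)
PAYLOAD_FORMAT = "<BHh"  # hypothetical layout: node id, ADC count, temperature in 0.01 degC


def adc_count(voltage, v_ref=V_REF, bits=ADC_BITS):
    """Quantize an analog voltage to an unsigned ADC count, clamped to the valid range."""
    full_scale = (1 << bits) - 1
    count = round(voltage / v_ref * full_scale)
    return max(0, min(full_scale, count))


def pack_reading(node_id, ph_voltage, temperature_c):
    """Pack one node's reading into a 5-byte application payload."""
    return struct.pack(PAYLOAD_FORMAT, node_id, adc_count(ph_voltage),
                       round(temperature_c * 100))


def unpack_reading(payload):
    """Inverse of pack_reading, as the gateway side might apply it."""
    node_id, count, centi_c = struct.unpack(PAYLOAD_FORMAT, payload)
    return node_id, count, centi_c / 100
```

Keeping the payload to a few bytes fits the small-data, low-power argument made for 6LoWPAN in Section III; a real deployment would choose its own field layout.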
The integrated system performs data collection and control, and also displays the climate variable values and statistics to the user on the screen.

#### V. ARCHITECTURE

Five nodes can be placed in the farm. The range of each node depends on the quality of the nodes purchased. Each node measures the variables and communicates directly with the gateway node. Each node wakes for 45 seconds and then goes back to sleep for 10 minutes 15 seconds. At any time, only one node reads data from its sensors and awaits a data request from the gateway node. All communication is routed through the gateway node.

Fig.2 Sensors installed in the crop field

Node 1 is placed in the top left corner of the field, Node 2 in the bottom left corner, Node 3 in the top right corner, Node 4 in the bottom right corner, and Node 5 in the center. The devices used are based on the 6LoWPAN protocol, which enables the transmission of compressed Internet Protocol Version 6 (IPv6) packets over IEEE 802.15.4 networks. The 6LoWPAN format defines how IPv6 communication is carried in 802.15.4 frames and specifies the adaptation layer's key elements.

Fig.3 Star topology WSN

A star topology WSN can be used in our system. The star network is connected to the base station. The gateway node acts as the coordinator and receives the measured data from the sensor nodes. An integrated system is to be connected to the gateway node to collect and analyze the data.

#### VI. PROCESS IN INTEGRATED SYSTEM

Fig.4 Integrated System

The values of both temperature and pH are displayed on this device. The temperature values T1, T2, T3, T4 and T5 are compared, and the highest, lowest and average values are displayed. Similarly, the pH values P1, P2, P3, P4 and P5 are compared, and the highest, lowest and average values are displayed.
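As a minimal sketch of this processing step, the following Python fragment computes the highest, lowest and average value for one parameter and lists the nodes whose reading lies outside a reference range of the kind the farmer sets in Section IV. The function names, the sample readings and the 20 to 30 degree range are illustrative assumptions, not values from the paper.

```python
def summarize(readings):
    """Return (highest, lowest, average) for one parameter across all nodes."""
    if not readings:
        raise ValueError("at least one reading is required")
    return max(readings), min(readings), sum(readings) / len(readings)


def out_of_range(readings, low, high):
    """Return the 1-based node numbers whose reading lies outside [low, high]."""
    return [node for node, value in enumerate(readings, start=1)
            if not low <= value <= high]


# Illustrative values only; they are not measurements from the paper.
temperatures = [24.0, 26.0, 31.0, 25.0, 24.0]   # T1..T5 in degrees Celsius
highest, lowest, average = summarize(temperatures)
print(highest, lowest, average)
print(out_of_range(temperatures, low=20.0, high=30.0))
```

The same two functions would apply unchanged to the pH values P1 to P5 with a crop-specific pH range.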
Fig.5 Temperature Sensor Values comparison with Temperature Threshold range

The temperature values T1, T2, T3, T4 and T5 are compared with the threshold range for the particular crop. If the values of T1, T2, T3, T4 and T5 are greater than the threshold value, the alarm rings and the red light comes on. This alerts the farmer, who can turn the knob to switch on the motor. Whenever the condition is not met, the alarm rings and the red light comes on; the farmer then takes the necessary action.

Fig.6 Pressure Sensor Values comparison with Pressure Threshold range

The pH values P1, P2, P3, P4 and P5 are compared with the threshold range for the particular crop. If the values of P1, P2, P3, P4 and P5 are less than the threshold value, the alarm rings and the red light comes on. This alerts the farmer, who can turn the knob to spray fertilizers. Whenever the condition is not met, the alarm rings and the red light comes on; the farmer then takes the necessary action.

#### VII. HARDWARE AND SOFTWARE

Hardware is used at the sensor nodes and at the base station. The Advanticsys CM5000 can be used as the sensor node. The base station consists of a processor containing an integrated system with an LCD display. The CM5000 mote is an IEEE 802.15.4 compliant wireless sensor node based on the original open-source "TelosB" platform design developed and published by the University of California, Berkeley. The mote has the following general characteristics: IEEE 802.15.4 WSN platform, TI MSP430F1611 microcontroller, TI CC2420 radio transceiver, TinyOS 2.x compatibility, temperature and humidity sensors, user and reset buttons, 3 LEDs, USB interface and 2xAA battery holder.

Fig.7 Sensor Node

A microcontroller of the MSP430 family can be used.
The MSP430F1611 is a microcontroller with two built-in 16-bit timers, a fast 12-bit A/D converter, a dual 12-bit D/A converter, two universal serial synchronous/asynchronous communication interfaces (USART), DMA, and 48 I/O pins. The CC2420 can be used as the transceiver. Its features are: a true single-chip 2.4 GHz IEEE 802.15.4 compliant RF transceiver with baseband modem and MAC support; a DSSS baseband modem with 2 MChips/s and 250 kbps effective data rate; suitability for both RFD and FFD operation; low current consumption (RX: 19.7 mA, TX: 17.4 mA); low supply voltage (2.1 - 3.6 V) with integrated voltage regulator; programmable output power; and no need for an external RF switch or filter.

#### **Sensor Boards**

On the sensor board, the sensors are soldered to the board and equipped with all necessary resistors, capacitors and an operational amplifier. The board plugs into the CM5000 through the I/O pins.

#### **Temperature Sensor**

The SHT75 digital humidity and temperature sensor is the high-quality version of the pin-type humidity sensor series, with cutting-edge accuracy. It is a single-chip temperature and humidity sensor. Its features are ultra-low power consumption, automatic power-down, full calibration, digital output with superior signal quality, small size, and a temperature range of -40 to 125 °C.

#### pH Sensor

We can use the Vernier pH Sensor, PH-BTA. It is a sealed, gel-filled, epoxy-body Ag/AgCl electrode. Its response time is 90% of the final reading in 1 second, its temperature range is 5 to 80 °C, its pH range is 0-14, its resolution is 0.005 pH units, its isopotential pH is pH 7 (the point at which temperature has no effect), and its output is 59.2 mV/pH at 25 °C.

#### **Gateway Node**

The gateway node is similar to the sensor node except that it has no sensor board. It comprises a CM5000 sensor node and a USB serial adapter board. The adapter board is used to connect to the base station through a USB cable.
Its functions are to maintain overall network knowledge and to serve as the coordinator and router node through which all data to and from the sensor nodes passes. It requires memory and computing power.

#### **Base Station**

The base station is made up of the integrated system, which collects data from the sensor nodes through the gateway node. It performs the control calculations and displays the temperature and pH values on the LCD screen for the user. The temperature and pH values are compared, and the integrated system processes new values for temperature and pH every 10 minutes and 15 seconds. The farmer determines how long each actuator is turned on.

#### **Operating System: Tiny OS**

TinyOS is a free and open source component-based operating system and platform targeting wireless sensor networks (WSNs). TinyOS is an embedded operating system written in the nesC programming language as a set of cooperating tasks and processes. It is intended to be incorporated into smartdust. TinyOS started as a collaboration between the University of California, Berkeley, Intel Research and Crossbow Technology.

#### VIII. CONCLUSION AND FUTURE DEVELOPMENT

We are currently planning to implement the system described in this prospective paper and to reduce its cost so that it is feasible and affordable. Verification of data authenticity and detection of node failures can also be added. Future developments can be made by monitoring and controlling other parameters such as humidity and atmospheric pressure. A means of spraying pesticides can also be incorporated. Because only a small amount of data is transmitted, 2G can be used where 3G is not feasible or is too expensive. Alert messages can also be sent to the farmer by SMS over GSM, or through a voice call to the farmer's mobile phone. The motor can also be switched on using wireless technology instead of wired electric lines.
Remote monitoring using Wireless Sensor Networks saves farmers the time and energy otherwise spent walking to the fields. It enables farmers to grow more crops and to monitor their growth with ease. This leads to higher productivity, which profits the farmer and, in turn, society.

### Design and Implementation of Multi Access Controller for Multi core Processors

Deepa. M.G [1], Anitha devi .M.D [2]

Dept of Electronics & communication Engineering, SSIT, Tumkur
E-mail: [1] deepaec.mg@gmail.com [2] anithamd\_sonu@yahoo.co.in

Abstract - Multi-core processors are about to conquer embedded systems: the question is not whether they are coming but how the architectures of the microcontrollers should look with respect to the strict requirements in the field. We present the step from one to multiple cores in this paper, establishing coherence and consistency for different types of shared memory by hardware means. Support for point-to-point synchronization between the processor cores is also realized by implementing different hardware barriers. The practical examinations focus on the logical first step from single- to dual-core systems, using an FPGA development board with two hard PowerPC processor cores. Best- and worst-case results, together with intensive benchmarking of all synchronization primitives implemented, show the expected superiority of the hardware solutions. It is also shown that dual-ported memory outperforms single-ported memory if the multiple cores use inherent parallelism by locking shared memory more intelligently using an address-sensitive method.

Keywords- On chip synchronization, Hardware approach, MACtrl.

#### I. INTRODUCTION

Electronic architectures face a continuous increase in functionality which requires additional memory and computational power.
Due to the stringent environmental conditions to be fulfilled, increasing the core frequency of single-core embedded processors is much more limited than in other fields. As the road maps of leading semiconductor companies indicate, multi-core alternatives for embedded applications are about to be introduced [2]. Parallelism is a well-known topic in computer science, and there are enough examples showing that parallelism, for all its undeniable benefit, introduces new kinds of problems which, unfortunately, are mostly nontrivial. However, with ever increasing miniaturization, introducing parallelism is a natural next step in the evolution of any microprocessor architecture (e.g. the UltraSPARC I and the dual-core 64b UltraSPARC [3]). Conventional parallel architectures have to be judged with respect to their applicability. Since the issue here also concerns safety-critical systems (especially automotive and aeronautics), cost will not, or should not, be the sole criterion in evaluating promising solutions. Changing from a single to multiple processor cores is not without pitfalls and requires prudence. Synchronization is the main topic that must be addressed. Data synchronization prevents data from being invalidated by parallel access, whereas event synchronization coordinates concurrent execution. One common mechanism to achieve data synchronization is a lock. Event synchronization forces processes to join at a certain point of execution. Barriers can be used to separate distinct phases of computation and are normally implemented without special hardware using locks and shared memory [4]. A participating process enters the barrier and waits for the other processes, and then all processes leave the barrier together. Waiting can be either busy-waiting or blocking; locking by busy-waiting is generally not the preferred technique. This common belief has recently been challenged [5], though for database systems rather than embedded systems.
In [6] synchronization primitives are analyzed regarding the energy consumption of busy-waiting vs. blocking methods. In this paper, blocking hardware solutions ensuring synchronization for multiple processor cores are presented and compared to pure spinning software solutions.

#### II. RELATED WORK

Locks are implemented in hardware in the CRAY X-MP [7]: a limited set of lock registers is shared by the processors and assigned to certain processes by the operating system. In [8] the architecture of an early RISC-based multiprocessor is described. Each processor has a fixed number of channels to send data to the other processors; some bytes can be sent on a channel without blocking the sending processor. Here, compiler support is needed to coordinate the execution of the processes on the different processors. The synchronization primitives locks, barriers and lock-free data structures are the focus of attention in [9]. The classical implementations of those primitives are compared against hybrid synchronization primitives that use hardware support and the caches to improve efficiency and scalability, yielding promising results that seem to justify hardware acceleration. In the specialized multi-core architecture described in [10], a DSP, a RISC and a VLIW core are connected by a 64-bit AMBA AHB bus. For fast synchronization, each pair of the three cores shares on-chip dual-ported memory. Caching is done not for the on-chip memory but for the off-chip memory (SDRAM). The work in [10] is relevant to the hardware configurations used and described in this paper. An analysis of how to provide efficient barrier synchronization on a shared-memory multiprocessor with a shared multi-access bus interconnection has also been described. An innovative, perhaps unorthodox, alternative to ordinary barriers forces a thread to wait by continuously invalidating the respective instruction cache.
An example of global event synchronization across parallel processors using a barrier support library has also been given; there, a compiler is needed to produce the parallelized binary code. Besides the performance overhead due to waiting, barriers also have significant power consumption as a disadvantage. Different barrier implementations for many-core architectures are analyzed in terms of efficiency and scalability in [7], showing that the scaling behaviour of actual hardware implementations can differ from the expected scaling behaviour. Directly related to the topic of this paper is the work done in [8]; the on-chip global synchronization unit presented there shares ideas with the work presented in this paper. In [8] this synchronization unit is only simulated, whereas in our paper we actually implemented such an on-chip synchronization aid in hardware as well. On the other hand, in [9] hardware implementations of basic synchronization mechanisms are described. This was done here as well for comparison, before designing and implementing the synchronization unit which is the main part of this work. The mechanisms realized in [10] build on vendor-specific bus systems and lack the advantage of the direct on-chip synchronization realized here. Similar system architectures based on FPGAs are discussed in [10] with respect to accelerating data processing.

#### III. HARDWARE SYNCHRONIZATION

A hardware environment based on hard-wired processor cores and on-chip shared memory is the foundation for the implementation and practical verification of the concepts presented in this paper. The main focus is on how to achieve reliable communication between the processor cores using the on-chip shared memory.

Fig. 1. Scheme of an efficient race for access

#### A. Problem

Synchronization between an arbitrary number of cores in their access to the shared memory is necessary.
An efficient mechanism to resolve arbitrary concurrent requests for our critical resource should fulfill the following demands:

- Efficiency:
  - only cores actually competing for access attend the race and can become its winner
  - the race itself must not consume much time; ideally the race elects one winner per cycle
- Fairness:
  - no competitor waits indefinitely to get access
  - each competitor is served in a finite time
  - ideally the worst-case waiting time is bounded only by the number of processor cores

All requirements are met by the synchronization mechanism developed and described in the following.

#### **B.** Concept

In order to fulfil the efficiency and fairness demands, a synchronization mechanism has been developed. A simple round robin scheme cycling through all available processor cores would require minimal resources for implementation but would be very inefficient when only a few cores want to access the shared memory. As shown in Fig. 1, only the cores which really need access are considered by our mechanism. Processor cores that have to wait are blocked until it is their turn. The worst case occurs when all the available cores want access to the shared memory simultaneously, resulting in different waiting times. To avoid any processor core being favoured or discriminated against, a dynamic priority scheme is used to choose the access order. A global locking scheme that makes not just single but also multiple accesses to the shared memory atomic is provided as well. Global locking is offered by a global locking bit that is shared by all processor cores in the system.

Fig. 2. Parallel access to different memory-regions by address-sensitivity

Locking the whole shared memory when accessing only a small area of it is not very efficient. Therefore, a special form of address-sensitive locking that allows the locking of individual blocks instead of the whole shared memory has been developed and implemented.
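The idea behind address-sensitive locking can be sketched as an interval-overlap check: a block is granted only if it does not overlap a block held by another core. The following Python fragment is a behavioural model under that assumption, not the MACtrl hardware logic; the class name, method names and the inclusive address bounds are hypothetical.

```python
class AddressLockTable:
    """Software model of per-core block locks (all names are hypothetical)."""

    def __init__(self):
        self._locks = {}  # core id -> (lower address, upper address), inclusive

    def try_lock(self, core, lower, upper):
        """Grant the block to `core` unless it overlaps another core's block."""
        if lower > upper:
            raise ValueError("lower address must not exceed upper address")
        for owner, (lo, hi) in self._locks.items():
            if owner != core and lower <= hi and lo <= upper:
                return False  # overlap: the requesting core would have to wait
        self._locks[core] = (lower, upper)
        return True

    def unlock(self, core):
        """Release the block held by `core`, if any."""
        self._locks.pop(core, None)
```

Two cores that lock disjoint blocks both succeed and can proceed in parallel, while a request that overlaps another core's block is refused until that block is released.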
This enables concurrent read and write access to regions of shared memory, as is demonstrated for two cores in Fig. 2. For event synchronization, simple barriers using fixed configuration bit patterns and more flexible complex barriers have been developed; the latter allow multiple barriers to be specified for different subsets of the available processor cores.

#### C. Realization

The Multi-Access Controller (MACtrl) under development consists of core-side and inter-core logic, as shown in Figure 3.

Fig. 3. Multi access- controller (MACtrl), abstraction

A fully generic design of the MACtrl will be developed in a hardware description language (Verilog HDL) in order to allow easy scaling in terms of the number of processor cores. The goal of keeping the design as compact as possible is achieved by a code-optimized algorithm that selects the next core allowed to access the shared memory in case of concurrent requests. In order to execute multiple accesses atomically, global locking is also implemented using a variation of the algorithm. Address-sensitive locking is activated as soon as an upper and a lower address of a memory block are loaded into the respective registers of the MACtrl. The controller then tries to lock the block of memory. By their very nature, global locking and address-sensitive locking are implemented to be mutually exclusive: the two access methods cannot be used simultaneously to access the shared memory. Simple barriers are the easiest method to achieve efficient point-to-point synchronization: each core that wants to meet at a given point of execution writes an arbitrary value to a dedicated barrier register of the MACtrl. The respective processor core is then blocked until at least one other core writes to its corresponding counterpart register. For extended simple barriers, a bit pattern written to the register specifies the other cores to wait for. Each bit in the register corresponds to the fixed number of a processor core in the system. The drawback is that the numbers of the cores must be known at compile-time.
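The bit-pattern barrier described above can be modelled in a few lines. The sketch below is a behavioural illustration only: it assumes one bit per core number and a barrier that releases once every core named in the pattern has arrived, and the class and method names are hypothetical rather than the MACtrl register interface.

```python
class BitmaskBarrier:
    """Behavioural model of a barrier configured by a bit pattern of core numbers."""

    def __init__(self, participants_mask):
        if participants_mask <= 0:
            raise ValueError("at least one participating core is required")
        self.participants = participants_mask  # bit i set -> core i takes part
        self.arrived = 0

    def arrive(self, core):
        """Record the arrival of `core`; return True if this arrival releases the barrier."""
        bit = 1 << core
        if not self.participants & bit:
            raise ValueError(f"core {core} is not part of this barrier")
        self.arrived |= bit
        if self.arrived == self.participants:
            self.arrived = 0  # reset so the barrier can be reused
            return True
        return False
```

For example, a barrier built with the pattern `0b1011` waits for cores 0, 1 and 3; because the pattern names specific cores, it has to be fixed when the program is written, which is the compile-time drawback noted above.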
A more flexible replacement for the extended simple barriers are complex barriers. They allow a more abstract level of programming, further removed from the hardware: the numbers of the processor cores need not be known in advance, only the number of other cores to wait for, making it irrelevant on which processor core the program will eventually be executed.

#### IV. IMPLEMENTATION

All the hardware synchronization methods described in the previous section were successfully simulated for four processor cores, and some of them also for eight. From the outset, no assumption was made that limits the number of cores. All implemented types of synchronization mechanisms will be tested. As the most important case, the worst case of incessant access to the shared memory was simulated with the help of assembler routines. The core frequency of the processor cores was 100 MHz when the following tests were conducted.

Fig. 5. Performance of single accesses to shared memory over all methods

#### **B. RESULTS**

Figure 5 and Table I clearly show the superiority of the MACtrl with implicit and explicit locking over the poorly performing spinning software locking methods. In total, 10 million successive single or paired accesses (read/write pairs) to the same word in shared memory are executed by both processor cores together, resulting in countless conflicts between them. To test address-sensitive locking, an artificial scenario with overlapping memory blocks was constructed. Each processor core reserves a memory block, accesses it and then relocates it by an arbitrary offset. Relocation is done in opposite directions; hence conflicts between the processor cores occur. The promising results are presented in Table II.
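To make the comparison concrete, the paired-access timings of Table I can be expressed relative to the MACtrl. The sketch below copies the millisecond values from Table I and assumes that the first value column is the run without contention and the second the run with massive contention; that reading of the table header, and the function name, are assumptions.

```python
# Timings in ms copied from Table I: (no contention, massive contention).
# The assignment of the two columns is an assumed reading of the table header.
TABLE_I_MS = {
    "MACtrl": (1030, 1054),
    "Spinning lock in on-chip memory": (2734, 4426),
    "Spinning PLB-lock": (2890, 5512),
    "Spinning OPB-lock": (3681, 7071),
    "Shared SDRAM, spinning PLB-lock": (4058, 8116),
    "Shared SDRAM, spinning OPB-lock": (4874, 8627),
}


def slowdown_vs(baseline, rows=TABLE_I_MS):
    """Return each method's run time as a multiple of the baseline's, per column."""
    base_no, base_massive = rows[baseline]
    return {name: (round(no / base_no, 2), round(massive / base_massive, 2))
            for name, (no, massive) in rows.items()}


for name, ratios in slowdown_vs("MACtrl").items():
    print(f"{name}: {ratios[0]}x / {ratios[1]}x")
```

Under this reading, the spinning lock in on-chip memory takes roughly 2.7 times as long as the MACtrl without contention and roughly 4.2 times as long under massive contention, while the MACtrl's own run time changes by only about 2% between the two columns.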
| Synchronization Details ([access-target], [locking details]) | No Contention (ms) | Massive Contention (ms) |
|--------------------------------------|-------|-------|
| MACtrl | 1030 | 1054 |
| Spinning lock in on-chip memory | 2734 | 4426 |
| Spinning PLB-lock | 2890 | 5512 |
| Spinning OPB-lock | 3681 | 7071 |
| Shared SDRAM, spinning PLB-lock | 4058 | 8116 |
| Shared SDRAM, spinning OPB-lock | 4874 | 8627 |

Table I Paired Accesses, Exact Values With Lock-Details

| Synchronization Details ([access-target], [locking details]) | No Contention (ms) | Massive Contention (ms) |
|--------------------------------------|-------|-------|
| MACtrl address lock | 12144 | 12767 |
| MACtrl global | 13628 | 27496 |
| Spinning lock in on-chip memory | 12186 | 24608 |
| Spinning PLB-lock | 12707 | 24654 |
| Spinning OPB-lock | 12722 | 24679 |
| Shared SDRAM, spinning PLB-lock | 22215 | 46001 |
| Shared SDRAM, spinning OPB-lock | 22232 | 47241 |

Table II Block Accesses To Different Regions Of Memory

#### V. CONCLUSION

The problem of synchronization in multi-core systems with shared memory demands efficient and reliable solutions, in particular for embedded systems. One approach is to integrate the synchronization mechanisms, which are normally based on locks, into the on-chip hardware. This guarantees fairness and stability, avoiding poor performance under load, starvation and even deadlock. Specializing the multi-core approach to a dual-core PowerPC system showed the clear superiority of the hardware solutions over software solutions.

#### REFERENCES

- [1] K. Asanovic et al., "A view of the parallel computing landscape," in Communications of the ACM, Vol. 52, No. 10. ACM Press, October 2009, pp. 56–67.
- [2] D. McGrath, "Intel rolls quad-core CPUs for embedded computing." EE Times, April 2007. [Online]. Available: http://www.eetimes.com
- [3] T.
Takayanagi et al., "A dual-core 64b UltraSPARC Microprocessor for Dense Server Applications," Sun Microsystems, Sunnyvale, USA, 2004. - [4] D. E. Culler and J. P. Singh, "Parallel Computer Architecture, a hw/sw approach." Morgan Kaufmann Publishers, Inc., Editorial and Sales Office, San Francisco, U. S. A., 1999. - [5] R. Johnson et al., "A New Look at the Roles of Spinning and Blocking," in Proceedings of the Fifth International Workshop on Data Management on New Hardware, Providence, Rhode Island. ACM Press, June 2009. - [6] C. Ferri, I. Bahar, M. Loghi, and M. Poncino, "Energy-optimal synchronization primitives for single-chip multi-processors," in GLSVLSI'09, Boston, Massachusetts. ACM Press, May 2009, pp. 141–144. - [7] J. Sampson et al., "Fast synchronization for chip multiprocessors," in ACM SIGARCH Computer Architecture News, Vol. 33, UCSD, UPC Barcelona, Palo Alto, California, 2005. - [8] O. Vila, G. Palermo, and C. Silvano, "Efficiency and scalability of barrier architectures," in CASES'08, Atlanta, Georgia, USA. ACM Press, October 2007, pp. 81–89. - [9] E. W. Lynch and G. F. Riley, "Hardware supported time synchronization in multi-core architectures," in ACM/IEEE/SCS 23rd workshop on principles of advanced and distributed simulation. IEEE Press, 2009. - [10] A. Tumeo et al., "HW/SW methodologies for synchronization in FPGA multiprocessors," in FPGA'09, Monterey, California, USA. IEEE Press, 2009. # Internet Access in Remote Areas Using NXP Microcontroller and GSM Modem #### Shahid M. Zubair<sup>1</sup>, Nikhila Ramesh<sup>2</sup>, Divyashree G<sup>3</sup> Dept. of Telecommunication, Atria Institute of Technology ASKB Complex, Anandnagar, Bangalore – 560024, Karnataka, India E-mail: 1shahidmzubair@gmail.com, 2nikhilaramesh08@gmail.com, 3divyashree730@gmail.com Abstract - In India and many developing and under-developed countries, providing internet facility to many school-going students is still a far-fetched concept. 
Also, in places like DRDO, DRDL and ISRO campuses, where internet is blocked for security reasons, the lack of internet causes many problems even for those who have blocked it. In places where natural calamities have struck, or in war zones where the attacker has taken down the internet infrastructure, contacting loved ones through social forums or just accessing basic internet facilities is impossible. In this paper, which is based on our project, we propose a new system which circumvents the internet architecture by using GSM infrastructure to access internet services, albeit on a reduced scale. This can be done by using a GSM modem interfaced to a microcontroller (an NXP device in this case). The command sets of each service offered in the system are integrated in the source code. The GSM modem is controlled through AT commands; it sends SMS messages to access internet services and receives information in the form of SMS messages. All subsystems in the module are interfaced to the microcontroller.

Keywords- GSM Modem; NXP Microcontroller; AT Commands; GSM Architecture.

#### I. INTRODUCTION

Internet usage today has grown by leaps and bounds. The rise of social networks has fuelled its growth in recent times. The need for communication and the hunger for knowledge have stimulated this rise. With the advent of 3G and 4G technologies, internet access has become fast and widespread. New wireless standards have given Wi-Fi users very high speeds and long range. An estimated 2.28 billion people use the internet worldwide today. All these technologies, however, are useless if no internet service is available. No matter how advanced a device is or which generation it belongs to, it simply won't work if the internet architecture in an area breaks down or does not exist. If we venture out to rural areas in developing countries like India, chances are we won't be able to access even the most basic of internet facilities.
This is because there is no internet service provider (ISP) in these areas. The internet infrastructure does not exist in these areas because it is not considered profitable. The internet was a major driving force in the recent popular uprisings in the Middle East. Protests were organized via social networking sites like Facebook and Twitter. Thus, it is no surprise that during wars, the aggressor always looks to take out or cripple the internet infrastructure of the defender. In such cases, there is no way for the populace of the defender, both local and foreign, to be in touch with their families through the internet. As good as the internet can be in this modern age, it also poses risks when it comes to security and confidentiality. Due to its widespread use, attacks on unsuspecting users have become frequent. The victim, however, may not be just a passive user. Many governmental and military agencies have been victims of cyber attacks in the past. Some agencies, to prevent such attacks, are prepared to give up the benefits of the internet for the sake of security and block it on their premises. Such agencies include ISRO and DRDO. Such measures have, of late, reduced their efficiency, as there is no substitute for the services offered by the internet in terms of pure intellectual power. There can be many other scenarios where lack of internet access poses problems. Hence, there is a need to develop new technologies which can provide internet access in such remote areas. In our project, we attempt to create one such technology, which circumvents the entire internet infrastructure to access a limited number of internet services. There are three levels of internet access:

1. Full-internet, which can be accessed through laptops and PCs.
2. Mini-internet, which is accessed from tablets and smart phones.
3. Micro-internet, which is accessed using other mobile phones.
As we go from full to micro level internet, the scale of the devices reduces and the number of internet services offered also reduces. In our project, we introduce another level of internet, called Nano-internet, through which we offer a limited number of internet services on a reduced scale with respect to both size and cost.

#### II. SYSTEM ARCHITECTURE

The system being developed uses fairly low-cost components. The idea is to access internet services using a GSM modem through SMS messages. Figure 1 shows the block diagram of our system, which shall henceforth be called the 'Nano Internet System' or simply NIS. Our goal is to create a low-cost device which can access the internet in remote areas and provide a meaningful number of services.

The input and output devices are a 4x4 keypad and a 16x2 LCD display respectively. The service requests are handled by the microcontroller. The microcontroller used is the P89V51RD2, manufactured by NXP. We use a SIM300 GSM modem to send and receive text messages from the NIS. Since the GSM modem works on a TTL serial interface, we use a MAX232 IC, which converts signals from an RS-232 serial port to signals suitable for use in TTL-compatible digital logic circuits. The MAX232 is a dual driver/receiver and typically converts the RX, TX, CTS and RTS signals.

The 4x4 keypad is used to give input to the NIS. It consists of 16 keys. It is interfaced such that it works like the alphanumeric keypad of a mobile phone, i.e. pressing a key repeatedly in quick succession will input a different letter, number or symbol. Other features, like moving through the menu, have also been interfaced to the keypad.

The NIS uses a menu-based system. All the services offered in the system are displayed in the menu. The user selects the desired service and uses it. The menu is hard-wired. Hence, to change it, the entire source code has to be updated.
This is one of the disadvantages of the system: the menu cannot be changed in real time.

Figure 1. NIS block diagram

The internet interface is the source code. Each service offered has its own command set, which has to be integrated into the source code as a separate function. The integrated command set must be in the form of AT commands which are to be sent to the GSM modem. AT commands and their use in NIS are discussed in subsequent sections.

The source code is written in embedded C. Currently, testing and troubleshooting are being done using the Keil µVision and Flash Magic software. HyperTerminal is used to interface the GSM modem to the PC to test each service and the GSM modem routines.

#### A. Internet Services Offered

A number of internet services can be accessed by sending SMS. Figure 2 shows some of them. However, not all of them can be used without at least a partial use of the internet architecture. Also, some of them cannot be accessed in some countries. Thus, the services which can be provided in the NIS differ from area to area. Hence, the system must make optimum use of services provided by global service providers (SPs) like Google. The system we designed made use of a number of Google services provided by its Google Labs service. Now, many services which are provided by independent SPs are also provided in the NIS.

Figure 2. Some of the services offered in NIS

Each service has a command set of its own. This command set includes the form of the text message to be sent to perform the different functions provided by the service. It also includes the number to which the text message has to be sent. This number differs for different geographical areas. For example, the number used for the Google SMS Search service in India differs from the one used in the USA.
Thus, the NIS has to incorporate different source codes for different geographic locations and include an option in the menu which lets the user select the geographic location he/she is in. Alternatively, we can use just one source code for all regions by suitably altering the region-specific parameters based on the region selected by the user. In this way the memory requirements of the source code can be kept to a minimum, although memory is unlikely to pose a problem as the NXP P89V51RD2 has 64kB of flash memory and 1kB of RAM. Additional memory can be added by using an EPROM.

The command sets of each service must be incorporated in the source code in the form of AT commands, which are used to communicate with the GSM modem. AT commands are dealt with in the next section.

#### B. GSM Modem and AT Commands

TABLE I
SOME SMS AT COMMANDS USED IN NIS

| Command | Description |
|---------|-------------|
| AT | Check if serial interface and GSM modem are working. |
| ATE0 | Turn echo off, less traffic on serial line. |
| AT+CNMI | Display new incoming SMS. |
| AT+CPMS | Selection of SMS memory. |
| AT+CMGF | Select the input and output format of SMS messages. |
| AT+CMGR | Read new message from a given memory location. |
| AT+CMGS | Send message to a given recipient. |
| AT+CMGD | Delete message. |
| AT+CSMS | Select message service. |
| AT+CMGL | List messages. |
| AT+CMSS | Send message from storage. |
| AT+CMGW | Write message to memory. |

The GSM modem is operated by sending AT commands to it. AT stands for 'Attention'. They are also called Hayes AT commands because they were introduced by Hayes Communications[x]. There are various AT command sets available for different purposes. The command set differs for different modems. A command set is available for each modem.
The command set for a modem may include all or any combination of the call control, data card control, phone control, computer data card interface, service, network communication and SMS commands. For our purposes, we make use of the SMS commands along with some other basic commands.

Table I shows some of the basic SMS AT commands used in NIS. Each command has a different syntax. For example, to send a message, AT+CMGS is used with the syntax:

AT+CMGS = "<Number>" <Enter> <Service\_command><Ctrl+z><Enter>

On execution of this command, the service command entered is sent to the specified number. The service command can be any command included in the command set of a service, or it can be a text message which the user wants to send to a particular number.

**Figure 3.** Flow chart to send a message using the GSM modem

Figure 3 shows the flow chart to send a message. After each AT command is sent to the GSM modem, the microcontroller waits for the modem to send an acknowledgement.

The GSM modem can work in two modes:

- PDU mode
- Text mode

In the PDU mode, reading and sending SMS is done in a special encoded format. This compressed format saves message payload and is the default in most modems. However, the PDU mode supports very few SMS AT commands. Hence, in NIS we use the text mode, where reading and sending SMS is done in plain text.

#### C. NXP P89V51RD2 Microcontroller

The NIS device uses an NXP P89V51RD2 microcontroller. A key feature of the P89V51RD2 is its X2 mode option. The design engineer can choose to run the application with the conventional 80C51 clock rate (12 clocks per machine cycle) or select the X2 mode (6 clocks per machine cycle) to achieve twice the throughput at the same clock frequency. Since internet access is measured in terms of the speed with which it is accessed, we can use the X2 mode to decrease the send/fetch time of SMS in the system. Apart from this, the microcontroller has 64kB of flash memory and 1kB of RAM.
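Returning briefly to the AT+CMGS syntax and the acknowledgement wait described in Section II-B, the framing of a text-mode send request can be sketched in C as below. This is a host-side illustration under stated assumptions, not the NIS firmware: the function names, buffer handling and the use of `snprintf` are hypothetical, the message body is terminated with Ctrl+Z (0x1A) only, and the reply classifier simply looks for the common `OK` and `ERROR` result strings.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define NIS_CTRL_Z 0x1A  /* Ctrl+Z ends the message body in text mode */

/* Builds the two parts of a text-mode AT+CMGS exchange (hypothetical helper).
 * header receives: AT+CMGS="<number>" followed by carriage return (Enter)
 * body   receives: <service_command> followed by Ctrl+Z
 * Returns 0 on success, -1 if either buffer is too small. */
int nis_build_cmgs(const char *number, const char *service_command,
                   char *header, size_t header_size,
                   char *body, size_t body_size)
{
    int n;

    assert(number != NULL && service_command != NULL);

    n = snprintf(header, header_size, "AT+CMGS=\"%s\"\r", number);
    if (n < 0 || (size_t)n >= header_size)
        return -1;

    n = snprintf(body, body_size, "%s%c", service_command, NIS_CTRL_Z);
    if (n < 0 || (size_t)n >= body_size)
        return -1;

    return 0;
}

typedef enum { NIS_ACK_OK, NIS_ACK_ERROR, NIS_ACK_PENDING } nis_ack_t;

/* Classifies the text received so far from the modem.
 * PENDING means no final result has arrived yet, so the caller keeps waiting. */
nis_ack_t nis_classify_reply(const char *reply)
{
    assert(reply != NULL);
    if (strstr(reply, "ERROR") != NULL)
        return NIS_ACK_ERROR;
    if (strstr(reply, "OK") != NULL)
        return NIS_ACK_OK;
    return NIS_ACK_PENDING;
}
```

In a real build the header and body would be written to the serial port one after the other, with the microcontroller polling `nis_classify_reply` on the received bytes between the two steps.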
This will prove useful when a large number of services are provided to users in almost all regions. The In-System Programming (ISP) feature will also allow us to reprogram the system at any stage of its development under software control. Thus, the capability to update the source code makes a wide range of applications possible.

Figure 4. P89V51RD2 block diagram

The microcontroller used is also In-Application Programmable (IAP). This allows the flash program memory to be reconfigured even when the program is running, i.e. the program can be changed when the device is in the middle of an operation like sending an SMS.

As Figure 4 shows, the microcontroller provides three 16-bit timers/counters. Timers are critical in NIS. In a scenario where the GSM modem does not respond, we do not want the microcontroller to wait for a reply indefinitely. Hence, a timer is used to perform a time-out operation, after which the operation is retried or an error message is shown on the display. Thus, the microcontroller will never enter a state where it waits forever for an event to occur. The same can be done in other scenarios where a time-out operation is necessary.

The ports P0 to P2 are used to interface the different subsystems. The port P3 has pins for serial input and output, external interrupts and external data memory read/write strobes. It also has pins for the external count inputs of timers/counters 0 and 1.

#### III. WORKING OF NIS

Figure 5 shows the flow of data in the NIS. The 4x4 keypad is used to input data to the NIS. The input includes the selection of a service from the menu by the user. The menu is displayed on the 16x2 LCD display. It consists of a complete list of the services offered by the system. The services offered will differ for different regions. The input will also include a function of the service.
For example, the Calendar SMS service offers different functions, such as requesting the scheduled events for the current day or the next day, or requesting the next scheduled event. All the input data is received by the microcontroller which, using the source code, generates and sends a service request to the GSM modem.

In Figure 5, the service request generator/receiver is the microcontroller. The internet services interface is the source code, which holds the command sets of all the internet services offered in the device in the form of AT commands. When the service request is sent from the microcontroller to the GSM modem, it takes the form of an AT command that incorporates the service command. The GSM modem performs the desired function and sends an acknowledgement to the microcontroller.

The GSM modem receives data in the form of SMS messages and sends it to the microcontroller. The microcontroller then displays the received message on the 16x2 LCD display. But there is a problem in this operation. If we use services like SMS Search, the reply consists of not just the search result but also unwanted text like ads. This unwanted text can be present in any part of the SMS: the start, the end or somewhere in the middle. This calls for a modification in the source code to include a text processing function. This function should analyze the received text, detect the unwanted text and remove it before the message is displayed.

Figure 5. Flow of data in NIS

The fact that the content and position of the unwanted text in the received message are always the same for a particular service makes it a little easier to write such a text processing function. If the content of the unwanted text is always the same, a simple if-else instruction will form the text processing function. If the position of the unwanted text is always the same, the system can simply display only the rest of the message. The job becomes easier still if both the content and the position of the unwanted text are the same.
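The two cases described above (unwanted text with fixed content, and unwanted text at a fixed position) can be sketched in C as follows. This is a minimal illustration, not the NIS source code: the function names are hypothetical, and the unwanted string and the prefix/suffix lengths are assumed to be known in advance for each service.

```c
#include <assert.h>
#include <string.h>

/* Fixed content: removes every occurrence of a known unwanted string
 * from msg, in place. Returns the number of occurrences removed. */
int nis_strip_fixed_text(char *msg, const char *unwanted)
{
    size_t len;
    int removed = 0;
    char *hit;

    assert(msg != NULL && unwanted != NULL);

    len = strlen(unwanted);
    if (len == 0)
        return 0;

    while ((hit = strstr(msg, unwanted)) != NULL) {
        /* Shift the tail (including the terminating NUL) over the match. */
        memmove(hit, hit + len, strlen(hit + len) + 1);
        removed++;
    }
    return removed;
}

/* Fixed position: drops the first `prefix` and the last `suffix` characters.
 * Modifies msg in place and returns a pointer to the part worth displaying. */
const char *nis_trim_fixed_span(char *msg, size_t prefix, size_t suffix)
{
    size_t n;

    assert(msg != NULL);

    n = strlen(msg);
    if (prefix + suffix >= n) {
        msg[0] = '\0';          /* nothing useful is left */
        return msg;
    }
    msg[n - suffix] = '\0';
    return msg + prefix;
}
```

Both helpers work in place, which suits a device with only 1kB of RAM, since no second message buffer is needed before the text is written to the LCD.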
Some services do not require an input from the keypad. Their messages are received whenever the SP sends an SMS. An example is the Bangalore-specific service provided by the Bangalore Traffic Police. This service informs the user of the roads in the city which he/she should avoid due to traffic jams. The messages from this service are received at predetermined times or when an arterial road is jammed. On average, 4 messages are received per day. This is only an example, and there are many other services which fall under this category. The user will have an option to allow or block such messages, as there are many messages from the network provider which one may wish to avoid.

#### IV. RESULTS, CONCLUSION AND FUTURE ENHANCEMENTS

All the services are accessed by circumventing the internet architecture. The cost of access is virtually zero, as internet data rates are avoided. The only charges one may incur are those of the SMS service. If the user has an SMS pack, this cost is also removed. Thus, the only potential cost for the user would be the cost of the device itself, which, considering the low-cost components being used, will not be high. Thus, our goal of creating a low-cost system which can access internet services without directly accessing the internet is achieved.

Today's world requires technologies which enhance our experience of the web. Technologies like 4G and standards like the IEEE 802.11ac Wi-Fi protocol greatly expand the scope of the net. But what if the backbone on which these technologies rely, the internet infrastructure provided by ISPs, breaks down some day and pushes the world back to the Stone Age? There must be an end-of-the-world insurance policy that will keep the world from breaking apart during such an apocalyptic event. NIS is one such effort to provide this policy. Though not complete and in no way final, it is a step towards achieving that ultimate technology which can provide cheaper access to the internet for everyone.
The bubbles left by ISPs, where internet access is not possible, have to be filled, and NIS is the way forward for now.

It is obvious that this system is not complete. Thus, it has no future unless it is enhanced. Many enhancements have been discussed and adopted or dropped. The main let-down in this system is the display and the keypad, the two main subsystems with which the user interacts. In future, the 16x2 LCD display can be replaced by a 128x64 LCD display or even an LED display. This would pave the way for images and videos to be available in the system. Of course, there must be a service which offers them. It also requires much advancement in GSM technologies and policy changes (read SMS cap) in the way SMS messages are delivered.

Many services like sending e-mails, accessing Facebook and chatting on Gmail are on hold due to different problems which can only be overcome when there are massive policy changes in India. However, they can be easily accessed in other countries. Such hurdles, when overcome, will clear the road for many such interactive services to be used in the system, making NIS a truly cutting-edge system which provides the most popular web services without accessing the internet directly.

#### REFERENCES

- [1] Kenneth J. Ayala, *The 8051 microcontroller*, USA, WP Company, 1991.
- [2] I. Scott MacKenzie and Raphael C.W. Phan, *The 8051 microcontroller*, 3<sup>rd</sup> ed. USA, Prentice Hall, 1999.
- [3] Thomas W. Schultz, *C and the 8051*, 4<sup>th</sup> ed. USA, PageFree Publishing Inc, 2004.
- [4] Nicola Pero, *SMS Messaging Applications*, USA, O'Reilly Media, 2009.
- [5] Atmel Corporation, Application Note, ATMEL: AVR323 Interfacing GSM Modem [Online], Available: http://www.atmel.com/Images/doc8016.pdf
- [6] Developer's Home. Introduction to GSM / GPRS Wireless Modems [Online], Available: http://www.developershome.com/sms/GSMModemIntro.asp
- [7] Google. Google Services [Online], Available: http://www.google.com/mobile/sms/
- [8] Philips Semiconductors, P89V51RD2 Manual [Online], Available: http://www.keil.com/dd/docs/datashts/philips/p89v51rd2.pdf

# Implementation of An 4 Bit - ALU Using Low-Power And Area-Efficient Carry Select Adder

#### U.Sreenivasulu & T.Venkata Sridhar

ASCET, Gudur, Nellore, Andhra Pradesh, India

E-mail: upputuri.sreenivasulu@gmail.com & venkatasridhar.ece@audisankara.com

Abstract - In computing, an arithmetic logic unit (ALU) is a digital circuit that performs arithmetic and logical operations. The purpose of this work is to design, implement and experimentally check an Arithmetic Logic Unit (ALU) using a Low-Power and Area-Efficient Carry Select Adder. This work uses a simple and efficient gate-level modification to significantly reduce the area and power of the CSLA which would be incorporated in the ALU. The Carry Select Adder (CSLA) is one of the fastest adders used in many data-processing processors to perform fast arithmetic functions. Based on this modification, 8-, 16-, 32-, and 64-b square-root CSLA (SQRT CSLA) architectures have been developed and compared with the regular SQRT CSLA architecture. The proposed design has reduced area and power as compared with the regular SQRT CSLA, with only a slight increase in the delay. This work evaluates the performance of the proposed designs in terms of delay, area, power, and their products by hand with logical effort and through custom design and layout in 0.18-µm CMOS process technology. The results analysis shows that the proposed CSLA structure is better than the regular SQRT CSLA. By this analysis we show that the proposed CSLA structure is better in terms of area and power for the implementation of an ALU, which leads to better utilization of the processor.

Keywords- Arithmetic logic unit (ALU), Application-specific integrated circuit (ASIC), area-efficient CSLA, low power.

#### I.
INTRODUCTION

The ALU is a fundamental building block of the central processing unit (CPU) of a computer, and even the simplest microprocessors contain one for purposes such as maintaining timers. The processors found inside modern CPUs and graphics processing units accommodate very powerful and very complex ALUs; a single component may contain a number of ALUs. Most of a processor's operations are performed by one or more ALUs. An ALU loads data from input registers; an external Control Unit then tells the ALU what operation to perform on that data, and the ALU stores its result in an output register. The Control Unit is responsible for moving the processed data between these registers, the ALU and memory.

An ALU is a combinational circuit that performs arithmetic and logic operations on a pair of operands. The operations performed by an ALU are controlled by a set of function-select inputs. The functions performed by the ALU are given below.

Logical Operations:

- AND
- OR
- NOT
- XOR
- Left shift
- Right shift

Arithmetic Operations:

- Addition
- Subtraction
- Increment
- Decrement

A simple 4-bit ALU, the 74181, is shown below as a gate-level schematic (statistics: 14 inputs; 8 outputs; 61 gates). The 74181 can be modeled as shown below. Recognizing the logic that makes up a carry-lookahead (CLA) block (in this case, the circled elements in the gate-level schematic) is the key step in unraveling the secrets of the 74181.

Fig. 1. 74181 4-Bit ALU/Function Generator

The four boxed circuits in the gate-level schematic are represented above by the single module M1 with 4-bit I/O buses. The second quadruplicated circuit in the 74181 leads to the high-level module M2. The various XOR gates are also grouped into 4-bit word gates as indicated above. Further analysis shows that the 74181's original designers cleverly constructed the M1 and M2 logic so that, with input line M = 1, each setting of the S (function select) bus produces one of the 16 possible Boolean functions of the form F(A,B).
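As a behavioural illustration of the operations listed in this introduction, a 4-bit ALU can be sketched in C as below. This is only a software model under stated assumptions: the opcode names and their order are hypothetical (they are not the 74181's S/M select encoding), shifts move by one position, and `carry` is simply bit 4 of the intermediate result, so for subtraction and decrement a value of 1 means that no borrow occurred.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical function-select codes; not the 74181 encoding. */
typedef enum {
    ALU_AND, ALU_OR, ALU_NOT, ALU_XOR, ALU_SHL, ALU_SHR,
    ALU_ADD, ALU_SUB, ALU_INC, ALU_DEC
} alu_op_t;

typedef struct {
    uint8_t result;  /* 4-bit result in the low nibble */
    uint8_t carry;   /* bit 4 of the intermediate result */
} alu_out_t;

alu_out_t alu4(alu_op_t op, uint8_t a, uint8_t b)
{
    alu_out_t out = {0, 0};
    unsigned wide = 0;

    a &= 0x0F;  /* operands are 4 bits wide */
    b &= 0x0F;

    switch (op) {
    case ALU_AND: wide = a & b; break;
    case ALU_OR:  wide = a | b; break;
    case ALU_NOT: wide = ~(unsigned)a & 0x0Fu; break;
    case ALU_XOR: wide = a ^ b; break;
    case ALU_SHL: wide = (unsigned)a << 1; break;
    case ALU_SHR: wide = (unsigned)a >> 1; break;
    case ALU_ADD: wide = (unsigned)a + b; break;
    /* Two's complement subtraction: a + (~b) + 1 */
    case ALU_SUB: wide = (unsigned)a + (~(unsigned)b & 0x0Fu) + 1u; break;
    case ALU_INC: wide = (unsigned)a + 1u; break;
    /* Decrement as addition of 1111 (minus one in 4-bit two's complement) */
    case ALU_DEC: wide = (unsigned)a + 0x0Fu; break;
    default: assert(0 && "unknown opcode"); break;
    }

    out.result = (uint8_t)(wide & 0x0Fu);
    out.carry = (uint8_t)((wide >> 4) & 1u);
    return out;
}
```

The addition in the `ALU_ADD`, `ALU_SUB`, `ALU_INC` and `ALU_DEC` cases is the part of the datapath that an adder such as the CSLA would implement in hardware.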
Modern day circuit design is done mainly using hardware description languages (HDLs). Two popular HDLs are Verilog and VHDL. In this simulation we will use VHDL. VHDL can be written using either structural or behavioral descriptions. Once written, VHDL code can then be run through sophisticated CAD tools, such as a Xilinx tool, that will generate the actual low-level gates.

The design of area- and power-efficient high-speed data path logic systems is one of the most substantial areas of research in VLSI system design. In digital adders, the speed of addition is limited by the time required to propagate a carry through the adder. The sum for each bit position in an elementary adder is generated sequentially, only after the previous bit position has been summed and a carry propagated into the next position.

The CSLA is used in many computational systems to alleviate the problem of carry propagation delay by independently generating multiple carries and then selecting a carry to generate the sum [1]. However, the CSLA is not area efficient because it uses multiple pairs of Ripple Carry Adders (RCA) to generate the partial sum and carry by considering carry input Cin = 0 and Cin = 1; the final sum and carry are then selected by the multiplexers (mux).

Fig. 2. Delay and area evaluation of an XOR gate

Fig. 4. 4-b BEC with 8:4 mux.

The basic idea of this work is to use a Binary to Excess-1 Converter (BEC) instead of the RCA with Cin = 1 in the regular CSLA to achieve lower area and power consumption [2]–[4]. The main advantage of this BEC logic comes from its smaller number of logic gates compared with the n-bit Full Adder (FA) structure. The details of the BEC logic are discussed in Section III.

This paper is structured as follows. Section II deals with the delay and area evaluation methodology of the basic adder blocks. Section III presents the detailed structure and the function of the BEC logic.
The SQRT CSLA has been chosen for comparison with the proposed design as it has a more balanced delay, and requires lower power and area [5], [6]. The delay and area evaluation methodology of the regular and modified SQRT CSLA are presented in Sections IV and V, respectively. The ASIC implementation details and results are analyzed in Section VI. Finally, the work is concluded in Section VII.

#### II. DELAY AND AREA EVALUATION METHODOLOGY OF THE BASIC ADDER BLOCKS

Fig. 5. Regular 16-b SQRT CSLA

The AND, OR, and Inverter (AOI) implementation of an XOR gate is shown in Fig. 2. The gates between the dotted lines perform their operations in parallel, and the numeric representation of each gate indicates the delay contributed by that gate. The delay and area evaluation methodology considers all gates to be made up of AND, OR, and Inverter gates, each having delay equal to 1 unit and area equal to 1 unit. We then add up the number of gates in the longest path of a logic block that contributes to the maximum delay. The area evaluation is done by counting the total number of AOI gates required for each logic block. Based on this approach, the CSLA adder blocks of 2:1 mux, Half Adder (HA), and Full Adder (FA) are evaluated and listed in Table I.
TABLE I
DELAY AND AREA COUNT OF THE BASIC BLOCKS OF CSLA

| Adder blocks | Delay | Area |
|--------------|-------|------|
| XOR | 3 | 5 |
| 2:1 Mux | 3 | 4 |
| Half adder | 3 | 6 |
| Full adder | 6 | 13 |

TABLE II
FUNCTION TABLE OF THE 4-b BEC

| B[3:0] | X[3:0] |
|--------|--------|
| 0000 | 0001 |
| 0001 | 0010 |
| ⋮ | ⋮ |
| 1110 | 1111 |
| 1111 | 0000 |

#### III. BEC

As stated above, the main idea of this work is to use a BEC instead of the RCA with Cin = 1 in order to reduce the area and power consumption of the regular CSLA. To replace the n-bit RCA, an n+1-bit BEC is required. The structure and the function table of a 4-b BEC are shown in Fig. 3 and Table II, respectively.

Fig. 4 illustrates how the basic function of the CSLA is obtained by using the 4-bit BEC together with the mux. One input of the 8:4 mux is (B3, B2, B1, and B0) and the other input of the mux is the BEC output. This produces the two possible partial results in parallel, and the mux is used to select either the BEC output or the direct inputs according to the control signal Cin. The importance of the BEC logic stems from the large silicon area reduction when CSLAs with a large number of bits are designed. The Boolean expressions of the 4-bit BEC are listed below (note the functional symbols ~ NOT, & AND, ^ XOR).

X0 = ~B0

X1 = B0 ^ B1

X2 = B2 ^ (B0 & B1)

X3 = B3 ^ (B0 & B1 & B2)

Fig. 6. Delay and area evaluation of regular SQRT CSLA: (a) group2, (b) group3, (c) group4, and (d) group5. F is a Full Adder.

#### IV. DELAY AND AREA EVALUATION METHODOLOGY OF REGULAR 16-B SQRT CSLA

Fig. 7. Modified 16-b SQRT CSLA. The parallel RCA with Cin = 1 is replaced with BEC

The structure of the 16-b regular SQRT CSLA is shown in Fig. 5. It has five groups of different size RCA. The delay and area evaluation of each group are shown in Fig. 6, in which the numerals within [] specify the delay values, e.g., sum2 requires 10 gate delays.
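Before moving on to the group-by-group evaluation, the Boolean expressions of the 4-bit BEC given in Section III can be checked with a small C model. This is a software sketch for illustration, not part of the hardware design flow: the function names are hypothetical, and `bec4_mux` models only the selection behaviour described for Fig. 4 (pass B when Cin = 0, pass the BEC output when Cin = 1).

```c
#include <assert.h>
#include <stdint.h>

/* 4-bit Binary to Excess-1 Converter, written bit by bit from the
 * expressions in Section III (~ NOT, & AND, ^ XOR). */
uint8_t bec4(uint8_t b)
{
    unsigned b0 = b & 1u;
    unsigned b1 = (b >> 1) & 1u;
    unsigned b2 = (b >> 2) & 1u;
    unsigned b3 = (b >> 3) & 1u;

    unsigned x0 = b0 ^ 1u;              /* X0 = ~B0 */
    unsigned x1 = b0 ^ b1;              /* X1 = B0 ^ B1 */
    unsigned x2 = b2 ^ (b0 & b1);       /* X2 = B2 ^ (B0 & B1) */
    unsigned x3 = b3 ^ (b0 & b1 & b2);  /* X3 = B3 ^ (B0 & B1 & B2) */

    return (uint8_t)(x0 | (x1 << 1) | (x2 << 2) | (x3 << 3));
}

/* Selection by the control signal: the direct input for Cin = 0,
 * the BEC output (input plus one) for Cin = 1. */
uint8_t bec4_mux(uint8_t b, uint8_t cin)
{
    assert(cin <= 1u);
    return cin ? bec4(b) : (uint8_t)(b & 0x0Fu);
}

/* Exhaustive check that the gate-level form equals (B + 1) mod 16,
 * i.e. that it reproduces every row of the function table. */
int bec4_matches_increment(void)
{
    unsigned b;
    for (b = 0; b < 16; b++) {
        if (bec4((uint8_t)b) != ((b + 1u) & 0x0Fu))
            return 0;
    }
    return 1;
}
```

The exhaustive loop covers all 16 input values, so it also confirms the wrap-around row of Table II, where 1111 maps to 0000.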
The steps leading to the evaluation are as follows.

- 1) The group2 [see Fig. 6(a)] has two sets of 2-b RCA. Based on the consideration of delay values of Table I, the arrival time of selection input c1 [time (t) = 7] of the 6:3 mux is earlier than s3 [t = 8] and later than s2 [t = 6]. Thus, sum3 [t = 11] is the summation of s3 and mux [t = 3], and sum2 [t = 10] is the summation of c1 and mux.
- 2) Except for group2, the arrival time of the mux selection input is always greater than the arrival time of the data outputs from the RCAs. Thus, the delay of group3 to group5 is determined, respectively, as follows:

$$\{c6, \text{sum}[6{:}4]\} = c3\,[t = 10] + \text{mux}$$

$$\{c10, \text{sum}[10{:}7]\} = c6\,[t = 13] + \text{mux}$$

$$\{\text{cout}, \text{sum}[15{:}11]\} = c10\,[t = 16] + \text{mux}.$$

- 3) The one set of 2-b RCA in group2 has 2 FA for Cin = 1 and the other set has 1 FA and 1 HA for Cin = 0. Based on the area count of Table I, the total number of gate counts in group2 is determined as follows:

Gate count = 57 (FA + HA + Mux)

FA = 39 (3 * 13)

HA = 6 (1 * 6)

Mux = 12 (3 * 4).

- 4) Similarly, the estimated maximum delay and area of the other groups in the regular SQRT CSLA are evaluated and listed in Table III.

TABLE III
DELAY AND AREA COUNT OF REGULAR SQRT CSLA GROUPS

| Group | Delay | Area |
|--------|-------|------|
| Group2 | 11 | 57 |
| Group3 | 13 | 87 |
| Group4 | 16 | 117 |
| Group5 | 19 | 147 |

#### V. DELAY AND AREA EVALUATION METHODOLOGY OF MODIFIED 16-B SQRT CSLA

The structure of the proposed 16-b SQRT CSLA, using a BEC in place of the RCA with Cin = 1 to optimize the area and power, is shown in Fig. 7. We again split the structure into five groups. The delay and area estimation of each group are shown in Fig. 8. The steps leading to the evaluation are given here.

- 1) The group2 [see Fig. 8(a)] has one 2-b RCA which has 1 FA and 1 HA for Cin = 0. Instead of another 2-b RCA with Cin = 1, a 3-b BEC is used, which adds one to the output from the 2-b RCA. Based on the consideration of delay values of Table I, the arrival time of selection input c1 [time (t) = 7] of the 6:3 mux is earlier than s3 [t = 9] and c3 [t = 10] and later than s2 [t = 4]. Thus, sum3 and the final c3 (output from mux) depend on s3 and mux, and on the partial c3 (input to mux) and mux, respectively. The sum2 depends on c1 and mux.
- 2) For the remaining groups, the arrival time of the mux selection input is always greater than the arrival time of the data inputs from the BECs. Thus, the delay of the remaining groups depends on the arrival time of the mux selection input and the mux delay.
- 3) The area count of group2 is determined as follows:

Gate count = 43 (FA + HA + Mux + BEC)

FA = 13 (1 * 13)

HA = 6 (1 * 6)

AND = 1

NOT = 1

XOR = 10 (2 * 5)

Mux = 12 (3 * 4).

- 4) Similarly, the estimated maximum delay and area of the other groups of the modified SQRT CSLA are evaluated and listed in Table IV.

Comparing Tables III and IV, it is clear that the proposed modified SQRT CSLA saves 113 gate areas compared with the regular SQRT CSLA, with an increase of only 11 gate delays. To further evaluate the performance, we have resorted to ASIC implementation and simulation.

#### VI. ASIC IMPLEMENTATION RESULTS

The design proposed in this paper has been developed using Verilog-HDL and synthesized in Cadence RTL Compiler using typical libraries of TSMC 0.18 µm technology. The synthesized Verilog netlist and the respective design constraints file (SDC) are imported into Cadence SoC Encounter and are used to generate an automated layout from standard cells, with placement and routing [7]. Parasitic extraction is performed using Encounter's native RC extraction tool, and the extracted parasitic RC (SPEF format) is back-annotated to the Common Timing Engine in the Encounter platform for static timing analysis. For each word size of the adder, the same value changed dump (VCD) file is generated for all possible input conditions and imported into Cadence Encounter Power Analysis to perform the power simulations.
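As a quick arithmetic check of the gate-level comparison quoted earlier (57 and 43 gates for group2, 113 gates saved, 11 extra gate delays), the figures can be recomputed from the values in Tables I, III and IV. The C sketch below is only a bookkeeping aid with hypothetical names. It assumes that the quoted totals are plain sums over groups 2 to 5, which matches the published numbers but is an interpretation rather than something the text states.

```c
#include <assert.h>
#include <stddef.h>

/* Unit areas from Table I (each AND, OR and inverter counts as 1). */
enum { AREA_XOR = 5, AREA_MUX = 4, AREA_HA = 6, AREA_FA = 13 };

typedef struct { int delay; int area; } group_cost_t;

/* Groups 2 to 5, copied from Table III (regular) and Table IV (modified). */
static const group_cost_t REGULAR[4]  = { {11, 57}, {13, 87}, {16, 117}, {19, 147} };
static const group_cost_t MODIFIED[4] = { {13, 43}, {16, 61}, {19, 84}, {22, 107} };

/* Regular group2: 3 FA + 1 HA + a 6:3 mux built from three 2:1 muxes. */
int regular_group2_area(void)
{
    return 3 * AREA_FA + 1 * AREA_HA + 3 * AREA_MUX;
}

/* Modified group2: 1 FA + 1 HA + 3-b BEC (1 AND, 1 NOT, 2 XOR) + 6:3 mux. */
int modified_group2_area(void)
{
    return 1 * AREA_FA + 1 * AREA_HA + 1 + 1 + 2 * AREA_XOR + 3 * AREA_MUX;
}

/* Sums delay and area over n groups. */
group_cost_t total_cost(const group_cost_t *groups, size_t n)
{
    group_cost_t total = {0, 0};
    size_t i;

    assert(groups != NULL);
    for (i = 0; i < n; i++) {
        total.delay += groups[i].delay;
        total.area += groups[i].area;
    }
    return total;
}

/* Gate area saved by the modified design over groups 2 to 5. */
int area_saved(void)
{
    return total_cost(REGULAR, 4).area - total_cost(MODIFIED, 4).area;
}

/* Sum of the per-group delay increases of the modified design. */
int extra_delay(void)
{
    return total_cost(MODIFIED, 4).delay - total_cost(REGULAR, 4).delay;
}
```

Under that assumption the totals are 408 versus 295 gate areas and 59 versus 70 summed gate delays, which reproduces the quoted 113 and 11.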
A similar design flow is followed for both the regular and modified SQRT CSLA.

TABLE IV
DELAY AND AREA COUNT OF MODIFIED SQRT CSLA

| Group | Delay | Area |
|--------|-------|------|
| Group2 | 13 | 43 |
| Group3 | 16 | 61 |
| Group4 | 19 | 84 |
| Group5 | 22 | 107 |

Table V exhibits the simulation results of both CSLA structures in terms of delay, area and power. The area indicates the total cell area of the design, and the total power is the sum of the leakage power, internal power and switching power. The percentage reductions in the cell area, total power, power-delay product and area-delay product as functions of the bit size are shown in Fig. 9(a). Also plotted is the percentage delay overhead in Fig. 9(b). It is clear that the area of the 8-, 16-, 32-, and 64-b proposed SQRT CSLA is reduced by 9.7%, 15%, 16.7%, and 17.4%, respectively.

Fig. 8. Delay and area evaluation of modified SQRT CSLA: (a) group2, (b) group3, (c) group4, and (d) group5. HA is a Half Adder.

The total power consumed shows a similar trend of increasing reduction in power consumption (7.6%, 10.56%, 13.63%, and 15.46%) with the bit size. Interestingly, the delay overhead exhibits a decreasing trend with bit size. The delay overhead for the 8-, 16-, and 32-b is 14%, 9.8%, and 6.7% respectively, whereas for the 64-b it reduces to only 3.76%. The power-delay product of the proposed 8-b design is higher than that of the regular SQRT CSLA by 5.2%, and its area-delay product is also higher, by 2.9% (see Table V). However, the power-delay product of the proposed 16-b SQRT CSLA reduces by 1.76%, and for the 32-b and 64-b by as much as 8.18% and 12.28% respectively. Similarly, the area-delay product of the proposed design for 16-, 32-, and 64-b is reduced by 6.7%, 11%, and 14.4% respectively.

Fig. 9. (a) Percentage reduction in the cell area, total power, power-delay product, and area-delay product. (b) Percentage of delay overhead.

#### VII.
CONCLUSION

A simple approach is proposed in this paper to reduce the area and power of the SQRT CSLA architecture. The reduced number of gates in this work offers a great advantage in the reduction of area and also of total power. The compared results show that the modified SQRT CSLA has a slightly larger delay (only 3.76%), but the area and power of the 64-b modified SQRT CSLA are significantly reduced, by 17.4% and 15.4% respectively. The power-delay product and also the area-delay product of the proposed design show a decrease for 16-, 32-, and 64-b sizes, which indicates the success of the method and not a mere tradeoff of delay for power and area. The modified CSLA architecture is therefore low area, low power, simple and efficient for VLSI hardware implementation. It would be interesting to test the design of the modified 128-b SQRT CSLA.

TABLE V
COMPARISON OF THE REGULAR AND MODIFIED SQRT CSLA

| Word Size | Adder | Delay (ns) | Area (um²) | Leakage power (uW) | Switching power (uW) | Total power* (uW) | Power-Delay Product (10<sup>-15</sup>) | Area-Delay Product (10<sup>-21</sup>) |
|-----------|---------------|-------|------|-------|--------|--------|---------|---------|
| 8-bit | Regular CSLA | 1.719 | 991 | 0.007 | 101.9 | 203.9 | 350.5 | 1703.5 |
| | Modified CSLA | 1.958 | 895 | 0.006 | 94.2 | 188.4 | 368.8 | 1752.4 |
| 16-bit | Regular CSLA | 2.775 | 2272 | 0.017 | 263.7 | 527.5 | 1463.8 | 6304.8 |
| | Modified CSLA | 3.048 | 1929 | 0.013 | 235.9 | 471.8 | 1438.0 | 5879.6 |
| 32-bit | Regular CSLA | 5.137 | 4783 | 0.036 | 563.6 | 1127.3 | 5790.9 | 24570.2 |
| | Modified CSLA | 5.482 | 3985 | 0.027 | 484.9 | 969.9 | 5316.9 | 21845.7 |
| 64-bit | Regular CSLA | 9.174 | 9916 | 0.075 | 1212.4 | 2425.0 | 22246.9 | 90969.3 |
| | Modified CSLA | 9.519 | 8183 | 0.057 | 1025.0 | 2050.1 | 19514.9 | 77893.9 |

\*Total power = Leakage power + Internal power + Switching power

#### ACKNOWLEDGMENT

The authors would like to thank S. Sivanantham, P. MageshKannan, and S. Ravi of the VLSI Division, VIT University, Vellore, India, for their contributions to this work.

#### REFERENCES

- [1] O. J. Bedrij, "Carry-select adder," IRE Trans. Electron. Comput., pp. 340–344, 1962.
- [2] B. Ramkumar, H. M. Kittur, and P. M. Kannan, "ASIC implementation of modified faster carry save adder," Eur. J. Sci. Res., vol. 42, no. 1, pp. 53–58, 2010.
- [3] T. Y. Ceiang and M. J. Hsiao, "Carry-select adder using single ripple carry adder," Electron. Lett., vol. 34, no. 22, pp. 2101–2103, Oct. 1998.
- [4] Y. Kim and L.-S. Kim, "64-bit carry-select adder with reduced area," Electron. Lett., vol. 37, no. 10, pp. 614–615, May 2001.
- [5] J. M. Rabaey, Digital Integrated Circuits: A Design Perspective. Upper Saddle River, NJ: Prentice-Hall, 2001.
- [6] Y. He, C. H. Chang, and J. Gu, "An area efficient 64-bit square root carry-select adder for low power applications," in Proc. IEEE Int. Symp. Circuits Syst., 2005, vol. 4, pp. 4082–4085.
- [7] Cadence, "Encounter user guide," Version 6.2.4, March 2008.

## Design of An Efficient Reversible Logic Based Bidirectional Barrel Shifter

#### O. Anjaneyulu, T. Pradeep & C.V.Krishna Reddy<sup>2</sup>

KITS, Warangal, Andhra Pradesh, India

<sup>2</sup>KSN Inst. of Technology, Nellore, A.P, India

E-mail: anjaneyulu\_o@yahoo.com, Pradeep.thumma@yahoo.com, cvkreddy2@gmail.com

Abstract - Embedded digital signal processors and general purpose processors use barrel shifters to manipulate data. This paper presents the design of a barrel shifter that performs logical shift right, arithmetic shift right, rotate right, logical shift left, arithmetic shift left, and rotate left operations. The main objective of upcoming designs is to increase performance without a proportional increase in power consumption.
In this regard, reversible logic has become a popular technology in low power computing, optical computing, quantum computing and other computing technologies. Rotating and shifting data are required in many operations, such as logical and arithmetic operations, indexing and address decoding. Hence barrel shifters, which can shift and rotate multiple bits in a single cycle, have become a common design choice for high speed applications. The design uses reversible Fredkin and Feynman gates. The 2:1 MUX is implemented by a Fredkin gate, which reduces the quantum cost, the number of ancilla bits and the number of garbage outputs. The Feynman gate removes the fanout. The design is evaluated by comparing the quantum cost, the number of ancilla bits and the number of garbage outputs.

Keywords- barrel shifters, quantum cost, ancilla bits, Verilog.

#### I. INTRODUCTION

Rotating and shifting data is required in several applications, including variable-length coding, arithmetic operations, and bit-indexing. Consequently, barrel shifters, which are capable of shifting or rotating data in a single cycle, are commonly found in both digital signal processors and general purpose processors. In a reversible system, information is not erased. Thus a reversible gate has an equal number of inputs and outputs, which means that the input state can always be recovered from the output state. If a bit is erased in an irreversible circuit, it dissipates $kT\ln 2$ joules of heat energy, where $k$ is Boltzmann's constant and $T$ is the absolute temperature of the environment [4]. This $kT\ln 2$ dissipation does not occur if the operations are performed in a reversible manner using reversible logic circuits [3]. Based on this observation, Bennett [3] showed that for a reversible computer the heat dissipation is exactly $kT\ln 1$, which is zero. Thus reversible computation is a promising field for upcoming low power/high performance computing.
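To give a sense of scale for the $kT\ln 2$ bound quoted above, the following minimal Python sketch evaluates it numerically. The function name and the choice of 300 K as the ambient temperature are illustrative assumptions and do not come from the paper.

```python
import math

# Exact SI value of Boltzmann's constant, in joules per kelvin.
BOLTZMANN_J_PER_K = 1.380649e-23


def landauer_limit_joules(temperature_kelvin: float) -> float:
    """Heat dissipated when one bit is erased irreversibly: k * T * ln 2."""
    return BOLTZMANN_J_PER_K * temperature_kelvin * math.log(2)


# 300 K is an assumed, roughly room-temperature example value.
room_temperature_limit = landauer_limit_joules(300.0)
print(f"{room_temperature_limit:.3e} J per erased bit")  # prints 2.871e-21 J per erased bit
```

The per-bit figure is tiny, so the argument for reversible logic rests on the very large number of bit erasures performed per second in a dense circuit, not on any single erasure.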
Reversible logic also has applications in emerging nanotechnologies such as quantum dot cellular automata, quantum computing, optical computing and low power computing. The constraints involved in designing reversible circuits using reversible gates are:

- a. The fan-out of every signal is equal to one.
- b. Loops are not permitted in a strictly reversible system.

Data shifting and rotating, meanwhile, are frequently used in arithmetic operations, bit-indexing, variable-length coding and many other tasks. Reversible circuits carry an overhead in terms of the number of garbage outputs and the number of ancilla inputs. Outputs that perform no useful operation and exist only to maintain the reversibility of the circuit are termed garbage outputs, while an auxiliary constant input used to design a reversible circuit is called an ancilla input bit [17]. An (n,k) barrel shifter is a combinational circuit with n inputs and n outputs in which k select lines control the shift operation. The existing designs of reversible barrel shifters can only perform the left rotate operation [11], [15]. A reversible barrel shifter can shift and rotate multiple bits in a single cycle and is thus considerably faster than a reversible sequential shift register. This paper examines design alternatives for barrel shifters that perform the following operations: shift right logical, shift right arithmetic, rotate right, shift left logical, shift left arithmetic, and rotate left. In the proposed design the basic building blocks are reversible Fredkin and Feynman gates.

The structure of the paper is as follows: Section II provides the necessary background on reversible logic as well as the definitions of some commonly used reversible logic gates. Section III describes several barrel shifters. Section IV describes the proposed design of the reversible bidirectional barrel shifter.
In Section V, the performance analysis of the proposed shifter is presented. Lastly, the conclusions and further studies are discussed in Section VI.

#### II. REVERSIBLE GATES

A reversible gate is an n-input, n-output (denoted $n \times n$) circuit. To maintain reversibility, a gate often has to produce dummy output signals so that the number of outputs equals the number of inputs. These signals are commonly known as garbage outputs. For example, the Feynman gate used for a reversible Exclusive-OR operation produces an extra output along with its principal output to preserve reversibility. The quantum cost of a 3x3 reversible gate is the number of 1x1 and 2x2 reversible gates needed to design it. The quantum cost of every 1x1 and 2x2 reversible gate is taken as unity [18], [7], [2]. The 3x3 reversible gates are designed from the 1x1 NOT gate and from 2x2 reversible gates such as Controlled-V and Controlled-V+ (V is the square root of NOT and V+ is its Hermitian conjugate) and the Feynman gate, which is also known as the Controlled-NOT (CNOT) gate.

A NOT gate is a 1x1 gate, represented as shown in Fig. 1. Its quantum cost is unity since it is a 1x1 gate.

$$A \longrightarrow P = \overline{A}$$

Fig. 1. NOT gate

The input vector $I_v$ and output vector $O_v$ of the 2x2 **Feynman gate (FE)** are defined as $I_v = (A, B)$ and $O_v = (P = A,\ Q = A \oplus B)$. Feynman gates are typically used as copying gates: if $I_v = (A, B = 0)$ then $O_v = (P = A,\ Q = A)$. Fanout is not allowed in reversible logic, and the Feynman gate avoids the fanout problem because it can be used to copy a signal, as shown in Fig. 2(c).

Fig. 2. CNOT gate, its quantum implementation and its useful properties

The input and output vectors of the 3x3 Fredkin gate (FR) [1] are defined as $I_v = (A, B, C)$ and $O_v = (P = A,\ Q = A'B \oplus AC,\ R = A'C \oplus AB)$. Figure 3(a) shows the block diagram of a Fredkin gate.
A Fredkin gate can work as a 2:1 MUX, as it is able to swap its other two inputs depending on the value of its first input. The first input A works as the controlling input, while the inputs B and C work as controlled inputs, as shown in Fig. 3(a). Thus when A=0 the outputs Q and R are connected directly to the inputs B and C, and when A=1 the inputs B and C are swapped, giving Q=C and R=B. The quantum implementation of a Fredkin gate with a quantum cost of 5 is shown in Fig. 3(b) [7]. In Fig. 3(b) each dotted rectangle is equivalent to a 2x2 Feynman gate, and the quantum cost of each dotted rectangle is considered as 1 [18]. The same assumption is used for calculating the quantum cost of the Fredkin gate [7]. Thus, the quantum cost of the Fredkin gate is 5, as it consists of 2 dotted rectangles, 1 Controlled-V gate and 2 CNOT gates.

Fig. 3. Fredkin gate and its quantum implementation

#### III. BIDIRECTIONAL BARREL SHIFTER

A barrel shifter is a combinational circuit with n inputs, n outputs and k select lines that control the bit shift operation. A barrel shifter having n inputs and k select lines is called an (n,k) barrel shifter. A barrel shifter can be unidirectional, allowing data to be shifted or rotated only to the left (or only to the right), or bidirectional, allowing data to be rotated or shifted in both directions. The logarithmic barrel shifter is the most widely used among the different barrel shifter designs because of its simple design, smaller area and the elimination of the decoder circuitry. An n-bit logarithmic barrel shifter has a total of $\log_2(n)$ stages. Each stage determines whether or not to shift the input data. Stage j shifts its input by $2^j$ bit positions if the control bit $s_j$ (where $j = 0, 1, \ldots, \log_2(n)-1$) is set to 1; otherwise the input remains unchanged. The logarithmic shifter is more efficient in terms of design as well as area, but its delay cost is large [11].
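The gate definitions of Section II and the stage-by-stage behaviour of a logarithmic shifter can be sketched together in a few lines of Python. This is a behavioural model only, under stated assumptions: the function names are illustrative, bits are held MSB-first in a list, and the garbage outputs and Feynman copy gates of a real gate-level netlist are not tracked.

```python
def feynman(a: int, b: int) -> tuple:
    """2x2 Feynman (CNOT) gate: P = A, Q = A xor B."""
    return a, a ^ b


def fredkin(a: int, b: int, c: int) -> tuple:
    """3x3 Fredkin gate: P = A; B and C pass through if A = 0, swap if A = 1."""
    return (a, c, b) if a else (a, b, c)


def mux2(select: int, d0: int, d1: int) -> int:
    """2:1 multiplexer taken from the Q output of a Fredkin gate."""
    _, q, _ = fredkin(select, d0, d1)
    return q


def log_rotate_left(bits: list, select: list) -> list:
    """Logarithmic left rotator.

    bits   -- data word, MSB first
    select -- [s0, s1, ...]; stage j rotates by 2**j positions when s_j = 1
    """
    n = len(bits)
    for j, s in enumerate(select):
        step = 2 ** j
        # Each output bit chooses between its own input (no shift)
        # and the input `step` positions to its right (rotate left).
        bits = [mux2(s, bits[i], bits[(i + step) % n]) for i in range(n)]
    return bits


print(feynman(1, 0))                            # (1, 1): the signal is copied
print(log_rotate_left([1, 0, 0, 0], [1, 1]))    # rotate left by 3 -> [0, 1, 0, 0]
```

Because the Fredkin gate is its own inverse, applying `fredkin` twice returns the original inputs, which is a quick way to check that the model is reversible.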
This paper presents the design of a reversible bidirectional arithmetic and logical barrel shifter that can perform six operations: logical right shift, arithmetic right shift, right rotate, logical left shift, arithmetic left shift and left rotate. The existing shifter is a unidirectional logarithmic shifter consisting of multiplexers. A 3x3 Fredkin gate works as a simple (2:1) multiplexer, and Feynman gates are used for producing fanouts. The existing shifter is complex in design and requires a large number of gates. As a result, the total number of garbage outputs is high. Thus there is considerable room for improving the circuit complexity, total number of gates, garbage outputs, delay and quantum cost. For the efficient design of a reversible circuit, several criteria need to be considered:

- a. Minimize the number of gates as far as possible.
- b. Minimize the quantum cost of the circuit.
- c. Minimize the total number of garbage outputs and the usage of constant inputs.

By observing these parameters and the previous design, a novel logarithmic reversible barrel shifter has been proposed. The proposed barrel shifter is a left rotating shifter which uses Fredkin gates for reversible (2:1) multiplexing and Feynman gates for producing fanouts. A (4,2) logarithmic barrel shifter is illustrated in Fig. 4. The circuit uses a total of 6 Fredkin gates and 4 Feynman gates and produces 6 garbage outputs. With a cost of 5 per Fredkin gate and 1 per Feynman gate, the quantum cost of the proposed (4,2) circuit is 34.

Figure 4. Proposed reversible (4,2) barrel shifter

### IV. DESIGN OF REVERSIBLE BIDIRECTIONAL BARREL SHIFTER

The proposed design of the reversible bidirectional barrel shifter can perform logical right shift, arithmetic right shift, rotate right, logical left shift, arithmetic left shift and rotate left operations.
The proposed reversible bidirectional arithmetic and logical barrel shifter design approach is illustrated in Fig. 5 with the example of an (8,3) barrel shifter. The barrel shifter performs operations such as logical right shift, logical left shift and rotate left depending on the values of the sra, sla, rot and left control signals. Table I shows the operations performed by an (8,3) reversible bidirectional arithmetic and logical shifter for the different values of the control signals sra, sla, rot and left.

TABLE I
OPERATION PERFORMED BY A (N,K) REVERSIBLE BIDIRECTIONAL BARREL SHIFTER

| Operation performed | left | rot | sra | sla |
|------------------------|------|-----|-----|-----|
| Logical right shift | 0 | 0 | 0 | 0 |
| Arithmetic right shift | 0 | 0 | 1 | 0 |
| Rotate right | 0 | 1 | 0 | 0 |
| Logical left shift | 1 | 0 | 0 | 0 |
| Arithmetic left shift | 1 | 0 | 0 | 1 |
| Rotate left | 1 | 1 | 0 | 0 |

In this design, the input data is represented as i7, i6, i5, i4, i3, i2, i1, i0, while the shift value is controlled by the select signals S2S1S0, and the output data is obtained as shown in Table II.
TABLE II
SHIFT AND ROTATE OPERATION OUTPUT FOR k=3

| Operation | Y |
|------------------------------|-------------------------|
| 3-bit shift right logical | 0 0 0 a7 a6 a5 a4 a3 |
| 3-bit shift right arithmetic | a7 a7 a7 a7 a6 a5 a4 a3 |
| 3-bit rotate right | a2 a1 a0 a7 a6 a5 a4 a3 |
| 3-bit shift left logical | a4 a3 a2 a1 a0 0 0 0 |
| 3-bit shift left arithmetic | a7 a3 a2 a1 a0 0 0 0 |
| 3-bit rotate left | a4 a3 a2 a1 a0 a7 a6 a5 |

In Table II, a7 ... a0 denote the input bits i7 ... i0.

The design of a reversible barrel shifter can be divided into six modules: (i) data reversal control unit-I, (ii) arithmetic right shift control unit, (iii) shifter or rotation unit, which consists of three sub-modules that perform the Stage I, Stage II and Stage III operations, (iv) rotation unit, (v) arithmetic left shift control unit, and (vi) data reversal control unit-II. The reversible designs of these modules and their working are explained as follows.

#### 1. Data Reversal Control Unit-I

In the reversible barrel shifter, the direction of the shift operation is controlled by the control signal left, as shown in Table I. The shifter operates in the left direction if left = 1 (arithmetic left shift, logical left shift or rotate left). Otherwise, for left = 0, it operates in the right direction (arithmetic right shift, logical right shift or rotate right). The data reversal control unit-I is built from Fredkin gates, since two outputs of a Fredkin gate can work as 2:1 MUXes. Four Fredkin gates are therefore sufficient to reverse the 8-bit input data. A left shift of n-bit input data by k bits can be performed in three steps:

- (i) reverse the input data,
- (ii) perform a k-bit right shift operation, and
- (iii) reverse the outputs of step (ii).
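The three steps above, together with the control signals of Table I, can be captured in a short behavioural model. The Python sketch below is an illustration under stated assumptions, not the gate-level circuit of Fig. 5: the function name, the MSB-first list representation and the `zero` fill parameter are all hypothetical choices made for this example.

```python
def bidirectional_shift(bits, amount, left=0, rot=0, sra=0, sla=0, zero=0):
    """Behavioural model of the six shift/rotate operations.

    bits   -- data word as a list, MSB first
    amount -- shift distance, 0 <= amount < len(bits)
    left, rot, sra, sla -- control signals as in Table I
    zero   -- value shifted in for logical shifts
    """
    n = len(bits)
    sign = bits[0]
    # Step (i): reverse the data for left operations.
    data = bits[::-1] if left else list(bits)
    # Step (ii): right shift or right rotate by `amount`.
    if rot:
        shifted = data[n - amount:] + data[:n - amount]
    else:
        fill = sign if sra else zero
        shifted = [fill] * amount + data[:n - amount]
    # Arithmetic left shift: restore the sign bit in the position
    # that becomes the MSB after the final reversal.
    if sla:
        shifted[-1] = sign
    # Step (iii): reverse again for left operations.
    return shifted[::-1] if left else shifted


word = ["a7", "a6", "a5", "a4", "a3", "a2", "a1", "a0"]
print(bidirectional_shift(word, 3, left=1, zero="0"))
# ['a4', 'a3', 'a2', 'a1', 'a0', '0', '0', '0']
```

Using symbolic bit names instead of 0/1 values makes it easy to compare each result against the rows of Table II.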
For example, for the 8-bit input data i7, i6, i5, i4, i3, i2, i1, i0, the three steps of a logical left shift by 3 bits are: (i) reverse i7, i6, i5, i4, i3, i2, i1, i0 to produce i0, i1, i2, i3, i4, i5, i6, i7, (ii) perform the 3-bit logical right shift operation to produce 0, 0, 0, i0, i1, i2, i3, i4, and (iii) reverse the outputs of step (ii) to yield i4, i3, i2, i1, i0, 0, 0, 0. The data reversal control unit-I is shown in Fig. 5.

#### 2. Arithmetic Right Shift Control Unit

The reversible arithmetic right shift control unit is shown in Fig. 5. This unit controls the arithmetic right shift operation. It is designed using a single Fredkin gate controlled by the control signal sra, and it preserves the sign bit of the input data. The arithmetic right shift operation is performed if sra = 1; otherwise the unit simply passes the data to the next module. Multiple copies of the sign bit are created using Feynman gates, because fanout is not allowed in reversible logic.

Fig. 5. Proposed (8,3) reversible bidirectional barrel shifter

\*FE represents Feynman gates, FR represents Fredkin gates and G represents the garbage outputs

#### 3. Shifter Or Rotation Unit

The three-stage design of the reversible shifter or rotation unit is shown in Fig. 5. This unit determines the amount by which the data is shifted in the reversible bidirectional barrel shifter. It is controlled by the control signals S2, S1 and S0 and can be divided into three stages. Depending on the values of S2, S1 and S0, the first, second and third stages right shift the input data by 2<sup>2</sup>, 2<sup>1</sup> and 2<sup>0</sup> bits, respectively. Each of the three stages is designed using a chain of 8 Fredkin gates controlled by S2, S1 and S0, respectively. Feynman gates are used in the design to avoid the fanout problem.
The working of the three stages of the shifter unit is explained as follows:

- Stage I: The first stage of the shifter unit is controlled by the control signal S2. The input data is right shifted by 2<sup>2</sup> bits if S2 is 1; otherwise the input data remains unchanged. The outputs of Stage I are passed as inputs to Stage II of the shifter unit.
- Stage II: The second stage of the shifter unit is controlled by the control signal S1 and works on the outputs of the first stage. The data provided to the second stage is right shifted by 2<sup>1</sup> bits if S1 is 1; otherwise it remains unchanged. The outputs of Stage II are passed as inputs to Stage III of the shifter unit.
- Stage III: The third stage of the shifter unit is controlled by the control signal S0. The data generated by Stage II is right shifted by 2<sup>0</sup> bits if S0 is 1; otherwise it remains unchanged. The outputs of this stage are passed as inputs to the next module of the reversible bidirectional barrel shifter.

#### 4. Rotation Unit

The rotation unit is shown in Fig. 5. This unit controls the rotation operation. It is designed using a chain of 8 Fredkin gates controlled by the control signal rot and performs the rotation of the input data. The rotation operation is performed if rot = 1; otherwise the unit simply passes the data to the next module.

#### 5. Arithmetic Left Shift Control Unit

The arithmetic left shift control unit is shown in Fig. 5. Its design is the same as that of the arithmetic right shift control unit. This unit is controlled by the control signal sla and is responsible for performing the arithmetic left shift operation. It is implemented using a single Fredkin gate.
This unit preserves the sign bit needed to perform the arithmetic left shift operation if sla = 1; otherwise it simply passes the LSB of the shifter or rotation unit.

#### 6. Data Reversal Control Unit-II

The data reversal control unit-II is controlled by the control signal left. If left is 1, this unit reverses its input data to generate a left shifted result; otherwise it simply passes the input data to its outputs. The unit reverses its 8-bit input, which consists of 1 bit from the output of the arithmetic left shift control unit and 7 bits from the outputs of the shifter unit. Its design, shown in Fig. 5, is the same as that explained for data reversal control unit-I.

The (8,3) reversible bidirectional arithmetic and logical barrel shifter uses 32 Feynman gates to copy the input data and so avoid fanout, and 41 Fredkin gates for arithmetic and logical bidirectional shifting and rotating. The above design of the (8,3) reversible bidirectional barrel shifter can be generalized to an (n,k) reversible bidirectional barrel shifter.

#### V. PERFORMANCE ANALYSIS

Feynman gates are used in the proposed design to avoid the fanout problem. A chain of n/2 Fredkin gates is used in each of data reversal control unit-I and data reversal control unit-II. The arithmetic right shift control unit uses one Fredkin gate and $2^k - 1$ Feynman gates. A chain of n Fredkin gates and n Feynman gates is used at each stage of the shifter or rotation unit. The rotation unit uses $2^m$ Fredkin gates for stage m, for m = 0 to (k-1). One Fredkin gate and one Feynman gate are used in the arithmetic left shift control unit.
Thus the total number of Fredkin gates used to design an (n,k) reversible bidirectional barrel shifter is the sum of the Fredkin gates used in data reversal control unit-I, the arithmetic right shift control unit, the shifter or rotation unit, the rotation unit, the arithmetic left shift control unit and data reversal control unit-II:

$$\mathrm{FR} = \frac{n}{2} + 1 + (n \cdot k) + \sum_{m=0}^{k-1} 2^m + 1 + \frac{n}{2} = \sum_{m=0}^{k-1} 2^m + n(k+1) + 2.$$

The total number of Feynman gates used to design an (n,k) reversible bidirectional barrel shifter is the sum of the Feynman gates required by the arithmetic right shift control unit, the shifter or rotation unit and the arithmetic left shift control unit:

$$\mathrm{FE} = (2^k - 1) + (n \cdot k) + 1 = 2^k + (n \cdot k).$$

Ancilla Input Bits

Table III shows the number of ancilla bits required to design a reversible bidirectional barrel shifter for different values of n and k. $2^k + (n \cdot k)$ Feynman gates are required to design an (n,k) reversible bidirectional barrel shifter. Each Feynman gate requires one ancilla input bit to copy the input data. Additionally, the Fredkin gate used in the arithmetic right shift control unit requires one ancilla bit. Hence the total number of ancilla inputs (ANs) required to design an (n,k) reversible bidirectional arithmetic and logical barrel shifter is $\mathrm{ANs} = 2^k + (n \cdot k) + 1$. Table III shows that the total number of ancilla inputs required to design an (8,3) reversible bidirectional barrel shifter is 33, which is the same as illustrated in Fig. 5.
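The gate-count and ancilla expressions above are easy to check numerically. The following Python sketch evaluates them module by module; the function names are illustrative, and the code assumes that n is even so that n/2 is an integer.

```python
def rotation_unit_fredkin(k: int) -> int:
    """Sum of 2**m for m = 0 .. k-1 (Fredkin gates in the rotation unit)."""
    return sum(2 ** m for m in range(k))


def fredkin_count(n: int, k: int) -> int:
    """Total Fredkin gates in an (n,k) shifter, summed per module."""
    reversal_unit = n // 2            # used twice: unit-I and unit-II (n assumed even)
    arithmetic_right = 1
    shifter_stages = n * k
    arithmetic_left = 1
    return (reversal_unit + arithmetic_right + shifter_stages
            + rotation_unit_fredkin(k) + arithmetic_left + reversal_unit)


def feynman_count(n: int, k: int) -> int:
    """Total Feynman gates in an (n,k) shifter: (2**k - 1) + n*k + 1."""
    return (2 ** k - 1) + n * k + 1


def ancilla_inputs(n: int, k: int) -> int:
    """One ancilla per Feynman gate plus one for the sra Fredkin gate."""
    return feynman_count(n, k) + 1


for n, k in [(4, 2), (8, 3), (16, 4), (32, 5), (64, 6)]:
    print((n, k), fredkin_count(n, k), feynman_count(n, k), ancilla_inputs(n, k))
# The (8, 3) row prints: (8, 3) 41 32 33
```

The (8,3) row reproduces the 41 Fredkin gates, 32 Feynman gates and 33 ancilla inputs quoted in the text.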
TABLE III
ANCILLA INPUTS IN (N,K) REVERSIBLE BIDIRECTIONAL BARREL SHIFTER

| n/k | n=4 | n=8 | n=16 | n=32 | n=64 |
|-----|-----|-----|------|------|------|
| k=2 | 13 | 21 | 37 | 69 | 133 |
| k=3 | | 33 | 57 | 105 | 201 |
| k=4 | | | 81 | 145 | 273 |
| k=5 | | | | 193 | 353 |
| k=6 | | | | | 449 |

Quantum Cost

Table IV shows the quantum cost of a reversible bidirectional barrel shifter for different values of n and k. The numbers of Feynman and Fredkin gates used determine the quantum cost of an (n,k) reversible bidirectional barrel shifter. The quantum cost of the Feynman gate is considered as one, while the quantum cost of the Fredkin gate is considered as five. Hence the quantum cost of the proposed (n,k) reversible bidirectional barrel shifter can be calculated as QuantumCost = 5 \* (number of Fredkin gates) + (number of Feynman gates). The quantum cost (QC) of the (n,k) reversible bidirectional barrel shifter can therefore be written as

$$\mathrm{QC} = 5\left(\sum_{m=0}^{k-1} 2^m + n(k+1) + 2\right) + 2^k + (n \cdot k).$$

The quantum cost of the (8,3) reversible bidirectional barrel shifter shown in Fig. 5 is 237.

TABLE IV
QUANTUM COST OF (N,K) REVERSIBLE BIDIRECTIONAL BARREL SHIFTER

| n/k | n=4 | n=8 | n=16 | n=32 | n=64 |
|-----|-----|-----|------|------|------|
| k=2 | 137 | 165 | 301 | 573 | 1117 |
| k=3 | | 237 | 421 | 789 | 1525 |
| k=4 | | | 565 | 1029 | 1957 |
| k=5 | | | | 1317 | 2437 |
| k=6 | | | | | 3013 |

Garbage Outputs

The number of garbage outputs produced by different reversible bidirectional barrel shifter designs is shown in Table V. In the table, n is the number of input data bits and k represents the shift value. In the design of an (n,k) reversible bidirectional barrel shifter, the shifter unit is built in k stages, and each stage consists of a chain of n Fredkin gates that performs the shift operation. Each Fredkin gate in the chain of n Fredkin gates produces at least one garbage output, except the last Fredkin gate, which produces two garbage outputs.
Two garbage outputs are produced by the Fredkin gate used in each of the arithmetic left shift control unit and the arithmetic right shift control unit. One garbage output is produced by the last Fredkin gate of data reversal control unit-II, as the control signal left cannot be utilized further. Hence the number of garbage outputs (GOs) produced by an (n,k) reversible bidirectional arithmetic and logical shifter can be written as

$$\mathrm{GOs} = k(n+1) + 6 + \sum_{m=0}^{k-1} 2^m.$$

In the (8,3) reversible bidirectional barrel shifter design of Fig. 5, the number of garbage outputs produced is 40, which is equal to the result in Table V.

TABLE V
GARBAGE OUTPUTS IN (N,K) REVERSIBLE BIDIRECTIONAL BARREL SHIFTER

| n/k | n=4 | n=8 | n=16 | n=32 | n=64 |
|-----|-----|-----|------|------|------|
| k=2 | 19 | 27 | 43 | 75 | 139 |
| k=3 | | 40 | 64 | 112 | 208 |
| k=4 | | | 89 | 153 | 281 |
| k=5 | | | | 202 | 362 |
| k=6 | | | | | 459 |

#### **CONCLUSIONS**

In this paper an efficient design of a reversible logic based bidirectional barrel shifter has been proposed. The proposed bidirectional shifter is designed using Fredkin gates and Feynman gates. The number of garbage outputs, the number of ancilla inputs and the quantum cost of the (n,k) reversible bidirectional barrel shifter increase more rapidly when n is varied with k held constant than when k is varied with n held constant. The functional verification of the proposed reversible barrel shifter design is performed through simulations using the Verilog HDL flow for reversible circuits. The design of the bidirectional barrel shifter has been evaluated in terms of garbage outputs, ancilla inputs and quantum cost.
The proposed design of the reversible bidirectional barrel shifter can perform logical right shift, arithmetic right shift, rotate right, logical left shift, arithmetic left shift and rotate left operations.

#### REFERENCES

- [1] E. Fredkin and T. Toffoli, "Conservative logic," International J. Theor. Physics, vol. 21, pp. 219–253, 1982.
- [2] D. Maslov and D. M. Miller, "Comparison of the cost metrics for reversible and quantum logic synthesis," http://arxiv.org/abs/quant-ph/0511008, 2006.
- [3] C. H. Bennett, "Logical reversibility of computation," IBM J. Research and Development, vol. 17, pp. 525–532, Nov. 1973.
- [4] R. Landauer, "Irreversibility and heat generation in the computational process," IBM J. Research and Development, vol. 5, pp. 183–191, Dec. 1961.
- [5] A. Peres, "Reversible logic and quantum computers," Phys. Rev. A, Gen. Phys., vol. 32, no. 6, pp. 3266–3276, Dec. 1985.
- [6] T. Toffoli, "Reversible computing," MIT Lab for Computer Science, Tech. Rep. Tech memo MIT/LCS/TM-151, 1980.
- [7] W. N. Hung, X. Song, G. Yang, J. Yang, and M. Perkowski, "Optimal synthesis of multiple output boolean functions using a set of quantum gates by symbolic reachability analysis," IEEE Trans. Computer-Aided Design, vol. 25, no. 9, pp. 1652–1663, Sept. 2006.
- [8] S. Kotiyal, H. Thapliyal, and N. Ranganathan, "Design of a reversible bidirectional barrel shifter," in Proceedings of the 11th IEEE International Conference on Nanotechnology, Portland, Oregon, USA, Aug. 2011, pp. 463–468.
- [9] M. A. Nielsen and I. L. Chuang, Quantum Computation and Quantum Information. New York: Cambridge Univ. Press, 2000.
- [10] S. Kotiyal, H. Thapliyal, and N. Ranganathan, "Design of a ternary barrel shifter using multiple-valued reversible logic," in Proceedings of the 10th IEEE International Conference on Nanotechnology, Seoul, Korea, Aug. 2010, pp. 1104–1108.
- [11] S. Gorgin and A. Kaivani, "Reversible barrel shifters," in Proc. 2007 Intl. Conf. on Computer Systems and Applications, Amman, May 2007, pp. 479–483.
- [12] V. Vedral, A. Barenco, and A. Ekert, "Quantum networks for elementary arithmetic operations," Phys. Rev. A, vol. 54, no. 1, pp. 147–153, Jul. 1996.
- [13] H. Thapliyal and N. Ranganathan, "Design of reversible sequential circuits optimizing quantum cost, delay and garbage outputs," ACM Journal of Emerging Technologies in Computing Systems, vol. 6, no. 4, pp. 14:1–14:35, Dec. 2010.
- [14] N. Nayeem, M. Hossain, L. Jamal, and H. Babu, "Efficient design of shift registers using reversible logic," in 2009 International Conference on Signal Processing Systems, May 2009, pp. 474–478.
- [15] I. Hashmi and H. Babu, "An efficient design of a reversible barrel shifter," in VLSI Design, 2010. VLSID '10. 23rd International Conference on, Jan. 2010, pp. 93–98.
- [16] H. Thapliyal and N. Ranganathan, "Design of efficient reversible logic based binary and BCD adder circuits," to appear, ACM Journal of Emerging Technologies in Computing Systems, 2011.
- [17] M. H. Khan and M. A. Perkowski, "Quantum ternary parallel adder/subtractor with partially-look-ahead carry," vol. 53, no. 7, 2007, pp. 453–464.
- [18] J. A. Smolin and D. P. DiVincenzo, "Five two-bit quantum gates are sufficient to implement the quantum Fredkin gate," Physical Review A, vol. 53, pp. 2855–2856, 1996.
- [19] H. Thapliyal and N. Ranganathan, "Design of efficient reversible binary subtractors based on a new reversible gate," in Proc. IEEE Computer Society Annual Symposium on VLSI, Tampa, Florida, May 2009, pp. 229–234.
### **Dual Frequency Microstrip Antenna For Wireless Applications And Its Reconfigurable Counterpart**

#### Arpita Sen & Neela Chattoraj

Electronics and Communication Engineering Department, Birla Institute of Technology, Mesra, Ranchi-835215, INDIA

E-mail: sen arpita2k@yahoo.co.in & nila.chwdhry@gmail.com

Abstract - The design of two microstrip patch antennas that can be operated at GPS (Global Positioning System) and Bluetooth frequencies is reported. In the first, a very small, thin microstrip antenna is excited by a co-axial SMA connector to produce the centre frequency of a Bluetooth system, and a thin parasitic microstrip patch is then coupled with this patch to excite the centre frequency of the GPS system. The results simulated using IE3D software are supported by measurement. In the second, two patches are excited alternately by co-axial SMA connectors (either manually or electronically) to produce the centre frequencies of a Bluetooth system and GPS.

Keywords - Microstrip patch antenna, dual frequency, compact, multi-frequency, Bluetooth, GPS, reconfigurable.

#### I. INTRODUCTION

Wireless communication systems are evolving toward multi-functionality. Modern communication systems demand transmitters and receivers with multi-band operation. As a result, numerous techniques for achieving frequency reconfigurability have been proposed for systems in which weight and area are critical issues. This multi-functionality gives users the option of connecting to different kinds of wireless services for different purposes at different times. Because large numbers of antennae are mounted on ships, aircraft and other vehicles, it is highly desirable to develop a single radiating element capable of performing different functions and/or multi-band operation in order to minimize the antennae's weight and area.
Although multiband/multi-frequency antennas can be used in different wireless systems, they lack the flexibility to accommodate new services when compared with reconfigurable antennas, which can be considered one of the key advances for future wireless communication transceivers. An antenna that can modify its characteristics, such as operating frequency, polarization or radiation pattern, in real time is referred to as a reconfigurable antenna. Reconfigurable antennae offer the following advantages:

- They can be a cheaper alternative to traditional adaptive arrays, or they can be incorporated into adaptive arrays to improve their performance by providing additional degrees of freedom.
- They allow spectrum reallocation in multi-band communication systems and dynamic spectrum management.
- They reduce the number and size of antennae in a system.

In multiband antennas, adding a new frequency may require changing the whole geometry, whereas in reconfigurable antennas it may only require adding a component along with a switchable device that enables the required frequency. Reconfigurability can be obtained using the following techniques:

- Tunable elements in the feeding networks
- Adaptive matching networks
- Phase shifters and tunable filters
- Tunable elements embedded in the radiating elements, such as PIN diodes, MEMS (switches, varactors, moveable parts) and optical switching
- Mechanically moveable radiating elements

This paper deals with a non-reconfigurable antenna as well as a frequency-reconfigurable antenna, in which the operating band can be tuned or switched to different frequencies. Two important frequency bands for wireless communications, the Global Positioning System or GPS (1.575 GHz) and Bluetooth (2.4 GHz - 2.484 GHz), and their uses have been studied here. Switches are used to select the band. Ideally, the switches have two operational states, ON and OFF.
The ON state represents a short circuit, while the OFF state exhibits an open circuit. When one of the switches is in the ON state, the antenna resonates at one frequency band; when the other switch is turned ON, the antenna resonates at the other frequency band. The operating frequency is thus controlled by the state of the switches.

#### II. ANTENNA DESIGN AND MEASUREMENT

#### DUAL FREQUENCY ANTENNA

A very small, thin microstrip patch is excited by a co-axial SMA connector to produce the centre frequency of a Bluetooth system (2.442 GHz), and the feed position that gives impedance matching is determined by IE3D simulation. A thin parasitic microstrip patch is then coupled to this patch to excite the centre frequency of the GPS system (1.575 GHz). The antenna geometry is shown in Fig. 1. The simulated results are verified by measurement after fabricating the antenna on a Glass Epoxy substrate with a dielectric constant of 4.36, a substrate height of 1.57 mm and a loss tangent of 0.001. The separation between the driven patch and the parasitic patch was 0.5 mm. The centre of the parasitic patch is shifted upward from the centre of the driven patch by 11 mm. The driven patch was fed by a co-axial SMA connector at a distance of 1.3 mm downward from the centre of the patch and at the edge of the driven patch, as shown in Fig. 1. These are the optimum dimensions, for which the best performance of the antenna at both frequencies was achieved in simulation.

The antenna was designed with IE3D, a full-wave electromagnetic simulation package for microwave and millimeter-wave integrated circuits. The primary formulation of IE3D is an integral equation obtained through the use of Green's functions. The IE3D simulation takes into account the effect of the co-axial SMA connector by which the antenna was fed.
The simulated radiation patterns of the antenna at the GPS and Bluetooth frequencies are shown in Fig. 2.

**Figure 1.** Dual-frequency microstrip antenna (all dimensions are in mm)

**Figure 2.** Simulated radiation pattern of the antenna

The simulated gain of the antenna is plotted in Fig. 3. The gains of the antenna at the GPS and Bluetooth frequencies were 3.3 dBi and 4.0 dBi respectively. The simulated reflection coefficient is compared with the measured results in Fig. 4. The measurements were made with a vector network analyzer (PNA N5230A, Agilent Technologies). The simulated resonance frequencies are 1.575 GHz and 2.426 GHz, and the measured resonance frequencies are 1.55 GHz and 2.442 GHz respectively. The simulated bandwidths at these frequencies were 20 MHz and 40 MHz, whereas the measured bandwidths were 30 MHz and 45 MHz respectively.

**Figure 3.** Simulated gain of the antenna

**Figure 4.** Simulated and measured reflection coefficients of the antenna

The ground plane dimensions of the fabricated antenna were 70 mm × 25 mm. The antenna radiates in the broadside direction at both frequencies. The measured gains of the antenna at the GPS and Bluetooth frequencies were 2.9 dBi and 3.5 dBi respectively. The variation of the gains at the GPS and Bluetooth frequencies with the upward shift (along the y-axis) of the parasitic patch relative to the driven patch is shown in Fig. 5. Maximum gain at both frequencies was obtained when the centre-to-centre shift was 11 mm. The 'displacement along y-axis' in Fig. 5 is zero when the centres of the two patches coincide, positive when the parasitic patch is shifted upward and negative when it is shifted downward. The parametric studies show that the relative position of the driven patch and the parasitic patch is important for achieving impedance matching and maximum gain at both frequencies.

**Figure 5.** Variation of gain (simulated) with the relative shift (along y-axis) of the parasitic patch with respect to the driven patch

Figure 6 shows the current distribution at 1.575 GHz and 2.45 GHz.

**Figure 6.** Current distribution at 1.575 GHz and 2.45 GHz

Figure 7 shows the VSWR at 1.575 GHz and 2.45 GHz; the bandwidths (at VSWR = 2) are 0.002 GHz and 0.010 GHz respectively.

**Figure 7.** VSWR of the antenna

**Figure 8.** Antenna Design and Measurement

#### MANUALLY RECONFIGURABLE ANTENNA

Two very small, thin patches are excited by a co-axial SMA connector to produce the centre frequencies of a Bluetooth system (2.442 GHz) and of the GPS system (1.575 GHz). The antenna geometry is shown in Fig. 9. The simulated results are taken on a Glass Epoxy substrate with a dielectric constant of 4.36, a substrate height of 1.57 mm and a loss tangent of 0.001. The patches were fed alternately by a co-axial SMA connector at a distance of 1.2 mm downward from the centre and at the edge of the patches, as shown in Fig. 9. These are the optimum dimensions, for which the best performance of the antenna at both frequencies was achieved in simulation.

**Figure 9.** Manually reconfigurable dual-frequency microstrip antenna (all dimensions are in mm)

**Figure 10.** Simulated radiation pattern of the antenna at GPS and Bluetooth frequencies

The simulated gains of the antenna are plotted in Fig. 11. The gains of the antenna at the GPS and Bluetooth frequencies were 2.0 dBi and 4.0 dBi respectively.

**Figure 11.** Simulated gain of the antenna at GPS and Bluetooth

The simulated reflection coefficient is given in Fig. 12.
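The VSWR = 2 criterion used for the bandwidths quoted with Fig. 7 corresponds to a fixed reflection-coefficient level, which helps when comparing the VSWR plot with the reflection-coefficient plots. The Python sketch below uses the standard lossless-line relation between VSWR and reflection coefficient; the function names are illustrative and are not part of IE3D or of the authors' workflow. The fractional-bandwidth helper is applied to the measured values reported for the dual frequency antenna (30 MHz at 1.55 GHz and 45 MHz at 2.442 GHz).

```python
import math


def reflection_magnitude(vswr: float) -> float:
    """Magnitude of the reflection coefficient for a given VSWR (VSWR > 1)."""
    return (vswr - 1.0) / (vswr + 1.0)


def s11_db(vswr: float) -> float:
    """Reflection coefficient in dB that corresponds to a given VSWR (VSWR > 1)."""
    return 20.0 * math.log10(reflection_magnitude(vswr))


def fractional_bandwidth_percent(bandwidth_hz: float, centre_hz: float) -> float:
    """Bandwidth expressed as a percentage of the centre frequency."""
    return 100.0 * bandwidth_hz / centre_hz


print(f"VSWR = 2 -> |Gamma| = {reflection_magnitude(2.0):.3f}, S11 = {s11_db(2.0):.2f} dB")
print(f"GPS band (measured): {fractional_bandwidth_percent(30e6, 1.55e9):.2f} %")
print(f"Bluetooth band (measured): {fractional_bandwidth_percent(45e6, 2.442e9):.2f} %")
```

A VSWR of 2 is roughly a -9.5 dB reflection coefficient, so a bandwidth read at VSWR = 2 is close to, but slightly wider than, one read at the common -10 dB level.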
**Figure 12.** Simulated reflection coefficients of the antenna at GPS and Bluetooth

#### ELECTRONICALLY RECONFIGURABLE ANTENNA

In the proposed design, two very small, thin patches are excited by a co-axial SMA connector and attached to PIN diodes, each patch having a different excitation frequency, to produce the centre frequencies of a Bluetooth system (2.442 GHz) and of the GPS system (1.575 GHz). The proposed antenna geometry is shown in Fig. 13.

**Figure 13.** Electronically reconfigurable dual-frequency microstrip antenna (all dimensions are in mm)

#### III. CONCLUSIONS

The design and performance of several dual-frequency microstrip antennas for GPS and Bluetooth systems are described here. The simulated results are verified by measurement. We have compared three geometries:

- Non-reconfigurable
- Manually reconfigurable
- Electronically reconfigurable

Multi-frequency antennas, like the proposed antenna that supports GPS and Bluetooth in the same device, have a large number of uses, such as:

- A Bluetooth GPS transmitter, which can broadcast position data to a paired Bluetooth receiver
- Mobile marketing, which gives the entrepreneur the advantage of geo-location and of sending location-specific messages to users, using GPS and Bluetooth technology

Some applications require both frequencies at the same time, with no room to select between them, while others require only one selected frequency at a time. If another frequency band has to be added to the antenna, the reconfigurable designs are the better option, since one only needs to add another patch and excite it either manually or electronically. Adding another frequency band to the non-reconfigurable antenna, by contrast, might disrupt the existing frequencies. Thus, when selecting an antenna, the user needs to prioritize the operations and choose the antenna accordingly.
#### ACKNOWLEDGMENT

The authors would like to acknowledge Birla Institute of Technology, Mesra, India, for providing them the opportunity to work on this paper.

# Improving the Combustion Characteristics of Pine Needle Charcoal Briquettes

#### Vinita Bharti & Mamta Awasthi

Department of Energy & Environment, NIT, Hamirpur, H.P. 177005

E-mail: vinitabharti28@gmail.com, awasthi6@rediffmail.com

Abstract - Forest fires are an annual phenomenon in Himachal Pradesh, as the forests are dense and catch fire easily from natural and man-made causes. Himalayan forests are rich in pine trees (*Pinus roxburghii* Sarg.), and fallen pine needles are one of the factors that intensify forest fires. This forest waste therefore needs to be handled efficiently. This paper describes the briquetting technology for pine needles; the results show a significant improvement in calorific value.

Keywords - pine needles, charcoal briquetting technology, proximate analysis, calorific value, water boiling test.

#### I. INTRODUCTION

India is extremely dependent on fossil fuels, mainly coal and petroleum. The dependency is even more evident in semi-urban areas. On the one hand, several small-scale industries such as brick kilns, industrial boilers, and food-processing and pharmaceutical industries use coal for thermal/heating purposes, which results in high greenhouse gas emissions that contribute to climate change. On the other hand, institutional kitchens use expensive and polluting LPG (liquefied petroleum gas). The rural population in India, however, does not have access to reliable energy. The main source of energy for this section of society is firewood. People living in rural areas burn firewood inefficiently (mainly for cooking), which causes indoor air pollution and releases harmful black smoke. Millions of tons of biomass are generated from forest residues, especially pine needles, in Himachal Pradesh [4]. These pine needles, if not removed from the ground, can cause a lot of damage to the environment.
Firstly, due to their highly inflammable nature, they often become the cause of forest fires in the Himalayan forests. The pine tree trunk is heat resistant; hence, in a forest fire [4, 5], pine trees survive the fire while the growth of other plant species, whose produce provides sustenance to villagers, is destroyed, which disturbs the ecological balance of the region. Secondly, dry pine foliage stops water from being absorbed by the soil and thus depletes the ground water table [4]. Thirdly, fallen dry pine foliage acts like a carpet on the forest floor, blocks sunshine from reaching the ground and thereby stops the growth of the grass on which cattle feed. Although dry pine needles and other forest residues have a high calorific value, this biomass cannot be used directly because of its low bulk density and high moisture content.

### BIOMASS CHARCOAL BRIQUETTING TECHNOLOGY:

Biomass briquetting is the process of converting low bulk density biomass into high-density, energy-concentrated fuel briquettes. The biomass charcoal briquetting technology developed at MCRC [3] uses a modified kiln and a briquetting machine, both of which can be fabricated locally, to produce bio-char from various biomass samples. The technology uses a cost-effective binder to prepare the briquettes.

Biomass charcoal briquetting process: Experimental

#### II. MATERIAL AND METHOD:

The samples collected were pine needles (*Pinus roxburghii*) from Hamirpur district, Himachal Pradesh, India.

Preparation of the biomass: The collected biomass was air dried for ten days to reduce its moisture content. The material was cut into small pieces (2 mm size). These materials were then processed for proximate analysis in the chemistry department of NIT Hamirpur.

Proximate Analysis of the Material: The moisture content, ash, volatile matter, and fixed carbon of the pine needles were determined by using standard method ASTM D-3173.
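In proximate analysis, fixed carbon is usually not measured directly but obtained by difference from the other three components. The minimal Python sketch below illustrates that convention; the function name is illustrative, and the assumption that the by-difference convention applies here is inferred from the fact that the four components reported in Table 3 sum to 100 %, not stated in the text.

```python
def fixed_carbon_percent(moisture: float, ash: float, volatile_matter: float) -> float:
    """Fixed carbon by difference; all inputs and the result are in weight percent."""
    fixed_carbon = 100.0 - (moisture + ash + volatile_matter)
    if fixed_carbon < 0.0:
        raise ValueError("moisture + ash + volatile matter exceed 100 %")
    return fixed_carbon


# Raw pine needle values as reported in Table 3 of this paper.
pine_needle_fc = fixed_carbon_percent(moisture=11.98, ash=5.4, volatile_matter=67.07)
print(f"Fixed carbon (pine needle): {pine_needle_fc:.2f} %")
```

The same check can be applied to each briquette column of Table 3 to confirm that the reported components are mutually consistent.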
The calorific value of the pine needles and briquette samples was determined using a microprocessor bomb calorimeter, model CC01/M3.

Preparation of the briquette samples: Material required: forest wastes (pine needles), binding materials (brown clay and rice starch), charcoal kiln/drum and briquetting mould/machine.

Carbonization: For carbonization, loosely pack the collected biomass into the kiln. The kiln will accommodate ~100 kg of dry biomass. After loading the biomass, close the top of the kiln with the metal lid attached to a conical chimney. Use a small amount of biomass in the firing portion to ignite the kiln, close the doors tightly, and fire for 45 minutes to 1 hour (depending on the biomass) using biomass [3]. In the absence of air, the burning process is slow and the fire slowly spreads to the biomass through the holes in the perforated sheets. With this method, 30 % of carbonized char can be obtained [3].

Fig 1: Carbonizer drum

Table 1: Time cycle of one batch carbonization

| Process | Duration |
|-----------------------|------------|
| Loading | 30 minutes |
| Carbonization | 60 minutes |
| Cooling | 5 hours |
| Unloading | 30 minutes |
| Total processing time | 7 hours |

Binder preparation and mixing: A binder is used to strengthen the briquettes. The carbonized char powder can be mixed with different binders (100 kg of char + 5 kg of starch), such as commercial starch, rice powder, rice starch (rice boiled water) and other cost-effective materials like brown clay soil. The binder is mixed with water and boiled for 20 minutes [3]. After boiling, the liquid solution is poured into the char powder and mixed so that every particle of carbonized charcoal is coated with binder. This enhances charcoal adhesion and produces identical briquettes.

Fig. 2: Binder preparation and carbonized charcoal mixed with the binder

Briquettes: The charcoal mixture is made into briquettes either manually or using machines.
Pour the mixture directly into the briquetting mould/machine to form uniform-sized cylindrical briquettes.

Fig 3: Briquette samples

Table 2: Char and binder material used in biocoal briquettes

| Sample | Char | Binding material |
|--------|------|-------------------------------|
| S1 | 1 kg | 500 g clay |
| S2 | 1 kg | 333 g clay |
| S3 | 1 kg | 50 g rice starch + 100 g clay |
| S4 | 1 kg | 50 g starch |

Characterization of the samples:

Ignition time: Each briquette sample was ignited at the base in a draught-free corner. The time required for the flame to ignite the briquette was recorded with a stopwatch as the ignition time.

Water Boiling Test: This test was carried out to compare the cooking efficiency of the briquettes. It measures the time taken for each set of briquettes to boil an equal volume of water under similar conditions. 185 g of each briquette sample was used to boil one liter of water using a small stainless cup and a domestic briquette stove [9]. During this test, other fuel properties of the briquettes, namely burning rate and specific fuel consumption, were also determined, and the level of smoke evolution was observed.

Burning rate is the ratio of the mass of fuel burnt (in grams) to the total time taken (in minutes):

$$\text{Burning rate} = \frac{\text{Mass of fuel consumed (g)}}{\text{Total time taken (min)}}$$ (1)

The specific fuel consumption is the ratio of the mass of fuel consumed (in grams) to the quantity of boiling water (in liters):

$$\text{Specific fuel consumption} = \frac{\text{Mass of fuel consumed (g)}}{\text{Total quantity of boiling water (liter)}}$$ (2)

The efficiency of the briquette was calculated from the formula:

$$\eta = \frac{M_{w,i}\, C_{p,w}\, (T_e - T_i) + M_{w,evap}\, H_l}{M_f\, h_f}$$ (3)

#### Where,

- $h_f$: calorific value of the fuel, kJ/kg
- $M_f$: mass of fuel burned, kg
- $M_{w,i}$: initial mass of water in the cooking vessel, kg
- $T_i$: initial temperature of the water, °C
- $T_e$: temperature of the boiling water, °C
- $M_{w,evap}$: mass of water evaporated, kg
- $H_l$: latent heat of evaporation at 100 °C, kJ/kg
- $C_{p,w}$: specific heat of water, kJ/kg °C

#### III. RESULTS AND DISCUSSION

The results of the proximate analyses of the pine needles and the biomass charcoal briquettes are shown in Table 3. The results clearly show that briquette sample S4, which uses starch as a binder, has a higher calorific value (6447 kcal/kg), lower ash content (15.2%) and higher volatile matter (73.21%) than the briquette samples that use clay as a binder. Briquette sample S1 has a high ash content, which indicates that it contains more mineral (non-combustible) matter.

Table 3: The result of proximate analysis of raw material (pine needle) and biocoal briquette samples

| Sample | Pine needle | S1 | S2 | S3 | S4 |
|---------------------------|-------|------|------|-------|-------|
| Moisture content (%) | 11.98 | 6.2 | 5.9 | 5.1 | 4.6 |
| Ash content (%) | 5.4 | 32.6 | 29.3 | 19.2 | 15.2 |
| Volatile matter (%) | 67.07 | 50.7 | 55.4 | 67.41 | 73.21 |
| Fixed carbon (%) | 15.55 | 10.5 | 9.4 | 8.29 | 6.99 |
| Calorific value (kcal/kg) | 4811 | 4970 | 5687 | 6343 | 6447 |

The ignition times in Fig. 4 show that the ignition time of the briquettes decreases as the biomass concentration increases. Briquette sample S1 took the longest time to ignite (430 seconds), which results in greater use of ignition material and consequently more smoke. With the incorporation of biomass, however, the ignition time dropped progressively. The ignition time of the briquettes using starch as a binding material was shorter than that of the briquettes using clay as a binding material.

Fig 4: The effect of binding material concentration on the ignition time of briquettes

Figs 5-6 show the results of the parameters determined during the water boiling test.
Briquettes using clay as a binder take more time to boil water than briquettes using starch as a binder. The reasons are the higher percentage of clay than starch in the briquettes and the non-combustible character of clay compared with starch. Clay is mineral matter and does not burn. This can cause problems with cooking and fire extinction, as it blocks the stove's air ventilation, and the stove needs to be shaken often in order to clean it. Briquette sample S1 took the longest time to boil water (45 min), while sample S4 took the shortest time (30 min). The burning time of the briquettes using starch as a binder is longer than that of the briquettes using clay as a binder.

Fig 5: Temperature evolution during the high power phase (cold start) of the water boiling test for different briquette samples

Fig 6: The effect of binding material concentration on the burning time of briquettes

Table 4: Some fuel characteristics during the boiling phase

| Sample | Burning rate (g/minute) | Specific fuel consumption (g/liter) | Efficiency of briquette (%) |
|--------|------|-----|----|
| S1 | 2.5 | 170 | 29 |
| S2 | 3.0 | 155 | 30 |
| S3 | 6.7 | 110 | 35 |
| S4 | 10.3 | 100 | 38 |

Burning rate indicates the mass of fuel burned per minute during the boiling phase. The specific fuel consumption indicates the mass of fuel required to produce one liter of boiling water. Table 4 lists the fuel characteristics of the biocoal briquettes during the water boiling test. Comparing both characteristics for the four tested samples shows that biocoal briquettes made with starch burn faster and are more efficient than biocoal briquettes made with clay. Briquette sample S4 burns and boils water faster, and a smaller quantity of it (100 g/liter) is required to produce one liter of boiling water than of the other samples. Briquette sample S4 also has a higher efficiency (38%) than the other biocoal briquette samples.
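The three water-boiling-test quantities defined in Eqs. (1) to (3) can be sketched in Python as follows. This is an illustrative calculation only: the function names are hypothetical, the default specific heat (4.186 kJ/kg °C) and latent heat (2257 kJ/kg) are standard textbook values rather than values taken from the paper, and the test inputs (150 g of fuel, 30 minutes, 0.1 kg of water evaporated, 25 °C start) are invented for the example and do not reproduce Table 4. Only the calorific value of S4 (6447 kcal/kg) is taken from Table 3.

```python
KCAL_TO_KJ = 4.184  # thermochemical calorie, used to convert kcal/kg to kJ/kg


def burning_rate(fuel_consumed_g: float, time_min: float) -> float:
    """Eq. (1): mass of fuel consumed per minute, in g/min."""
    return fuel_consumed_g / time_min


def specific_fuel_consumption(fuel_consumed_g: float, water_l: float) -> float:
    """Eq. (2): mass of fuel consumed per liter of boiling water, in g/liter."""
    return fuel_consumed_g / water_l


def thermal_efficiency(m_w_i: float, t_i: float, t_e: float, m_w_evap: float,
                       m_f: float, h_f: float,
                       c_p_w: float = 4.186, h_l: float = 2257.0) -> float:
    """Eq. (3): useful heat (sensible + latent) divided by fuel energy, as a fraction.

    Masses in kg, temperatures in deg C, h_f and h_l in kJ/kg, c_p_w in kJ/kg deg C.
    """
    useful_heat = m_w_i * c_p_w * (t_e - t_i) + m_w_evap * h_l
    return useful_heat / (m_f * h_f)


# Hypothetical test run (not measured data from the paper).
h_f_s4 = 6447 * KCAL_TO_KJ  # calorific value of sample S4 from Table 3, in kJ/kg
eta = thermal_efficiency(m_w_i=1.0, t_i=25.0, t_e=100.0, m_w_evap=0.1,
                         m_f=0.150, h_f=h_f_s4)
print(f"Burning rate: {burning_rate(150.0, 30.0):.1f} g/min")
print(f"Specific fuel consumption: {specific_fuel_consumption(150.0, 1.0):.0f} g/liter")
print(f"Efficiency: {100.0 * eta:.1f} %")
```

Keeping the calorific value in kJ/kg matters here: Table 3 reports kcal/kg, while the symbol list for Eq. (3) specifies kJ/kg, so the conversion has to be made before the efficiency is computed.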
On the other hand, briquette sample S1 has the lowest cooking efficiency (29%). It burns slowly without flame, took the longest time to boil the water, and a large quantity of it (170 g/liter) was needed to boil the water. Because the briquette burns slowly, much of the heat released was lost before the water boiled. The burning rate (how fast the fuel burns) and the calorific value (how much heat is released) are the two factors that together control the water boiling time. This explains why sample S4 was able to boil water and burn faster than the other samples. It also means that calorific value is not the only factor controlling cooking efficiency; burning rate is equally important.

#### **CONCLUSIONS:**

The results of this study show that biomass briquettes bonded with starch are more efficient than biomass briquettes bonded with clay, judged on the following factors: the ability to ignite easily without danger, to generate less smoke, to provide a high calorific value, to generate less ash (which is a nuisance during cooking), and to be strong enough for safe transportation and storage. The technology has great potential for converting waste biomass into a superior fuel for household use in an affordable, efficient and environment-friendly manner.

#### REFERENCES:

- [1] Mande Sanjay, Lata Kusum, "Preparation of charcoal briquettes from field and forest residues", Asia-Pacific Forum for Environment and Development (APFED), 2004, 2-3.
- [2] Font R., Conesa J.A., Molto J., Munoz M., "Kinetics of pyrolysis and combustion of pine needles and cones", J. Anal. Appl. Pyrolysis 85 (2009), 276-286.
- [3] Sugumaran P., Seshadri S., "Biomass charcoal briquetting", Shri AMM Murugappa Chettiar Research Centre, Taramani, Chennai-60011, 1-12.
- [4] Access to Clean Energy, a glimpse of off grid projects in India, www.undp.org.in/sites/default/files/reports publication/ACE.pdf
- [5] Quarterly magazine on biomass energy, published under the UNDP-GEF Biomass Power Project of the Ministry of New and Renewable Energy (MNRE), Government of India. Published by Winrock International India (WII), issue 2, Dec 2009.
- [6] Himalayan Forest Research Institute, Shimla, hfri.icfre.gov.in
- [7] P.D. Grover & S.K. Mishra, "Biomass briquetting technology and practices", Regional Wood Energy Development Programme in Asia GCP/RAS/154/NET, Field Document No. 46, 1-43.
- [8] Sugumaran P., Seshadri S., "Evaluation of selected biomass for charcoal production", Journal of Science & Industrial Research, Vol. 68, August 2009, 719-723.
- [9] Kim H., Kazuhiko and Masayoshi S., "Bio-coal briquette as a technology for desulphurizing and energy saving", in T. Yamada ed., Chapter 34, 2001, 33-75.
- [10] Onuegbu T.U., Ekpunobi U.E., Ogbu I.M., Ekeoma & Obumselu F.O., "Comparative studies of ignition time and water boiling test of coal and biomass briquettes blend", IJRAS 7 (2), vol. 17 issue 2, May 2011, 153-159.
- [11] Bogale Wondwossen, "Preparation of charcoal using agricultural wastes", Ethiop. J. Educ. & Sc., vol. 5 no. 1, September 2009, 82.
- [12] ASTM standard E711-87, Standard test method for gross calorific value of refuse-derived fuel by the bomb calorimeter. Annual Book of ASTM Standards, 11.04. ASTM International, http://www.astm.info/standard/E711.htm, 2004.
- [13] Eriksson, S. and M. Prior, "The Briquetting of Agricultural Wastes for Fuel", F.A.O. Publication, 1990.
- [14] Sotannde O.A., Oluyege A.O., Abah G.B., "Physical and combustion properties of charcoal briquettes from neem wood residues", Int. Agrophysics, 2010, 24, 189-194.
#### **Energy Reduction Techniques in MANET**

#### Mrunali. S. Sonwalkar & R. S. Havinal

Department of Computer Science & Engineering, M.B.E.S. College of Engineering, Ambajogai, India

E-mail: mrunali.sonwalkar@gmail.com

Abstract - A mobile ad hoc network (MANET) is a collection of mobile nodes that can be set up anywhere without any infrastructure. Because the mobile nodes are equipped with energy-limited batteries, an important issue in such a network is to minimize the total power consumption of each operation. During data transmission, the energy associated with every node should be managed properly. In a MANET each node acts as a store-and-forward station for routing packets. When two nodes want to communicate, they can do so directly if they are within radio range of each other, or they can route their packets through other nodes. As the nodes are highly dynamic, maintaining routes becomes a greater challenge. In a MANET, energy is consumed unnecessarily through node overhearing, dynamic topology, retransmission due to link failure, unpredictable link properties, reconstruction of paths between nodes, and so on. In order to reduce energy consumption, a packet must be advertised before it is actually transmitted. Multicasting is one of the fundamental mechanisms and is typically implemented by creating a multicast tree. Multicasting also involves an all-to-all multicast session consisting of a set of terminal nodes in an ad hoc network. Due to limited battery power and transmission bandwidth in wireless ad hoc networks, it is essential to develop efficient multicast protocols that are optimized for energy consumption and that significantly improve the network. Multicasting is achieved by forming a minimum spanning tree between the source nodes and the other mobile nodes in the network; data is then transmitted over this minimum path.
In this paper we analyze the work done by different authors to resolve the minimum-energy problem using different approaches.

Keywords - MANET, energy conservation, multicasting, minimum spanning tree.

#### I. INTRODUCTION

A mobile ad hoc network is a collection of mobile nodes equipped with wireless communication devices. These devices are connected by wireless links without any central infrastructure. In Latin, ad hoc means 'for this purpose only', which correctly states the meaning of mobile ad hoc networks. Ad hoc networks are a key step in the evolution of wireless networks. They introduced a new art of network establishment and are well suited to environments where either the infrastructure is lost or establishing an infrastructure is not cost effective. Each device in a MANET is free to move independently in any direction and therefore changes its links to other devices frequently. Each device must forward traffic unrelated to its own use and therefore acts as a router. The primary challenge in building a MANET is equipping each device to continuously maintain the information required to route traffic properly. Such networks may operate by themselves or may be connected to the larger Internet. Each mobile node is powered by a limited-energy battery, and it is usually impossible to recharge or replace the batteries during a mission. Moreover, the set of network links between the mobile nodes and their capacities is not predetermined, because it depends on factors such as the distance between nodes, transmission power, hardware implementation and environmental noise. Communication between two mobile nodes can be either a single-hop transmission, in which case the two nodes are within transmission range of each other, or a multi-hop transmission, in which the message is relayed by intermediate mobile nodes.
It is well known that wireless communications consume significant amounts of battery power; therefore, the limited battery lifetime imposes a severe constraint on network performance. Energy conservation in such a network is thus of paramount importance, and energy-efficient operation is critical to prolonging the lifetime of the network. Energy conservation techniques for ad hoc networks can be broadly classified into two categories: power mode control and transmission power control. A power mode control protocol aims to put wireless nodes into a periodic sleep state in order to reduce the power consumed in the idle listening mode. Transmission power control manages energy consumption by adjusting transmission ranges during actual transmission [2].

The fundamental problem in mobile ad hoc networks is energy management, and one of the solutions to this problem is multicast. Multicasting plays a crucial role in MANETs in supporting a number of applications. It is an efficient mechanism for one-to-many communication and is typically implemented by creating a multicast tree. It involves the transmission of a datagram to a group of zero or more hosts identified by a single destination address, and so is intended for group-oriented computing. A multicast datagram is delivered to all members of its destination host group with the same reliability as regular unicast IP datagrams; that is, the datagram is not guaranteed to arrive intact at all members of the group, or in the same order relative to other datagrams. The use of multicasting within MANETs has many benefits. It can reduce the cost of communication and improve the efficiency of the wireless channel when sending multiple copies of the same data by exploiting the inherent broadcasting properties of wireless transmission.
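The saving that comes from the broadcast nature of the wireless medium can be illustrated with a small Python sketch. It assumes a simple path-loss model in which the power needed to reach a node at distance d is proportional to d raised to an exponent alpha; the exponent, the constant k and the distances are hypothetical, since the text does not specify a propagation model. Under that assumption, one transmission at the power needed for the farthest neighbour also reaches every nearer neighbour, whereas separate unicasts pay for each neighbour individually.

```python
def tx_power(distance: float, alpha: float = 2.0, k: float = 1.0) -> float:
    """Power needed to reach a receiver at the given distance (assumed model k * d**alpha)."""
    return k * distance ** alpha


def unicast_cost(distances: list[float], alpha: float = 2.0) -> float:
    """Total power when the same packet is sent separately to every neighbour."""
    return sum(tx_power(d, alpha) for d in distances)


def broadcast_cost(distances: list[float], alpha: float = 2.0) -> float:
    """Power of a single transmission that reaches the farthest neighbour, and so all others."""
    return tx_power(max(distances), alpha) if distances else 0.0


neighbours = [10.0, 20.0, 30.0]  # hypothetical neighbour distances in metres
print(f"Separate unicasts: {unicast_cost(neighbours):.0f} power units")
print(f"Single broadcast:  {broadcast_cost(neighbours):.0f} power units")
```

Energy-aware multicast tree construction builds on this effect: the cost of a transmitting node is set by its farthest child in the tree rather than by the number of children.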
Due to limited battery power and transmission bandwidth in wireless ad hoc networks, it is essential to develop efficient multicast protocols that are optimized for energy consumption and that significantly improve the network. Multicasting also involves an all-to-all multicast session consisting of a set of terminal nodes in an ad hoc network, where the transmission power of each node is either fixed or adjustable. The network nodes that may generate a multicast packet to be distributed to a multicast group are referred to as source nodes. In addition, multicasting provides a simple yet robust communication method in which a receiver's individual address remains unknown to the transmitter or can be changed in a manner transparent to the transmitter [1]. Multicasting is achieved by forming a minimum spanning tree between these source nodes and the other mobile nodes in the network; data is then transmitted over this minimum path, which in turn manages the energy associated with the nodes. Fig 3 shows the construction of a minimum spanning tree with multicasting [4].

#### A. Layer Of Implementation

Several multicast routing protocols for MANETs have been proposed. These protocols are based on different design principles and have different operational features when applied to the multicast problem. MANET multicast routing protocols can be classified into various categories. We propose to classify the existing multicast protocols into three categories according to their layer of operation, namely the network layer, the application layer, and the MAC layer [1]. Each of these layers can perform specific functions in support of multicast communication.
The network layer is responsible for routing data between a source-destination pair (end-to-end), while the MAC layer is responsible for ensuring that the data are correctly delivered to the destination (reliability), which requires the application layer to buffer data locally until acknowledgments (ACKs) have been received. However, it is the responsibility of the MAC layer to support rate-adaptive multicasting.

#### B. Network Layer Multicasting

MANET multicasting has received a great deal of attention in terms of designing efficient protocols at the network (IP) layer. Protocols in this layer require the cooperation of all the nodes of the network. They also require forwarder (intermediate) nodes to maintain per-group state. The network (IP) layer implements minimal functionality, namely a unicast datagram service, while the network implements multicast overlay functionalities such as dynamic membership maintenance, packet duplication, and multicast routing [1].

#### II. RELATED WORK

A major concern in mobile ad hoc networks (MANETs) is energy conservation, owing to the limited lifetime of batteries. A great deal of effort has been devoted to developing energy-aware network protocols. V. Ramesh et al. [2] proposed a communication mechanism called RandomCast, via which a sender can specify the desired level of overhearing, striking a prudent balance between energy and routing performance. RandomCast also provides a message forwarding mechanism that takes into account both energy and overall network performance. In RandomCast, a node may decide not to overhear (a unicast message) and not to forward (a broadcast message) when it receives an advertisement during an ATIM window, thereby reducing the energy cost without deteriorating the network performance. In addition, it reduces redundant rebroadcasts of a broadcast packet and thus saves more energy.
RandomCast is highly energy-efficient compared to conventional 802.11 schemes in terms of total energy consumption, energy goodput, and energy balance.

Mobile ad hoc networking involves peer-to-peer communication in a network with a dynamically changing topology. Weifa Liang [3] considered a symmetric wireless ad hoc network. In this network, an approximate minimum-energy multicast tree is devised by approximation algorithms. The first approximation algorithm is analyzed with an approximation ratio of $4 \ln K$, where $K$ is the number of destination nodes in a multicast session. The algorithm analyses the power $p_{w_i}$ at every node $v_i$. The range of the battery power at node $v_i$ is then partitioned into a number of power intervals, each of which corresponds to a power level. It assumes that the power of each node lies in an interval of the form $2^{p_i} < p_{w_i} < 2^{p_i+1}$. Based on these power intervals, the approximate solution computes a value for which the power associated with every node is within a constant factor of the optimum.

Another solution, proposed in [4] by Weifa Liang et al., uses a minimum-energy multicasting approach based on a one-to-many communication mechanism. That is, an energy-efficient multicast tree is rooted at each terminal node. One node acts as a source node from which a single multicast tree is constructed, and this tree is shared by the other nodes to multicast their messages to the remaining terminal nodes. To implement this approach, a wireless ad hoc network can be modeled by an undirected graph $M = (N, A)$, where $N$ is the set of homogeneous stationary nodes and $A$ is the set of links. There is a link $(u, v) \in A$ if nodes $u$ and $v$ are within the transmission range of each other, in which case $u$ and $v$ are neighboring nodes.
Although the network topology is allowed to change due to node mobility, it remains stable during the period of actual data transmission. The approach also assumes that each node is equipped with an omnidirectional antenna and powered by an energy-limited battery. It is based on two transmission models: in one, each node has only one fixed, identical transmission power $t_e$; in the other, each node can adjust its transmission power dynamically. It also considers an all-to-all multicast session with a terminal set $D$, such that each terminal node $v \in D \subseteq N$ has a message of length $l_v$ to share with the others in $D$. The minimum-energy all-to-all multicasting problem is to construct a shared multicast tree spanning the nodes in $D$ such that the total energy consumption of realizing the all-to-all multicast session using the tree is minimized.

In an ad hoc network the nodes are mobile and the topology changes dynamically, which makes it difficult to build a minimum spanning tree. Yongwook Choi, Maleq Khan, et al. [5] addressed the minimum spanning tree (MST) problem, which is an important primitive in many applications in wireless networks, e.g., broadcasting, data aggregation and topology control. Data aggregation paradigms commonly use trees to schedule the transmission of data from all nodes in the graph at a source. Minimum-cost spanning trees help by optimizing the energy usage in this process. Various topology control algorithms also use MSTs to construct well-connected subgraphs by finding an optimum path. The authors addressed the MST problem assuming a random distribution of nodes in the Euclidean plane, and use an energy-efficient distributed algorithm for the Euclidean MST problem. This algorithm characterizes the energy complexity by considering its lower bound and upper bound. It then uses the lower bound to construct the MST when coordinate information is not known; in this case the MST construction is non-trivial.
With coordinate information, the MST construction is trivial. The work also studies distributed approximation algorithms for the MST that give better energy complexity by using some additional information about the coordinates of the nodes. It also uses a constant-energy algorithm that gives a constant-factor approximation to the MST. It then constructs an energy model that focuses on minimizing transmission energy by considering energy complexity.

Hassan Artail and Khaleel Mershad [6] introduced a message forwarding algorithm for search applications within mobile ad hoc networks that is based on the concept of selecting the nearest node from a set of designated nodes. This algorithm is called Minimum Distance Packet Forwarding (MDPF). The algorithm generates routing information to select the node with the minimum distance. The goal of the proposed algorithm is to minimize the average number of hops taken to reach the node that holds the desired data. Thus the algorithm helps to construct a minimum spanning tree by considering significant nodes. The numerical analysis and experimental evaluation of the algorithm also help to derive the lower and upper bounds of the interval for the hop count, and to determine the mean hop count between the source node of the data request and the node that holds the desired data. In the experimental evaluation, the performance of MDPF was compared with Random Packet Forwarding (RPF) and Minimal Spanning Tree Forwarding (MSTF). Based on the results of the numerical analysis, the authors stated that MDPF offers significant hop count savings and smaller delays when compared to RPF and MSTF.

The majority of distributed algorithms for constructing a Minimum Spanning Tree (MST) require a relatively large number of messages and a relatively long time, and are fairly involved, making them impractical for resource-constrained networks such as wireless sensor networks.
In such networks, a sensor has very limited power, and any algorithm needs to be simple, local, and energy efficient. Motivated by these considerations, Maleq Khan and Gopal Pandurangan [7] proposed a class of simple and local distributed algorithms called Nearest Neighbor Tree (NNT) algorithms for energy-efficient construction of an approximate MST in wireless networks. The work is a detailed theoretical and experimental study of the NNT algorithms in the context of wireless ad hoc and sensor networks. It first presents NNT algorithms for the complete graph model, where the maximum transmission range of the nodes is large enough that any pair of nodes can communicate directly with each other. Depending on how the ranks of the nodes are chosen, it yields two NNT algorithms: Random-NNT (ranks are chosen randomly) and Coordinate-NNT (Co-NNT for short; ranks are based on the coordinates of the nodes). For multihop wireless networks modeled by a unit disk graph (UDG), it presents another NNT algorithm, referred to as UDG-NNT. Given the simple and local nature of this construction, it generates trees with reasonable properties. It also shows that NNTs have some properties that can make them attractive for ad hoc networks. The main results derived for this algorithm are: (i) the tree produced by such an algorithm, called the NNT, has low cost; (ii) the NNT paradigm can be used to design a simple dynamic algorithm for maintaining a low-cost spanning tree; and (iii) the time, message and work complexities of the NNT algorithms are close to optimal in several settings.

#### III. PRELIMINARIES

We consider a wireless ad hoc network model in which an all-to-all multicast session is created that consists of a set of terminal nodes. The transmission power of each node is either fixed or adjustable.
Assuming that each terminal node has a message to share with every other terminal node, the task is to build a shared multicast tree spanning all terminal nodes such that the total energy consumption of realizing the all-to-all multicast session by the tree is minimized.

Fig 2[1] Multicast data forwarding

This can be achieved by devising approximation algorithms with guaranteed approximation ratios. The approximation algorithm can then be simplified by providing a distributed implementation of the algorithm. Finally, we conduct experiments by simulation to evaluate the performance of the proposed algorithm [4].

#### A. Wireless Communication Model

A wireless ad hoc network can be modeled by an undirected graph $M = (N, A)$, where $N$ is the set of homogeneous stationary nodes and $A$ is the set of links, with $n = |N|$ and $m = |A|$. Each multicast request is a pair $(s, D)$, where $s$ is the source node and $D$ is the set of destination nodes. There is an edge $(u, v) \in A$ if nodes $u$ and $v$ are within the transmission range of each other. For any edge $(u, v) \in A$, its two endpoints $u$ and $v$ are called neighboring nodes.
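The graph model above can be illustrated with a minimal Python sketch. It assumes, purely for illustration, that nodes lie in a 2-D plane and share one transmission range; the names `build_links`, `neighbors`, `positions`, and `tx_range` are hypothetical and do not come from [4].

```python
import math
from itertools import combinations

def build_links(positions, tx_range):
    """Return the link set A as pairs (u, v) with u < v.

    Two nodes are linked when their Euclidean distance does not exceed
    the shared transmission range (an illustrative assumption).
    """
    links = set()
    for u, v in combinations(sorted(positions), 2):
        if math.dist(positions[u], positions[v]) <= tx_range:
            links.add((u, v))
    return links

def neighbors(links, node):
    """Return the neighboring nodes of `node` in the undirected graph."""
    return {v if u == node else u for u, v in links if node in (u, v)}

# Hypothetical node coordinates; node ids are arbitrary integers.
positions = {0: (0.0, 0.0), 1: (3.0, 4.0), 2: (10.0, 0.0)}
A = build_links(positions, tx_range=6.0)
n, m = len(positions), len(A)
```

In this toy instance only nodes 0 and 1 are within range of each other, so $n = 3$ and $m = 1$.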
The common notations used throughout the paper are as follows.

| Notation | Description |
|-----------------|-----------------------------------------------------------------------------------------|
| $M(N,A)$ | The ad hoc network with node set $N$ and link set $A$ |
| $G(V,E,\gamma)$ | Communication graph from $M(N,A)$, $V = N$, $E = A$, and $\gamma : E \rightarrow \mathbb{R}^{+}$ |
| $n$ | Number of nodes in $M(N,A)$, $n = \lvert N \rvert$ |
| $m$ | Number of links in $M(N,A)$, $m = \lvert A \rvert$ |
| $t_e$ | Fixed transmission power at each node in $M(N,A)$ |
| $r_e$ | Reception power at each node in $M(N,A)$ |
| $d_{u,v}$ | Distance between nodes $u$ and $v$ |
| $D$ | Terminal set, $D \subseteq N$ |
| $k$ | Number of terminal nodes, $k = \lvert D \rvert$ |
| $l_v$ | Length of the message originated at node $v \in D$ |
| $T$ | A multicast tree in $M$ spanning the nodes in $D$ |
| $N_T$ | Set of nodes in tree $T$ |
| EG(T) | Set of links (or edges) in tree $T$ |
| $T_{opt}$ | Optimal multicast tree for an all-to-all multicast session in $M(N,A)$ |
| $T_{app}$ | An approximate, minimum edge-weighted Steiner tree in $G(V,E,\gamma)$ |
| $T_v$ | A shortest path tree in $G(V,E,\gamma)$ rooted at $v$ |
| $E(T)$ | Total amount of transmission power of the nodes in $T$ |
| $T_D$ | The minimum energy transmission multicast tree in $M(N,A)$ |

Table 1 [4] Common Notations

We assume that the network topology is stable during the processing period of a multicast request, where processing a multicast request means that the system either builds a multicast tree for the request and realizes the request using that tree, or rejects the request if there are not enough network resources to accommodate it. After the system has finished processing the current multicast request and before it responds to the next one, the nodes in the network are allowed to move and a new network topology is formed.
Each node in the network is equipped with an omnidirectional antenna, and the transmission power at the node is finitely or infinitely adjustable. Each node can choose one of its power levels to transmit messages. In other words, we assume that there are $l_i$ power levels at node $v_i \in N$. Let $w_i$ be the power of $v_i$ at its power level 1. Among the $l_i$ power levels, one is the minimum operational power level with power $p_{min}(v_i)$ and another is the maximum operational power level with power $p_{max}(v_i)$. Furthermore, given two neighboring nodes $u$ and $v$, there is always a corresponding power level between $u$ and $v$ with the same amount of power, which we refer to as the power level symmetry of neighboring nodes. The amount of power needed to maintain the power level symmetry between $u$ and $v$ is the minimum power required to keep them within the transmission range of each other.

For a transmission in the network from node $u$ to node $v$, separated by a distance $d_{u,v}$, to guarantee that $v$ is within the transmission range of $u$, the transmission power at $u$ is modeled to be proportional to $d^{\alpha}_{u,v}$, where the proportionality constant is assumed to be 1 for notational simplicity and $\alpha$ is a parameter that typically takes a value between 2 and 4, depending on the characteristics of the communication medium. The reachability of a node in a wireless ad hoc network is fully determined by the transmission power at the node. It is assumed that the power level of a transmitting node can be chosen within a given range of values. Therefore, there is a trade-off between reaching more nodes in a single hop by using higher power and reaching fewer nodes in a single hop by using lower power. Note that nodes in any particular multicast tree do not necessarily have to use the same power level, and a node may use different power levels for the various multicast trees in which it participates.
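The trade-off described above can be made concrete with a small sketch of the power model, taking the proportionality constant to be 1 as in the text. The function names and the example distances are illustrative assumptions, and reception power is ignored.

```python
def tx_power(distance, alpha=2.0):
    """Transmission power needed to reach a node at `distance`.

    Models power as distance**alpha with proportionality constant 1;
    alpha typically lies between 2 and 4.
    """
    return distance ** alpha

def path_power(hop_distances, alpha=2.0):
    """Total transmission power over a sequence of hops."""
    return sum(tx_power(d, alpha) for d in hop_distances)

# Hypothetical example: one 10-unit hop versus two 5-unit hops via a relay.
direct = path_power([10.0], alpha=2.0)
relayed = path_power([5.0, 5.0], alpha=2.0)
```

With $\alpha = 2$ the single long hop costs 100 units of power while the two short hops cost 50 in total, and the gap widens as $\alpha$ grows. This is why lower power levels with more hops can save transmission energy, at the price of reaching fewer nodes per transmission.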
A wireless ad hoc network that meets the above requirements is called a symmetric wireless ad hoc network. A special case of the symmetric wireless ad hoc network is a network in which every mobile node is equipped with the same type of battery [3].

#### B. The Minimum-Energy All-To-All Multicasting Problem

Given a wireless ad hoc network $M(N,A)$ and a multicast request consisting of a source node $s$ and a destination set $D$, the minimum-energy multicast tree problem is to construct a multicast tree rooted at the source node and spanning the nodes in $D$ such that the sum of the transmission power at non-leaf nodes is minimized. The problem depends on the choice of transmission nodes as well as on the transmission power level at every chosen transmission node. Note that the leaf nodes do not contribute any transmission power consumption because they do not transmit any messages. To construct a multicast session with a minimum spanning tree, an approximation algorithm is used, which can be of two types: (i) for fixed transmission power and (ii) for adjustable transmission power [4].

Fig 2[1] (a) A network (b) Minimum spanning tree (c) Minimum spanning tree with multicast in group 1 (d) Minimum spanning tree with multicast in group 2

#### C. Distributed Implementation

The approximation algorithm is centralized. It may not be applicable in practice, because it is sometimes impossible for each node to have topological knowledge of the entire network. Instead, each node has only local knowledge of its neighboring nodes [4]. For such a distributed environment, the centralized approximation algorithm can be simplified into a distributed implementation, referred to as algorithm Dist\_Implement, as shown in Fig 4. For convenience, we work on the communication graph $G = (V, E, \gamma)$ instead of the wireless network $M(N,A)$.

#### Algorithm Dist\_Implement $(V, E, D, \gamma())$

**begin**

- 1. **for** each node $v \in D$ **do**
- 2. Construct a single-source shortest path tree $T_v$ in $G$ rooted at $v$;
- 3. Prune those branches from $T_v$ that do not contain nodes in $D$, and denote by $T_v$ the resulting tree if no confusion arises.
- 4. Compute the weighted sum of the edges in $T_v$ and store it at $v$.

  **endfor**;
- 5. Find a tree $T_{v_0}$ rooted at $v_0 \in D$ from the $k = |D|$ trees such that the weighted sum of the edges in $T_{v_0}$ is the minimum. Denote $T_{v_0}$ by $T_{app}$. Let $N_{T_{app}}(v)$ be the set of neighboring nodes of $v$ in $T_{app}$.
- 6. Set the power level of each node $v$ in $T_{app}$ by assigning its transmission power to be $\max_{u \in N_{T_{app}}(v)} \{d^2_{u,v}\}$.

**end.**

Fig 1[4]. A distributed algorithm Dist\_Implement.

The total energy consumption of realizing an all-to-all multicast session can be minimized if an exclusive routing tree rooted at each terminal node is used to multicast its message to the other terminal nodes. Since finding such an optimal multicast tree is complicated by node mobility, a shortest path tree rooted at each terminal node and spanning the other terminal nodes is used instead. This is a shortest path algorithm based on multiple multicast trees for realizing all-to-all multicast sessions. The energy consumption of each node is thus considered and then summed to derive the total energy associated with the network [4].

#### IV. CONCLUSION

The goal of this paper is to evaluate the performance of the distributed algorithm when the nodes are identical and fixed. It will compute the total energy consumption of an all-to-all multicast session with the help of the distributed algorithm. The algorithm will also construct a minimum spanning tree rooted at a terminal node and spanning all the other nodes in the network. The total energy consumption in the network is compared with the total number of nodes in the network.
At the end, we will plot the performance metric, total energy consumption, against the percentage of nodes.

#### REFERENCES

- [1] Osamah S. Badarneh and Michel Kadoch, "Multicast Routing Protocols in Mobile Ad Hoc Networks: A Comparative Survey and Taxonomy," EURASIP Journal on Wireless Communications and Networking, Vol. 2009, Article ID 764047, 2009.
- [2] V. Ramesh, P. Subbaiah, N. Sandeep Chaitanya, and K. Sangeetha Supriya, "An efficient energy management scheme for mobile ad-hoc networks," International Journal of Research and Reviews in Computer Science (IJRRCS), Vol. 1, No. 4, December 2010.
- [3] Weifa Liang, "Approximate minimum-energy multicasting in wireless ad hoc networks," IEEE Transactions on Mobile Computing, Vol. 5, No. 4, April 2006.
- [4] Weifa Liang, Richard Brent, Yinlong Xu, and Qingshan Wang, "Minimum-energy all-to-all multicasting in wireless ad hoc networks," IEEE Transactions on Wireless Communications, Vol. 8, No. 10, October 2009.
- [5] Yongwook Choi, Maleq Khan, V.S. Anil Kumar, and Gopal Pandurangan, "Energy-optimal distributed algorithms for minimum spanning trees," IEEE Journal on Selected Areas in Communications, Vol. 27, No. 7, September 2009.
- [6] Hassan Artail and Khaleel Mershad, "MDPF: Minimum Distance Packet Forwarding for Search Applications in Mobile Ad Hoc Networks," IEEE Transactions on Mobile Computing, Vol. 8, No. 10, October 2009.
- [7] Maleq Khan, Gopal Pandurangan, and V.S. Anil Kumar, "Distributed Algorithms for Constructing Approximate Minimum Spanning Trees in Wireless Networks," IEEE Transactions on Parallel and Distributed Systems.

## Performance Analysis between Routing Protocols for Mobile Adhoc Networks

Sk. Daria Saheb<sup>1</sup>, P. Malyadri<sup>2</sup>, N. Srinivasulu<sup>3</sup> & Ch. Balaswamy<sup>4</sup>

<sup>1&2</sup>ECE, Prakasam Engineering College, Kandukur, India

<sup>3&4</sup>ECE, QISCET, Ongole, India

Email: vali.dariya@gmail.com<sup>1</sup>, paduchurim@yahoo.com<sup>2</sup>, srivenkat.srivenkat@gmail.com<sup>3</sup>, ch.balaswamy7@gmail.com<sup>4</sup>

Abstract - Communication plays a very important role in modern life, and the network acts as the heart of a communication system. Among wireless communication networks, ad hoc networks play a dominant role. The main problem of an ad hoc network is route failure. To improve the lifetime of the network, different routing protocols are considered. Routing is the act of moving information from a source to a destination in an internetwork. In present ad hoc routing protocols, a route is selected in the route discovery phase and used until all the packets are sent out. The continuous flow of packets over a selected route leads to route failure. To reduce this problem, we consider PRD-based MMBCR, using a percentage of the optimum value for periodic route discovery. In our research we analyze the performance of different routing protocols such as DSR, MBCR and MMBCR to obtain the optimum value using the Network Simulator software.

Keywords - Ad hoc network, Route discovery phase, optimum value.

#### I. INTRODUCTION

Communication is the activity of conveying information from a source to a destination. Proper communication is maintained with the help of a communication channel, and communication channels are classified into two categories: wired and wireless. With present technologies, considering improved bandwidth capability, higher-frequency signals, and losses, the wireless communication channel is more efficient than the wired communication channel. To maintain proper communication, a communication network is needed.
The network is called the heart of a communication system: just as the heart circulates blood to the different organs of the human body, the network passes information from source to destination. Wireless communication networks include infrared networks, Bluetooth networks, Wi-Fi (wireless fidelity), WiMAX (Worldwide Interoperability for Microwave Access), wireless sensor networks, and ad hoc networks (infrastructure-less networks). We work on lifetime enhancement in mobile ad hoc networks by considering different protocols: DSR (Dynamic Source Routing), MBCR (Minimum Battery Cost Routing), MMBCR (Minimum Maximum Battery Cost Routing), and MMBCR-PRD (Minimum Maximum Battery Cost Routing with Periodic Route Discovery).

The issues related to the protocol layers in ad hoc networks are examined here. The communication between the nodes in a packet data network must be defined to ensure correct interpretation of the packets by receiving intermediate and end systems. The rules governing packet exchange between the nodes are called protocols. Routing involves two things: first, determining optimal routing paths, and second, transferring the information groups (called packets) through an internetwork. Routing protocols use several metrics to calculate the best path for routing packets to their destination. Designing good protocols with few packet collisions reduces power consumption. At the network layer, routing protocols can be designed so that the network lifetime increases by distributing the forwarding load over multiple different paths. The main objective of this paper is to investigate the performance of routing in ad hoc networks. A group of mobile devices, called nodes, that communicate with each other over multi-hop links without any centralized infrastructure is called a mobile ad hoc network (MANET).
A MANET is a collection of self-organized mobile users that are free to act independently and that communicate over relatively bandwidth-constrained wireless links. Since the nodes are mobile, the network topology may change quickly and cannot be predicted over time. Fig. 1.1 shows the flowchart for the working of a general ad hoc network.

Fig. 1.1: Working of general ad-hoc network

#### CLASSIFICATION OF ROUTING PROTOCOLS:

Routing protocols for ad hoc networks can generally be divided into three types: (i) proactive routing protocols, also known as table-driven routing protocols; (ii) reactive routing protocols, also known as on-demand routing protocols; and (iii) hybrid routing protocols.

Every node has a limited life span. To maximize the lifetime of the nodes in a network, the energy consumption must be evenly distributed across the nodes.

Section II describes the theoretical analysis, Section III analyses the existing energy-efficient routing protocols, and Section IV presents the proposed mechanisms to increase the network lifetime. Section V describes the experimental results, and Section VI gives the conclusion.

#### II. THEORITICAL ANALYSIS

For the performance analysis of routing protocols for mobile ad hoc networks, we consider the following routing protocols: DSR (Dynamic Source Routing), MBCR (Minimum Battery Cost Routing), and MMBCR (Minimum Maximum Battery Cost Routing).

The Dynamic Source Routing (DSR) protocol is a simple and efficient routing protocol designed specifically for use in multi-hop wireless ad hoc networks of mobile nodes. Using DSR, the network is completely self-organizing and self-configuring, requiring no existing network infrastructure or administration. Network nodes cooperate to forward packets for each other to allow communication over multiple "hops" between nodes that are out of wireless transmission range of one another.
As nodes in the network move about or join or leave the network, all routing is automatically determined and maintained by the DSR routing protocol. Since the number or sequence of intermediate nodes needed to reach any destination may change at any time, the resulting network topology may be quite rich and rapidly changing. The DSR protocol has very low overhead and is able to react very quickly to changes in the network. It provides highly reactive service in order to help ensure successful delivery of data packets in spite of node movement or other changes in network conditions. The DSR protocol is composed of two main mechanisms that work together to allow the discovery and maintenance of source routes in an ad hoc network:

**Route Discovery**: It is the mechanism by which a node S wishing to send a packet to a destination node D obtains a source route to D. Route Discovery is used only when S attempts to send a packet to D and does not already know a route to D.

**Route Maintenance**: It is the mechanism by which node S is able to detect, while using a source route to D, if the network topology has changed such that it can no longer use its route to D because a link along the route no longer works. When Route Maintenance indicates that a source route is broken, S can attempt to use any other route to D that it happens to know, or it can invoke Route Discovery again to find a new route for subsequent packets to D. Route Maintenance for this route is used only when S is actually sending packets to D.

In DSR, Route Discovery and Route Maintenance each operate entirely "on demand". In particular, unlike other protocols, DSR requires no periodic packets of any kind. For example, DSR does not use any periodic routing advertisement, link status sensing, or neighbor detection packets.
This entirely on-demand behavior and lack of periodic activity allow the number of overhead packets caused by DSR to scale all the way down to zero when all nodes are approximately stationary with respect to each other and all routes needed for current communication have already been discovered. As nodes begin to move more or as communication patterns change, the routing packet overhead of DSR automatically scales to only what is needed to track the routes currently in use. In response to a single Route Discovery, a node may learn and cache multiple routes to any destination. This support for multiple routes allows the reaction to routing changes to be much more rapid, since a node with multiple routes to a destination can try another cached route if the one it has been using should fail. This caching of multiple routes also avoids the overhead of needing to perform a new Route Discovery each time a route in use breaks. The sender of a packet selects and controls the route used for its own packets, which, together with support for multiple routes, also allows features such as load balancing to be defined.

#### 2.1. The minimum battery cost routing (MBCR):

This protocol uses the remaining battery capacity of each host as a metric to describe the lifetime of each host. Let $c_i^t$ be the battery capacity of host $n_i$ at time $t$.

$$f_{i}(c_{i}^{t}) = \frac{1}{c_{i}^{t}} \tag{2.1.1}$$

where $f_i(c_i^t)$ is the battery cost function of host $n_i$. Now, suppose a node's willingness to forward packets is a function of its remaining battery capacity. The less capacity it has, the more reluctant it is. As the battery capacity decreases, the value of the cost function for node $n_i$ increases. The battery cost $R_j$ for route $j$, consisting of $D_j$ nodes, is

$$R_{j} = \sum_{i=0}^{D_{j}-1} f_{i}(c_{i}^{t}) \tag{2.1.2}$$

Therefore, to find a route with the maximum remaining battery capacity, we should select a route $i$ that has the minimum battery cost.
$$R_{i} = \min \left\{ R_{j} \mid j \in A \right\} \tag{2.1.3}$$

where $A$ is the set containing all possible routes.

**Advantage of MBCR**: In minimum total transmission power routing (MTPR), if the minimum total transmission power routes all pass through a specific host, the battery of this host will be exhausted quickly, and the host will soon die of battery exhaustion. Therefore, the remaining battery capacity of each host is a more accurate metric to describe the lifetime of each host. In MBCR, since battery capacity is directly incorporated into the routing protocol, this metric prevents hosts from being overused, thereby increasing their lifetime and the time until the network is partitioned. If all nodes have similar battery capacity, this metric will select a shorter-hop route.

**Disadvantage of MBCR**: Because only the summation of the values of the battery cost functions is considered, a route containing nodes with little remaining battery capacity may still be selected. For example, in Figure 2.12 there are two possible routes between the source and destination nodes. Although node 3 has much less battery capacity than the other nodes, the overall battery cost for route 1 is less than that for route 2. Therefore, route 1 will be selected, reducing the lifetime of node 3, which is undesirable.

#### 2.2 The min-max battery cost routing (MMBCR):

In this protocol, for each possible route from source to destination, the maximum battery cost is first selected using Equation (2.2.1). Among this set of maximum battery costs, the minimum is then selected according to Equation (2.2.2). The battery of each host will be used more fairly than in previous schemes.
The battery cost $R_j$ for route $j$ is redefined as

$$R_{j} = \max_{i \in route_j} f_{i}(c_{i}^{t}) \tag{2.2.1}$$

Similarly, the desired route $i$ can be obtained from the equation

$$R_{i} = \min \left\{ R_{j} \mid j \in A \right\} \tag{2.2.2}$$

**Advantage**: Since this metric always tries to avoid the route containing the node with the least battery capacity among all nodes in all possible routes, the battery of each host will be used more fairly than in previous schemes.

**Disadvantage:** Since the minimum total transmission power is not considered in MMBCR, more power may be consumed to transmit user traffic from a source to a destination, which actually reduces the lifetime of all nodes.

In MMBCR (Min-Max Battery Cost Routing) we first find the node having the minimum battery capacity in each of the possible routes, and then select the route for which this minimum capacity is the largest. That means the route having the maximum lifetime is selected. The main demerit of MMBCR is that it does not consider the transmission powers of the nodes. In addition, MMBCR does not use updated information for route selection. Two mechanisms are proposed to overcome this disadvantage. The first is MMBCR-route reply, where the cost function is calculated in the route reply phase instead of the route request phase when selecting the route. The other is MMBCR with periodic route discovery, which obtains more up-to-date information about the routes. In this method the route discovery process is performed periodically, and if there are any changes in the routes, the route information is updated. With this method, different routes are used for the transmission of data packets, and the periodic shifting between routes avoids the overuse and exhaustion of nodes, leading to an increase in the lifetime of the network.
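The difference between the two selection rules can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: each route is given as the list of nodes whose batteries are counted, the capacities are made-up values, and the function names are hypothetical rather than part of any simulator.

```python
def battery_cost(capacity):
    """Battery cost function f_i(c_i^t) = 1 / c_i^t from Equation (2.1.1)."""
    return 1.0 / capacity

def route_costs(route, capacity):
    """Battery cost of every node on a route."""
    return [battery_cost(capacity[node]) for node in route]

def select_mbcr(routes, capacity):
    """MBCR: pick the route with the minimum summed battery cost."""
    return min(routes, key=lambda name: sum(route_costs(routes[name], capacity)))

def select_mmbcr(routes, capacity):
    """MMBCR: pick the route whose worst (maximum) node cost is smallest."""
    return min(routes, key=lambda name: max(route_costs(routes[name], capacity)))

# Hypothetical remaining capacities; node "b" is nearly exhausted.
capacity = {"a": 10.0, "b": 0.5, "c": 1.25, "d": 1.25, "e": 1.25}
routes = {"route 1": ["a", "b"], "route 2": ["c", "d", "e"]}

mbcr_choice = select_mbcr(routes, capacity)
mmbcr_choice = select_mmbcr(routes, capacity)
```

In this toy case MBCR chooses route 1 because its summed cost is lower, even though it passes through the weak node "b", while MMBCR chooses route 2 because its worst node is healthier. This mirrors the disadvantage of MBCR and the advantage of MMBCR described above.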
#### 2.3 Performance Metrics:

The following performance metrics are evaluated:

**Packet delivery ratio**: The ratio of the data packets delivered to the destinations to those generated by the CBR sources. The numbers of received and sent packets can be obtained from the first element of each line of the trace file.

packet delivery ratio (%) = (received packets / sent packets) \* 100

**Average end-to-end delay**: This includes all possible delays caused by buffering during route discovery latency, queuing at the interface queue, retransmission delays at the MAC, and propagation and transfer times. For each packet with id (Ii) of trace level (AGT) and type (cbr), the send (s) time (t) and the receive (r) time (t) are extracted from the trace; the delay of the packet is the difference between the two, and the delays are averaged over all packets.

**Routing overhead**: The ratio of the routing packets sent to the total packets sent. Each hop-wise transmission of a routing packet is counted as one transmission.

Routing overhead = routing packets sent / total packets sent

#### 2.4 Experiment Environment:

*Hardware*: Laptop with Intel Celeron M 370 processor, 256 MB memory.

*Operating System*: Red Hat 4, Windows XP

**Network Simulator**: ns-2, version 2.30 with CMU MANET extension.

*Graph generator*: gnuplot 4.20

#### III. EXISTING ROUTING MECHANISMS: ROUTE SELECTION BY DSR, MBCR AND MMBCR

#### 3.1 ROUTE SELECTION BY DSR

Consider the 7-node network shown in Fig. 3.1. The route discovery process is started at 38 sec to find a route from node 0 to node 4. Since DSR considers only the minimum hop count and not the energies of the nodes, the route 0-5-4 is selected, and the data packets move from node 0 to node 5 and from node 5 to node 4, as shown in Fig. 3.1. Even though node 5 (marked by the red circle) has little energy left, DSR ignores this and selects 0-5-4 because it is the shortest route. This is the disadvantage of DSR: the route does not last long, since the energy of node 5 is exhausted quickly.

Fig. 3.1 A snapshot showing the route 0-5-4 is selected by DSR

#### 3.2 ROUTE SELECTION BY MBCR

Consider the same network, shown in Fig. 3.2, where the route discovery process is started at 38 sec by the MBCR protocol. At this time, according to the trace file generated by the NS2 simulator for TCL script 1, the energy levels of the nodes are Node 0: 1.288755, Node 1: 1.237033, Node 2: 1.239923, Node 3: 1.250182, Node 4: 1.290121, Node 5: 0.096358 and Node 6: 1.288979. The corresponding cost functions are Node 0: 0.776266, Node 1: 0.808429, Node 2: 0.806545, Node 3: 0.799978, Node 5: 10.385178 and Node 6: 0.775945. The total cost function along the route 0-2-3-4 is 2.382789, along the route 0-5-4 it is 11.16144, and along the route 0-1-6-4 it is 2.360640. The route 0-1-6-4 therefore has the minimum total cost, meaning that it has more battery capacity than the route 0-2-3-4. Since MBCR selects the route with the minimum total cost (maximum battery capacity), the route 0-1-6-4 is selected, as shown in Fig. 3.2.

Fig. 3.2 A snapshot showing the route 0-1-6-4 is selected by MBCR

#### 3.3 ROUTE SELECTION BY MMBCR

Consider the same network, shown in Fig. 3.3, where the route discovery process is started at 38 sec by the MMBCR protocol. At this time, according to the trace file generated by the NS2 simulator for the same TCL script 1, the energy levels of the nodes are Node 0: 1.288755, Node 1: 1.237033, Node 2: 1.239923, Node 3: 1.250182, Node 4: 1.290121, Node 5: 0.096358 and Node 6: 1.288979. The corresponding cost functions are Node 0: 0.776266, Node 1: 0.808429, Node 2: 0.806545, Node 3: 0.800080, Node 5: 10.385178 and Node 6: 0.775945. MMBCR determines and stores the maximum battery cost (minimum battery capacity) on each route.
The maximum cost function on the route 0-2-3-4 is therefore 0.806545, on the route 0-5-4 it is 10.385178, and on the route 0-1-6-4 it is 0.808429. Since MMBCR selects the route with the minimum stored cost function (maximum battery capacity) among all routes, the route 0-2-3-4 is selected, as shown in Fig. 3.3. The advantage of MMBCR is that it avoids the route containing the node with the least battery capacity, which would otherwise be exhausted quickly.

Fig. 3.3 A snapshot showing the route 0-2-3-4 is selected by MMBCR

#### IV. PROPOSED ROUTING MECHANISMS

Two mechanisms are considered: 1) MMBCR-Route reply and 2) PRD-based MMBCR.

In MMBCR, the cost function is calculated and stored in the route request packet header while the packet travels from source to destination. The route selection decision is made by the destination, which waits for some time to collect all the RREQs. After the decision is made, the RREP takes some time to reach the source. In the meantime the energies of the nodes in the network may change, and this updated information is not considered by MMBCR in route selection. The problem can be overcome by calculating the cost function in the RREP instead of the RREQ.

We first consider MMBCR-Route reply. Here the route selection decision is made by the source node, and the destination node simply replies to all the RREQs that reach it. The source node initiates the route request (RREQ). The intermediate nodes simply forward the RREQ packet. The destination node receives the RREQ packet for each route and responds immediately, without any delay. As the RREP travels back, each intermediate node calculates its cost function and records the value in the RREP packet. The calculation is the same as the one used in the RREQ phase of MMBCR.
The source node waits for some time to receive all the RREP packets, selects the route with the maximum lifetime, and sends the data packets through it. The main advantage of this method is that up-to-date information about the nodes is known, so the best route is selected for the packets. A further advantage is that the source node receives all the possible routes to the destination and stores them in its cache for future use. This is not the case in MMBCR, where the source receives only one route from the destination.

We now consider PRD-based MMBCR. The problem with the existing MMBCR protocol is that, once a route is selected in the route discovery phase, it is used until all the data packets are sent or until the route fails because a node's battery is exhausted. If any node on the selected route has little energy, that node will certainly die, causing route failure and shortening the total network lifetime. Moreover, the nodes on the selected route are loaded continuously, because they also carry packets coming from other nodes. To reduce this problem, a periodic route discovery mechanism is introduced, in which the route discovery process is initiated periodically to increase the network lifetime.

#### V. EXPERIMENTAL RESULTS

First, the algorithms implementing the routing protocols and the proposed mechanisms were developed using the NS-2 Network Simulator in a Red Hat Linux-9 environment.

Fig 4.1 A snapshot showing the route discovery process of nodes from 0 to 6

Fig 4.2 A snapshot showing the PRD process from 0-4-5-6 nodes

Fig 4.3 A snapshot showing the PRD process from nodes 0-1-2-3-6

Fig 4.4 Node failure times for periodic route discovery MMBCR

In Fig. 4.1 we can observe that the route discovery process is carried out before the packets are sent from the source to the destination.
The network has nodes 0 to 6, where node 0 is the source and node 6 is the destination. The route is selected by considering the cost functions and energy levels of the nodes. The cost functions are Node 0: 0.768620, Node 1: 0.983654, Node 2: 0.807313, Node 3: 0.931826, Node 4: 1.611672 and Node 5: 0.820880.

In Fig. 4.2 we can observe the periodic selection of the route, where the route with the higher battery capacity is selected. Packets are routed along 0-4-5-6, whose cost function is 1.611672. The battery capacity of node 4 decreases because of the continuous flow of packets from source to destination, which may lead to exhaustion and finally to node failure and network failure. Since the MMBCR protocol avoids the node with the maximum battery cost, the route 0-4-5-6 is avoided and the next route is selected at the following periodic discovery, without any delay.

In Fig. 4.3 we can observe that the route 0-1-2-3-6 is selected. MMBCR selects this route because its nodes have higher battery capacity; its cost function is 0.983654. The selection is made in the same way as for the route 0-4-5-6. As the battery capacity of this route is high compared with the other route, it is selected and will live longer than the other routes.

Finally, in the proposed PRD-based MMBCR protocol the route discovery period $\lambda$ is defined as the number of packets sent before route discovery is reinitialized to find a new route whose battery energy is higher than that of the other possible routes. The graph plots node failure time against $\lambda$. The results are first observed for $\lambda$ equal to 10, where route discovery is reinitialized every 10 packets. As the value of $\lambda$ is increased to 50, the node failure time also increases.
This is because a single route is overused to forward a larger number of packets, which in turn decreases the battery capacity.

Fig. 4.5 Comparison of the route failure times of DSR, MBCR, MMBCR and PRD-based MMBCR

#### VI. CONCLUSION

In this paper, two mechanisms are proposed to increase the network lifetime. The first is MMBCR-Route reply, where the cost functions are calculated in the route reply phase instead of the route request phase. The second is MMBCR with periodic route discovery, in which the route information is updated whenever the routes change. Different routes are then used for the transmission of the data packets, and the periodic shifting between routes avoids the overuse and exhaustion of nodes, which increases the network lifetime. The simulation results show that the proposed PRD-based MMBCR mechanism performs better in terms of node failure time, and the optimum period is investigated for PRD-based MMBCR to obtain a higher node failure time.

#### REFERENCES

- [1] Mohammed Tarique, Kemal E. Tepe, and Mohammad Naserian, "Energy Saving Dynamic Source Routing for Ad Hoc Wireless Networks", Int. Proc. of WIOPT, 2005.
- [2] S. Singh and C.S. Raghavendra, "PAMAS-Power Aware Multi-Access Protocol with signaling for Ad Hoc Networks", ACM Commun. Rev., July 1998.
- [3] S. Singh, M. Woo, and C.S. Raghavendra, "Power Aware routing in Mobile Ad Hoc Networks", Proc. Mobicom '98, Dallas, TX, Oct 1998.
- [4] C.K. Toh, "Maximum Battery Life Routing to support ubiquitous Mobile computing in Wireless Ad Hoc Networks", IEEE Communications Magazine, June 2001.
- [5] W. Cho and S.L. Kim, "A fully distributed routing algorithm for maximizing lifetime of a wireless ad hoc network", 4th Int. Workshop on Mobile and Wireless Communications Network, Sep 2002, pp. 670-674.
- [6] The Network Simulator NS-2, http://www.isi.edu/nsnam/ns/
- [7] http://www.isi.edu/nsnam/ns/tutorial

# **Enhancement of Stability of Power System with Distributed Static Series Compensator**

#### Amaranatha Reddy. G & I. Mahesh

Dept. of Electrical & Electronics Engineering, Madanapalle Institute of Technology & Science, Madanapalle, Andhra Pradesh, India. E-mail: amarnath.rainbow@gmail.com

Abstract - Long distance AC transmission systems are often subject to stability problems which limit the transmission capability. Large power systems often suffer from weakly damped swings between synchronous generators. This paper aims to enhance the transient stability of the power system with the use of the distributed static series compensator (DSSC). First, a detailed simulation model of the DSSC is presented. The DSSC functions like a static synchronous series compensator (SSSC) but is smaller and cheaper and offers additional capabilities. DSSC modules are placed along transmission lines in a distributed fashion. Flexible AC transmission system (FACTS) devices can control power flow in the transmission system to improve asset utilization, relieve congestion, and limit loop flows, but high costs and reliability concerns have restricted their use in these applications. The concept of distributed FACTS (D-FACTS) is introduced as a way to remove these barriers. A new device, the distributed static series compensator (DSSC), attaches directly to existing HV or EHV conductors and so does not require HV insulation. It can be manufactured at low cost from conventional industrial-grade components. The DSSC modules are distributed, a few per conductor mile, to achieve the desired power flow control functionality by effectively changing the line reactance. Experimental results from a prototype module are presented, along with examples of the benefits deriving from a system of DSSC devices.
**Keywords** - Distributed flexible ac transmission system (D-FACTS), series compensation, distributed static series compensator (DSSC), transient stability enhancement.

#### I. INTRODUCTION

A power system must be modeled as a nonlinear system for large disturbances. Although power system stability may be broadly defined according to operating conditions, the problem most frequently considered is transient stability. This kind of stability is mainly concerned with the maintenance of synchronism between generators following a severe disturbance [1]. The development of power electronics has introduced the use of flexible alternating current transmission system (FACTS) controllers in power systems. FACTS controllers are flexible and provide fast control of the network conditions. This feature can be exploited to improve the stability of a power system [2]-[8]. Alongside these merits, the FACTS technology has some main drawbacks:

- Device complexity and the large number of components lead to a high installation cost;
- A single point of failure can shut down the entire system;
- The lumped nature of the system and the initial over-rating of devices to accommodate future growth provide a poor return on investment (ROI).

Recently, the concept of distributed FACTS (D-FACTS) has been introduced as a way to overcome most of the serious limitations of FACTS devices. D-FACTS devices offer all the capabilities of their FACTS counterparts. This new technology points the way to a novel approach to power flow control. The distributed nature of the system makes it possible to achieve fine granularity in the system rating. Furthermore, the system can be expanded as demand grows [9].
The DSSC concept is based on a low-power single-phase inverter that attaches to the transmission conductor and dynamically controls the corresponding transfer impedance. In this way, active control of the power flow on the line is achieved [10]. Only a few papers have attempted to model and investigate the capabilities of the DSSC. For instance, [11] offers a graphical simulation model for the DSSC and explores a single-phase system which comprises only one DSSC and an ideal voltage source instead of generators. The scarcity of reported technical papers on the DSSC justifies further study of the other capabilities of this device. In this study, 1400 DSSCs are integrated in a two-area, two-machine system in order to examine the transient stability of the system.

#### II. DSSC BASIC CONCEPT

The DSSC concept originated from FACTS devices; the DSSC is in fact a model of an SSSC but smaller, cheaper, and with higher capability. The distributed deployment of the DSSC contributes to greater safety and improved controllability of the power system. Fig. 1 displays a conceptual schematic of DSSCs deployed on a power line so as to control the power flow by changing the line impedance. Each DSSC module is rated at about 10 kVA and is clamped around the line. Controlling each module individually makes it possible to increase or decrease the impedance of the line or to leave it unaltered. With a large number of modules operating together, a substantial influence on the overall power flow in the line becomes feasible [12].

Fig. 1 D-FACTS deployed on power line

The low VA rating of the modules is in line with mass-manufactured power electronics systems in the industrial drives and UPS markets, which offers the chance to realize an extremely low cost implementation.
On the other hand, using a large number of modules results in high system reliability, as the system operation is not much affected by the failure of a small number of modules [9]. As illustrated in Fig. 2, a DSSC module is composed of a small-rated (10 kVA) single-phase inverter and a single turn transformer (STT), with its associated controls, power supply circuits and built-in communications capability [13]. The STT is a critical component of the DSSC. It uses the transmission conductor as a secondary winding and is designed with a high turns ratio, which reduces the current handled by the inverter; commercial IGBTs can therefore be used, lowering the cost [10]. The transformer core is made up of two parts that can be physically clamped around the transmission line to form a complete magnetic circuit [14].

Fig. 2 Circuit schematic of a DSSC module

#### III. DSSC IMPACT ON POWER FLOW

As mentioned earlier, the DSSC is connected in series with the transmission line and can therefore inject a synchronous fundamental voltage, in quadrature with the line current, directly into the transmission conductor. As a result, the transmitted power becomes a parametric function of the injected voltage and can be stated as follows:

$$P_{12} = \frac{V_1 V_2}{X_L} \sin \delta - \frac{V_1 V_q}{X_L} \cos \left(\frac{\delta}{2}\right) \left[ \frac{\sin \left(\frac{\delta}{2}\right)}{\sqrt{\left(\frac{V_1 + V_2}{2 V_2}\right)^2 - \frac{V_1}{V_2} \cos^2 \left(\frac{\delta}{2}\right)}} \right] \tag{1}$$

where $V_1$ and $V_2$ are the bus voltage magnitudes, $V_q$ is the series injected voltage magnitude, $\delta$ is the voltage phase difference, and $X_L$ is the impedance of the line, assumed to be purely inductive. For equal bus voltage magnitudes the bracketed factor equals 1. The DSSC can decrease the transmittable power, or increase it simply by reversing the polarity of the injected ac voltage [15].
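As a rough numerical check of Equation (1), the following Python sketch evaluates the transmitted power for both polarities of the injected voltage. It implements Equation (1) with the squared cosine under the square root, the form for which the bracketed factor reduces to 1 when $V_1 = V_2$. The per-unit values (1 pu bus voltages, 1 pu reactance, 0.1 pu injected voltage, 60° load angle) and the function name are assumptions chosen only for illustration; they are not taken from the simulated system.

```python
import math


def transmitted_power(v1, v2, vq, x_l, delta):
    """Transmitted power P12 of Equation (1).

    v1, v2 : bus voltage magnitudes
    vq     : series injected quadrature voltage (sign sets the polarity)
    x_l    : line reactance
    delta  : voltage phase difference in radians (must be nonzero when
             v1 == v2, otherwise the square root below is zero)
    """
    half = delta / 2.0
    uncompensated = v1 * v2 / x_l * math.sin(delta)
    root = math.sqrt(((v1 + v2) / (2.0 * v2)) ** 2
                     - (v1 / v2) * math.cos(half) ** 2)
    bracket = math.sin(half) / root
    return uncompensated - v1 * vq / x_l * math.cos(half) * bracket


# Illustrative per-unit operating point (assumed values).
delta = math.radians(60.0)
p_none = transmitted_power(1.0, 1.0, 0.0, 1.0, delta)
p_reduced = transmitted_power(1.0, 1.0, 0.1, 1.0, delta)
p_boosted = transmitted_power(1.0, 1.0, -0.1, 1.0, delta)

print(f"no injection    : {p_none:.4f} pu")
print(f"Vq = +0.1 pu    : {p_reduced:.4f} pu")
print(f"Vq = -0.1 pu    : {p_boosted:.4f} pu")
```

For equal bus voltages the expression collapses to $\frac{V^2}{X_L}\sin\delta \mp \frac{V V_q}{X_L}\cos(\delta/2)$, so the two polarities shift the power symmetrically around the uncompensated value.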
It is worth noting that this feature underlies the ability of the DSSC to control power flow in the overall system. The variation of the transmitted power versus load angle for different quadrature voltage injections, with equal bus voltage magnitudes, is depicted in Fig. 3.

#### IV. SIMULATION MODEL EXTRACTED FOR DSSC

This section reviews the graphical simulation model for the DSSC introduced in [11]. The model provides a comprehensive understanding of the operating principles of the DSSC and is therefore well suited to further operational analysis of this device.

#### A. Single Phase Inverter Structure

As displayed in Fig. 4, the DSSC power circuit includes the inverter, filter circuit, breaker, and transformer. The DSSC single-phase inverter consists of four IGBT devices in a full-bridge configuration. The dc link is realized with a fixed capacitor. An LC filter ($L_f$ and $C_f$) is placed at the output of the inverter to reduce the harmonic content of the injected voltage.

Fig. 4 DSSC power circuit

The sinusoidal pulse width modulation (SPWM) technique is well known to offer simplicity and good response as an inverter switching strategy. For this reason, SPWM is adopted here.

#### B. Control Strategy

The fundamental task of the DSSC is to control the power flow in a transmission line. This goal can be achieved either by direct control, in which both the angular position and the magnitude of the output voltage are controlled, or by indirect control, in which only the angular position of the output voltage is controlled and the magnitude remains proportional to the dc terminal voltage [16].
Directly controlled inverters are more difficult and more costly to implement than indirectly controlled inverters, and their operation typically carries a penalty in terms of increased losses, greater circuit complexity and increased harmonic components in the output. Consequently, the control scheme used for the DSSC model investigated in this paper is based on the indirect control technique [11]. Fig. 5 shows the DSSC control system and SPWM generator. The main objective of the controller is to hold the charge on the dc capacitor constant and to inject a voltage that is in quadrature with the line current. A small phase displacement, namely the error, beyond the required 90° between the injected voltage and the line current is needed to keep the dc capacitor voltage fixed. The signal obtained by comparing Vdc with Vdc(ref) is passed through a proportional-integral (PI) controller, which generates the required phase angle displacement or error. The Phase-Locked Loop (PLL) provides the basic synchronization signal, which is the phase angle of the line current [11].

Fig. 5 DSSC control system and SPWM generator

#### V. TRANSIENT STABILITY ENHANCEMENT WITH DSSC

The DSSC enhances transient stability by partially cancelling the series impedance of the transmission line. Transient stability can be improved further by temporarily changing the compensation with a supplementary controller added to the main control loop of the DSSCs. During the first acceleration period of the machine, the controller increases the transmitted power by injecting a higher series voltage. Similarly, the deceleration of the machine is increased simply by increasing the line impedance and thus decreasing the transmitted power. Fig. 6 shows the power system considered as the case study in the following simulations. In Fig. 6, it can be observed that the load center is modeled by a 5000 MW resistive load.
The load is fed by a local generation of 4000 MW (machine G2) and a remote 1000 MW plant (machine G1), which is connected to the load center through a long 500 kV, 700 km transmission line. The system has been initialized so that the line transmits 950 MW, which is close to its surge impedance loading (SIL = 977 MW). Vdc(ref) for each DSSC module is fixed at 2 kV, the amplitude modulation ratio is set at 0.5, and the turns ratio of the STT is 1:100. With these settings, the injected voltage of each DSSC module is expected to reach a peak to peak value of 10 V. Given that the injected voltage of each DSSC is 10 V, about 1400 DSSC modules are required in each phase of the line to achieve 4% compensation of the transmission line. Fig. 6 illustrates the placement of the DSSCs in the transmission lines.

$$X_L = 230\,\Omega, \quad X_{inj} = -9.5\,\Omega, \quad \frac{X_{inj}}{X_L} \times 100 = \%\,\text{Compensation} \tag{2}$$

The negative sign of $X_{inj}$ denotes the capacitive mode of the DSSCs for series compensation of the line. In order to evaluate the impact of the DSSC on transient stability enhancement, three different case studies are considered. The next section presents the complete simulation results for these cases.

Fig. 6 Simulation model of two-machine power system for transient stability study with DSSCs

#### VI. SIMULATION RESULTS

This section examines the influence of the DSSC on the overall system performance in three different cases:

- Impact of DSSC on steady state operation point;
- Three phase fault: impact of DSSC without damping controller; and
- Three phase fault: impact of DSSC equipped with damping controller.

In the following subsections, simulation results are obtained for each case individually and discussed.

#### A. Impact of DSSC on Steady State Operation Point

First, 1400 DSSCs per phase are placed in the line to achieve 4% compensation. Fig. 7 shows that when the DSSCs are out of service, the rotor angle difference between the two machines, *d\_theta1\_2*, is about 53 degrees. The line power is assumed to be constant, so when the DSSCs are switched into the power system at t = 4 sec, the series impedance of the line decreases. As a result, according to (3), the rotor angle difference *d\_theta1\_2* decreases to 50.25 degrees.

$$P_{12} = \frac{V_1 V_2}{X_L} \sin \delta \tag{3}$$

For this reason the transient stability margin of the system is improved with compensation.

Fig. 7 The rotor angle difference (d\_theta1\_2) response when the DSSCs are put in service at t=4

#### B. Three Phase Fault: Impact of DSSC without Damping Controller

Here, the DSSCs are initially in the circuit, but there is no damping controller in the main control loop; that is, the Auxiliary Damping Signal is set to zero. Bus 1, near machine G1, is subjected to a three-phase-to-ground fault with a duration of 0.085 second. Figs. 8 and 9 are obtained for this case. It can be seen that, when the DSSCs are out of service, the rotor angle between the machines increases rapidly and the two machines fall out of synchronism after fault clearing. In contrast, when the DSSCs are in circuit, the system remains stable for the same fault.

Fig. 8 The rotor angle difference (d\_theta1\_2) variation after the fault without DSSCs and with DSSCs (without damping controller)

#### C. Three Phase Fault: Impact of DSSC with Damping Controller

The DSSC by itself does not provide the necessary damping of oscillations, as its primary duty is to control the line power flow. To achieve better damping over a wide range of operation, a power oscillation damping (POD) controller is added to the main control loop of the DSSCs. Fig. 10 shows the POD controller structure. The damping controller is composed of a gain block, a washout filter, and a lead-lag compensator. It is designed to provide an extra electrical torque in phase with the speed deviation in order to enhance the damping of oscillations [1]. The gain of the damping controller is set to achieve the desired damping ratio of the electromechanical oscillations. The purpose of the washout circuit is to prevent the auxiliary controller from responding to steady-state power conditions. The parameters of the lead-lag compensator are adjusted so that the phase shift between the speed deviation and the resulting electrical torque at the desired frequency is compensated. In this way, an additional electrical damping torque is obtained in phase with the speed deviation. Here, the parameters of the controller are determined through simulation studies by a trial-and-error method with the aim of achieving the best damping.

The selection of an appropriate input signal is a fundamental issue in the design of an effective and robust auxiliary damping controller. In this paper, as depicted in Fig. 10, the generator rotor speed is taken as the input signal. The output of the auxiliary damping controller is used to modulate the reference setting of the DSSC in order to provide damping [16]. In the present work, as illustrated earlier in Fig. 5, the output of the POD controller is used to regulate the magnitude of the series injected voltage during electromechanical transients so as to damp the oscillations.

The system is now subjected to a severe fault with a duration of 0.1 sec, again applied near bus 1. Simulation results are presented in Figs. 11 and 12. These figures address two different cases, namely the DSSC without any supplementary damping controller and the DSSC equipped with the POD controller.
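The POD controller structure described earlier in this subsection (gain block, washout filter, lead-lag compensator) is commonly written as a single transfer function from the rotor speed deviation to the auxiliary damping signal. The form below is a generic sketch with one lead-lag stage; the symbols $K$, $T_w$, $T_1$ and $T_2$ are assumptions introduced here for illustration, and the paper does not report the values it tuned by trial and error.

$$H_{POD}(s) = K \cdot \frac{s T_w}{1 + s T_w} \cdot \frac{1 + s T_1}{1 + s T_2}$$

Here $K$ is the controller gain, $T_w$ is the washout time constant, and $T_1$, $T_2$ are the lead-lag time constants. Because the washout term is zero at $s = 0$, the controller gives no output for a constant (steady-state) input, while the ratio $T_1 / T_2$ sets the phase lead or lag applied at the oscillation frequency.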
As can be seen from Figs. 11 and 12, when the DSSC lacks a power oscillation damping controller, the system is completely unstable and the two machines fall out of synchronism quickly. When the DSSC control loop includes a power oscillation damping controller, the system remains stable. Fig. 13 shows the POD controller output signal generated for damping the system oscillations.

Fig. 11 (a) The rotor angle difference (d\_theta1\_2) variation and (b) variation of the machines' angular speed after the fault, with and without POD controller

Fig. 12 Machines voltage variation after the fault with POD controller and without POD controller

Fig. 13 Auxiliary Damping Signal generated with POD controller

#### VII. CONCLUSION

D-FACTS devices have been introduced as an effective approach to overcome the high implementation cost of the FACTS family. The distributed devices are also suited to other ancillary duties such as transient stability enhancement and power oscillation damping. This study reviewed a graphical simulation model for the DSSC, which is in fact a smaller counterpart of the SSSC. A two-machine power system was investigated in order to verify the capability of the DSSC to increase the transient stability of the whole system. Simulation results demonstrate that when the DSSCs are out of service, the rotor angle between the machines, d\_theta1\_2, increases rapidly and the two machines fall out of synchronism after fault clearing. When the DSSCs are in circuit, they stabilize the system even without a specific controller. Next, a more severe fault was applied to the system. In this case the system becomes totally unstable even with the DSSCs in service. Hence, a POD controller is added to the main control loop of the DSSC to improve the transient stability margin of the system.
Simulation results show that in this case the system remains stable after the fault is removed. Consequently, the ability of distributed devices such as the DSSC to enhance power system operation is confirmed.

#### REFERENCES

- [1] S. Golshannavaz, M. Mokhtari, M. Khalilian, D. Nazarpour, "Transient Stability Enhancement in Power System with Distributed Static Series Compensator (DSSC)," Electrical Engineering (ICEE), 2011 19th Iranian Conference on, 17-19 May 2011.
- [2] P. Kundur, Power system stability and control, Prentice-Hall, N. Y. U. S. A, 1994.
- [3] R.M. Mathur and R.K. Varma, Thyristor-Based FACTS Controllers for Electrical Transmission Systems, IEEE Press and Wiley Interscience, New York, USA, Feb. 2002.
- [4] M. Bongiorno, J. Svensson, and L. Angquist, "On control of static synchronous series compensator for SSR mitigation," IEEE Trans. Power Electron., vol. 23, no. 2, pp. 735–743, Mar. 2008.
- [5] A.H.M.A Rahim and M.F. Kandlawala, "Robust STATCOM voltage controller design using loop shaping technique," Electric Power System Research, 68, 2004, pp. 61-74.
- [6] J. G. Singh, S. N. Singh, and S. C. Srivastava, "Enhancement of power system security through optimal placement of TCSC and UPFC," in Proc. IEEE PES General Meeting, pp. 1–6, Jun. 2007.
- [7] M. S. El-Moursi, A. M. Sharaf, and K. El-Arroudi, "Optimal control schemes for SSSC for dynamic series compensation," Elect. Power Syst. Res., vol. 78, no. 4, pp. 646–656, 2008.
- [8] R. Majumder, B. C. Pal, C. Dufour and P. Korba, "Design and realtime implementation of robust FACTS controller for damping interarea oscillation," IEEE Trans. Power Syst., vol. 21, no. 2, pp. 809–816, 2006.
- [9] Divan, D. et al., "A distributed static series compensator system for realizing active power flow control on existing power lines," IEEE Trans. Power Delivery, vol. 22, pp. 642-649, 2006.
- [10] D. M. Divan, W. Brumsickle and R. Schneider, "Distributed Floating Series Active Impedance for Power Transmission Systems," U.S. Patent Application # 10/679.966.
- [11] P. Fajri, D. Nazarpour and S. Afsharnia, "A PSCAD/EMTDC Model for Distributed Static Series Compensator (DSSC)," Electrical Engineering, ICEE 2008. Second International Conference on Publication, March 2008.
- [12] Deepak Divan, "Design consideration for series connected distributed facts converter," IEEE Transmission and Distribution Conference, New Orleans, Louisiana, 2005.
- [13] Deepak Divan, "Distributed Intelligent Power Networks-A New Concept for Improving T&D System Utilization and Performance," IEEE Transmission and Distribution Conference, New Orleans, Louisiana, 2005.
- [14] Mark Rauls, "Analysis and Design of High Frequency Co-Axial winding Transformers," MS Thesis, University of Wisconsin Madison, US, 1992.
- [15] L. Gyugyi, C.D. Schauder and K.K. Sen, "Static Synchronous Series Compensator: a Solid-State Approach to the Series Compensation of Transmission Lines," IEEE Trans. Power Delivery, vol. 12, no. 1, Jan. 1997, pp. 406-417.
- [16] N.G. Hingorani and L. Gyugyi, Understanding FACTS: Concepts and technology of flexible ac transmission system. IEEE Press, NY, 2000.

# Active Power Filter with advanced Current Controller Technique for Power Quality Improvement

#### Dipti A. Tamboli & D. R. Patil

Department of Electrical Engineering, Walchand College of Engineering, Maharashtra, India 416415 E-mail: dipti\_tamboli@rediffmail.com, dadasorpatil@gmail.com

Abstract - Power quality issues are becoming a major concern for today's power system engineers. Large scale incorporation of non-linear loads has the potential to raise harmonic voltages and currents in an electrical distribution system to unacceptably high levels that can adversely affect the system.
An active power filter (APF) based on power electronic technology is currently considered the most competitive equipment for simultaneous mitigation of harmonics and reactive power. Instantaneous power theory is used for generation of the reference current. This paper presents a comparative study of the performance of three current control strategies, namely the ramp comparison method, the hysteresis current controller (HCC) and the adaptive hysteresis current controller (AHCC), and the superiority of the AHCC is established. Simulation results for all the methods are presented using the MATLAB/SIMULINK power system toolbox, demonstrating the effectiveness of using an adaptive hysteresis band.

Keywords - Active power filter, Harmonics, Instantaneous power theory, hysteresis current control, AHCC.

#### I. INTRODUCTION

SOLID-STATE control of ac power using thyristors and other semiconductor switches is widely employed to feed controlled electric power to electrical loads, such as adjustable speed drives (ASDs), furnaces, and computer power supplies, as well as in HVDC systems and renewable electrical power generation. Being nonlinear loads, these solid-state converters draw harmonic and reactive power components of current from the ac mains. The injected harmonics, reactive power burden, unbalance, and excessive neutral currents cause low system efficiency and poor power factor. They also disturb other consumers and interfere with nearby communication networks. Extensive surveys [1, 2] have been carried out to quantify the problems associated with electric power networks having nonlinear loads. Conventional passive L–C filters have the limitations of fixed compensation, large size, and resonance. The increased severity of harmonic pollution in power networks has attracted the attention of power electronics and power system engineers to develop dynamic and adjustable solutions to power quality problems [3].
Such equipment, generally known as active power filters (APFs), is also called active power line conditioners (APLCs). The theory and applications of shunt active filters have become increasingly popular and have attracted much attention. APFs use control circuits to measure the harmonic and reactive power of nonlinear loads and take corrective action. By injecting a compensation current in phase opposition (180° phase shift) to the unwanted components of the load current at the point of common coupling where the nonlinear load is connected, the source currents are kept sinusoidal. Therefore, both harmonic suppression and reactive power compensation for the nonlinear load are achieved.

Fig. 1: APF control block

One of the peculiar features of shunt APFs is that they do not require energy storage units such as batteries or other active sources for their compensation mechanism. To accomplish this function, an effective reference compensation strategy is required for both the reactive and the harmonic power of the load. Generally, the performance of an APF depends on three design criteria [4-12]: (i) the design of the power inverter; (ii) the type of current controller used; (iii) the method used to obtain the reference current. Many control techniques have been used to obtain the reference current [6-12]. Among these, the instantaneous real-power theory provides good compensation characteristics in steady state as well as in transient states [4-5]. The instantaneous real-power theory generates the reference currents required to compensate the load current harmonics and reactive power. It also tries to keep the dc voltage across the capacitor of the inverter constant. Another important characteristic of this real-power theory is its computational simplicity, since it involves only algebraic calculations. Fig. 1 shows a system control block for a three-phase shunt APF, where the reference compensation currents are computed from the input voltages and load currents.
Similarly, various current controller techniques have been proposed for the APF configuration, such as the triangular-current controller, sinusoidal PWM, the periodical-sampling controller and the hysteresis current controller. In the ramp comparison method, multiple crossings of the ramp by the current error may become a problem when the time rate of change of the current error becomes greater than that of the ramp. The hysteresis current controller has therefore attracted researchers' attention because of its unconditional stability, fast transient response, simple implementation and high accuracy. However, this control scheme has unsatisfactory features such as an uneven switching frequency and switching frequency variation within a particular band. The adaptive hysteresis current controller overcomes these demerits of the HCC: it changes the bandwidth according to the instantaneous compensation current variation in order to optimize the required switching frequency. This paper presents the design and analysis of an active power filter that uses instantaneous power theory with three types of current controller. The current controller generates the switching pulses for the active power inverter. The shunt APF is investigated under non-linear load and found to be effective for harmonics and reactive power compensation. The simulation is carried out using MATLAB/Simulink for two different types of nonlinear loads and at different firing angles. All simulation studies were carried out for an 11 kV feeder supplying Walchand College of Engineering, Sangli, India.

### II. SYSTEM CONFIGURATION AND COMPENSATING PRINCIPLE

#### A. System Configuration

In the APF depicted in Fig. 1, a current-controlled voltage source inverter generates the compensating current ($i_c$), which is injected into the utility grid.
It cancels the harmonic components drawn by the nonlinear load and keeps the utility line current sinusoidal. A voltage-source inverter (VSI) having IGBT switches and an energy storage capacitor on the dc bus is implemented as a shunt APF. The main aim of the APF is to compensate harmonics for power quality improvement and to compensate reactive power so as to improve the power factor. The APF system consists of a three-phase voltage inverter with current regulation, which is used to inject the compensating current into the power line. The VSI contains a three-phase insulated gate bipolar transistor (IGBT) bridge with anti-parallel diodes. The VSI is connected in parallel with the three-phase supply through three inductors $L_{f1}$, $L_{f2}$ and $L_{f3}$. The DC side of the VSI is connected to a DC capacitor C that carries the input ripple current of the inverter and is the main reactive energy storage element. The DC capacitor provides a constant DC voltage and the real power necessary to cover the losses of the system. The inductors $L_{f1}$, $L_{f2}$ and $L_{f3}$ perform the voltage boost operation in conjunction with the capacitor, and at the same time act as a low-pass filter for the AC source current. The APF must then be controlled to produce the compensating currents $i_{c1}$, $i_{c2}$ and $i_{c3}$ following the reference currents $i_{c1}^{*}$, $i_{c2}^{*}$ and $i_{c3}^{*}$ through the control circuit.

#### B. Instantaneous reactive power theory (p-q theory)

In 3-phase circuits with balanced voltage, instantaneous currents and voltages are converted into instantaneous space vectors [13, 14]. The traditional definitions of the power components are all based on the direct quantities of the 3-phase voltage and current vectors: $e_a$, $e_b$, $e_c$ and $i_a$, $i_b$, $i_c$. In instantaneous reactive power theory, the instantaneous 3-phase currents and voltages are expressed by the following equations.
These space vectors are easily converted into $\alpha$-$\beta$ coordinates:

$$\begin{bmatrix} e_{\alpha} \\ e_{\beta} \end{bmatrix} = C_{32} \begin{bmatrix} e_{a} \\ e_{b} \\ e_{c} \end{bmatrix} \tag{1}$$

$$\begin{bmatrix} i_{\alpha} \\ i_{\beta} \end{bmatrix} = C_{32} \begin{bmatrix} i_{a} \\ i_{b} \\ i_{c} \end{bmatrix} \tag{2}$$

where

$$C_{32} = \sqrt{\frac{2}{3}} \begin{bmatrix} 1 & -1/2 & -1/2 \\ 0 & \sqrt{3}/2 & -\sqrt{3}/2 \end{bmatrix}$$

and $\alpha$, $\beta$ are orthogonal coordinates: $e_{\alpha}$ and $i_{\alpha}$ lie on the $\alpha$ axis, $e_{\beta}$ and $i_{\beta}$ on the $\beta$ axis. When the source supplies nonlinear loads, the instantaneous power delivered to the loads includes both active and reactive components. The current vector $i$ is therefore divided into an active current component $i_p$ and a reactive current component $i_q$, as shown in Fig. 2.

Fig. 2: Vector diagram of voltage and currents

In this representation of electric quantities, the instantaneous active and reactive powers are calculated as

$$p = e\,i_p, \qquad q = e\,i_q,$$

where $i_p = i\cos\varphi$ and $i_q = i\sin\varphi$, which leads to Eq. (3):

$$\begin{bmatrix} p \\ q \end{bmatrix} = \begin{bmatrix} \bar{p} + \tilde{p} \\ \bar{q} + \tilde{q} \end{bmatrix} = \begin{bmatrix} e_{\alpha} & e_{\beta} \\ -e_{\beta} & e_{\alpha} \end{bmatrix} \begin{bmatrix} i_{\alpha} \\ i_{\beta} \end{bmatrix} \tag{3}$$

Here, the overbar and the tilde stand for dc and ac components, respectively. $\bar{p}$ and $\bar{q}$ are the instantaneous active and reactive power (dc value) originating from the symmetrical fundamental (positive-sequence) component of the load current; $\tilde{p}$ and $\tilde{q}$ are the instantaneous active and reactive power (ac value) originating from the harmonic and the asymmetrical fundamental (negative-sequence) components of the load current.
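The transformation and power calculation of Eqs. (1) to (3) can be sketched numerically as follows. This is only an illustrative Python/NumPy sketch, not the authors' Simulink model; the function names (`clarke`, `instantaneous_pq`, `three_phase`) and the test amplitudes are assumptions made for the example.

```python
import numpy as np

# Power-invariant transformation matrix C32 of Eqs. (1)-(2).
C32 = np.sqrt(2.0 / 3.0) * np.array([
    [1.0, -0.5, -0.5],
    [0.0, np.sqrt(3.0) / 2.0, -np.sqrt(3.0) / 2.0],
])


def clarke(abc):
    """Map a-b-c samples of shape (3, N) to alpha-beta samples of shape (2, N)."""
    return C32 @ np.asarray(abc, dtype=float)


def instantaneous_pq(e_abc, i_abc):
    """Instantaneous p and q from Eq. (3), with the sign convention of that equation."""
    e_alpha, e_beta = clarke(e_abc)
    i_alpha, i_beta = clarke(i_abc)
    p = e_alpha * i_alpha + e_beta * i_beta
    q = -e_beta * i_alpha + e_alpha * i_beta
    return p, q


def three_phase(amplitude, wt, lag=0.0):
    """Balanced cosine set (hypothetical test signal); phases b and c lag a by 120 and 240 degrees."""
    shifts = np.array([0.0, 2.0 * np.pi / 3.0, 4.0 * np.pi / 3.0])[:, None]
    return amplitude * np.cos(wt[None, :] - shifts - lag)


if __name__ == "__main__":
    wt = np.linspace(0.0, 2.0 * np.pi, 200, endpoint=False)
    e = three_phase(100.0, wt)                  # assumed peak phase voltage
    i = three_phase(10.0, wt, lag=np.pi / 6.0)  # assumed peak current, lagging by 30 degrees
    p, q = instantaneous_pq(e, i)
    print(p.mean(), q.mean())
```

For a balanced sinusoidal set both p and q are constant, so only the dc parts are present; with the sign convention of Eq. (3) as written, a lagging current gives a negative q in this sketch. Harmonics in the load current would appear as the oscillating parts of p and q.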
These power quantities, given above for an electrical system represented in a-b-c coordinates, have the following physical meaning:

- $\bar{p}$ = mean value of the instantaneous active power: corresponds to the energy per time unit transferred from the power supply to the load, through the a-b-c coordinates, in a balanced way.
- $\tilde{p}$ = alternating value of the instantaneous active power: the energy per time unit that is exchanged between the power supply and the load through the a-b-c coordinates.
- $q$ = instantaneous reactive power: corresponds to the power that is exchanged between the phases of the load. This component does not imply any exchange of energy between the power supply and the load, but is responsible for the existence of undesirable currents, which circulate between the system phases.
- $\bar{q}$ = mean value of the instantaneous reactive power, which is equal to the conventional reactive power.

From Eq. (3), in order to measure the harmonic and reactive current components, the fundamental active current on $\alpha$-$\beta$ coordinates should first be calculated by Eq. (4), in which the reactive power entry is set to zero:

$$\begin{bmatrix} i_{\alpha f} \\ i_{\beta f} \end{bmatrix} = C_{pq}^{-1} \begin{bmatrix} \bar{p} + p_{loss} \\ 0 \end{bmatrix} \tag{4}$$

where $C_{pq}$ is taken to be the voltage matrix of Eq. (3). The fundamental active currents in the $\alpha$-$\beta$ reference frame are then transformed into the a-b-c reference frame:

$$\begin{bmatrix} i_{af} \\ i_{bf} \\ i_{cf} \end{bmatrix} = \sqrt{\frac{2}{3}} \begin{bmatrix} 1 & 0 \\ -1/2 & \sqrt{3}/2 \\ -1/2 & -\sqrt{3}/2 \end{bmatrix} \begin{bmatrix} i_{\alpha f} \\ i_{\beta f} \end{bmatrix} \tag{5}$$

Finally, the reference compensation currents are obtained by Eq. (6):

$$\begin{bmatrix} i_a^{*} \\ i_b^{*} \\ i_c^{*} \end{bmatrix} = \begin{bmatrix} i_a \\ i_b \\ i_c \end{bmatrix} - \begin{bmatrix} i_{af} \\ i_{bf} \\ i_{cf} \end{bmatrix} \tag{6}$$

The DC-side voltage of the APF should be controlled and kept at a constant value to maintain normal operation of the inverter. The conduction and switching losses associated with the diodes and IGBTs of the inverter tend to reduce the value of $V_{dc}$ across the capacitor $C_{dc}$, so a feedback voltage control circuit needs to be incorporated into the inverter. The error between the reference value $V_{ref}$ and the feedback value $V_{dc}$ first passes through a PI regulator, and the output of the PI regulator is added to the $\alpha$-axis value of the fundamental current components.

#### III. CURRENT CONTROLLER TECHNIQUE

#### A. Ramp comparison controller

The controller can be thought of as producing sine-triangle PWM with the current error considered to be the modulating function. The current error is compared to a triangle waveform as shown in Fig. 3: if the current error is greater (less) than the triangle waveform, the inverter leg is switched in the positive (negative) direction. With sine-triangle PWM, the inverter switches at the frequency of the triangle wave and produces well-defined harmonics. Multiple crossings of the ramp by the current error may become a problem when the time rate of change of the current error becomes greater than that of the ramp. However, such problems can be mitigated by suitably changing the amplitude of the triangle wave.

Fig. 3: Ramp comparison controller

#### B. Hysteresis controller [15]

The hysteresis band current control can be implemented to generate the switching pattern in order to get precise and quick response.
The hysteresis band current control technique has proven to be well suited to applications of current-controlled voltage source inverters in active power filters. It is characterized by unconditional stability, very fast response, good accuracy and inherent peak current limiting capability, and it does not need any information about the system parameters. The conventional hysteresis band current control scheme used for the control of the active power filter line current is shown in Fig. 4; it consists of a hysteresis band around the reference line current [15]. The reference line current of the active power filter is referred to as $i_{ca}^{*}$, and the actual line current of the active power filter is referred to as $i_{ca}$. The hysteresis band current controller decides the switching pattern of the active power filter. The switching logic is formulated as follows:

If $i_{ca} < (i_{ca}^{*} - HB)$, the upper switch is OFF and the lower switch is ON for leg "a" (SA = 1).

If $i_{ca} > (i_{ca}^{*} + HB)$, the upper switch is ON and the lower switch is OFF for leg "a" (SA = 0).

The switching functions SB and SC for phases B and C are determined similarly, using the corresponding reference and measured currents and the hysteresis bandwidth (HB). The switching frequency of the hysteresis band current control method described above depends on how fast the current changes from the upper limit of the hysteresis band to the lower limit, or vice versa.

Fig. 4: Simulation diagram of hysteresis current controller

#### C. Adaptive hysteresis current controller

The width of the hysteresis band determines the allowable current shaping error and thereby controls the switching frequency of the inverter. As the bandwidth narrows, the switching frequency increases. A suitable bandwidth should be selected in accordance with the switching capability of the inverter.
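The fixed-band switching rule stated for the hysteresis controller (Section III-B) can be sketched as a small comparator with memory. The Python below is a minimal illustration under stated assumptions: the function names and the sample values are hypothetical, and the state is simply held while the current stays inside the band, which the text implies but does not spell out.

```python
def hysteresis_switch(i_actual, i_ref, hb, prev_state):
    """Return the switching function SA for one sample.

    SA = 1 when the current falls below (i_ref - hb),
    SA = 0 when it rises above (i_ref + hb),
    otherwise the previous state is kept (assumed hold behaviour inside the band).
    """
    if i_actual < i_ref - hb:
        return 1
    if i_actual > i_ref + hb:
        return 0
    return prev_state


def switching_sequence(i_samples, ref_samples, hb, initial_state=0):
    """Apply the comparator sample by sample and collect the SA values."""
    state = initial_state
    states = []
    for i_actual, i_ref in zip(i_samples, ref_samples):
        state = hysteresis_switch(i_actual, i_ref, hb, state)
        states.append(state)
    return states


if __name__ == "__main__":
    # Hypothetical current samples around a zero reference, band of 0.9 A.
    currents = [0.0, -1.0, 0.5, 1.2, 0.2]
    reference = [0.0] * len(currents)
    print(switching_sequence(currents, reference, hb=0.9))
```

The same function would be called once per phase with the corresponding reference and measured currents to obtain SB and SC.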
The bandwidth should also be small enough to supply the reference current precisely, keeping in view switching losses and EMI-related problems. Therefore, the range of switching frequencies used is a compromise between these factors. By changing the bandwidth, which changes the average switching frequency of the APF, the user can evaluate the performance for different values of the hysteresis bandwidth. The switching frequency of the hysteresis band current controller depends on the rate of change of the actual APF current and therefore varies along the current waveform. The line inductance of the APF and the dc bus voltage are the main parameters determining the rate of change of the actual APF currents; the switching frequency therefore also depends on these two parameters.

Fig. 5: Current and Voltage waves with AHCC

Fig. 5 shows the PWM current and voltage waves for phase a [16]. When the actual line current of the active power filter tries to leave the hysteresis band, the suitable IGBT is switched ON or OFF to force the current back to a value within the hysteresis band. The switching pattern thus tries to keep the current inside the hysteresis band. The current $i_{ca}$ tends to cross the lower hysteresis band at point 1, where the upper-side IGBT of leg 'a' is switched on. The linearly rising current $i_{ca}^{+}$ then touches the upper band at point 2, where the lower-side IGBT of leg 'a' is switched on. The following equations can be written for the respective switching intervals $t_1$ and $t_2$:

$$L\frac{di_{ca}^{+}}{dt} = 0.5 V_{dc} - V_{s} \tag{7}$$

$$L\frac{di_{ca}^{-}}{dt} = -(0.5 V_{dc} + V_{s}) \tag{8}$$

$$\frac{di_{ca}^{+}}{dt} + \frac{di_{ca}^{-}}{dt} = 0 \tag{9}$$

where $L$ is the phase inductance, $V_s$ is the supply voltage, and $i_{ca}^{+}$ and $i_{ca}^{-}$ are the respective rising and falling current segments. From the geometry of Fig. 5, we can write

$$\frac{di_{ca}^{+}}{dt}t_1 - \frac{di_{ca}^{*}}{dt}t_1 = 2HB \tag{10}$$

$$\frac{di_{ca}^{-}}{dt}t_2 - \frac{di_{ca}^{*}}{dt}t_2 = -2HB \tag{11}$$

$$t_1 + t_2 = T_c = \frac{1}{f_c} \tag{12}$$

where $t_1$ and $t_2$ are the switching intervals and $f_c$ is the switching frequency. Adding (10) and (11) and substituting (12), we get

$$t_1 \frac{di_{ca}^{+}}{dt} + t_2 \frac{di_{ca}^{-}}{dt} - \frac{1}{f_c} \frac{di_{ca}^{*}}{dt} = 0 \tag{13}$$

Subtracting (11) from (10), we get

$$4HB = \frac{di_{ca}^{+}}{dt}t_1 - \frac{di_{ca}^{-}}{dt}t_2 - (t_1 - t_2)\frac{di_{ca}^{*}}{dt} \tag{14}$$

Substituting (9) in (13) and (14) and simplifying,

$$4HB = (t_1 + t_2) \frac{di_{ca}^{+}}{dt} - (t_1 - t_2) \frac{di_{ca}^{*}}{dt} \tag{15}$$

$$(t_1 - t_2) = \frac{di_{ca}^{*}/dt}{f_c \left( di_{ca}^{+}/dt \right)} \tag{16}$$

Substituting (16) in (15) gives

$$HB = \frac{0.125 V_{dc}}{f_{c} L} \left[ 1 - \frac{4L^{2}}{V_{dc}^{2}} \left( \frac{V_{s}}{L} + m \right)^{2} \right] \tag{17}$$

where $m = di_{ca}^{*}/dt$ is the slope of the command current wave. The hysteresis band (HB) can be modulated at different points of the fundamental frequency cycle to control the switching pattern of the inverter. For symmetrical operation of all three phases, the hysteresis bandwidth profiles $HB_a$, $HB_b$ and $HB_c$ are expected to be the same but phase shifted, as given by the following formulae:

$$HB_{a} = \frac{0.125 V_{dc}}{f_{c} L} \left[ 1 - \frac{4L^{2}}{V_{dc}^{2}} \left( \frac{V_{sa}}{L} + \frac{di_{ca}^{*}}{dt} \right)^{2} \right] \tag{18}$$

$$HB_{b} = \frac{0.125 V_{dc}}{f_{c} L} \left[ 1 - \frac{4L^{2}}{V_{dc}^{2}} \left( \frac{V_{sb}}{L} + \frac{di_{cb}^{*}}{dt} \right)^{2} \right] \tag{19}$$

$$HB_{c} = \frac{0.125 V_{dc}}{f_{c} L} \left[ 1 - \frac{4L^{2}}{V_{dc}^{2}} \left( \frac{V_{sc}}{L} + \frac{di_{cc}^{*}}{dt} \right)^{2} \right] \tag{20}$$

The calculated hysteresis bandwidth HB is applied to the variable HCC.
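As a numerical illustration of the band expression, the sketch below evaluates the bandwidth formula with the prefactor 0.125 used in Eqs. (18) to (20) and then checks it against the slope and geometry relations (7), (8), (10) and (11): the two intervals implied by the band should add up to 1/f_c. The function names are hypothetical, the 10 kHz switching frequency is an assumed value (the paper does not state one), and V_dc = 850 V and L = 4 mH are the filter parameters quoted in the simulation section.

```python
import math


def adaptive_band(v_dc, f_c, inductance, v_s, m):
    """Hysteresis half-band HB for instantaneous supply voltage v_s and reference slope m."""
    scale = 0.125 * v_dc / (f_c * inductance)
    correction = (4.0 * inductance ** 2 / v_dc ** 2) * (v_s / inductance + m) ** 2
    return scale * (1.0 - correction)


def switching_period(hb, v_dc, inductance, v_s, m):
    """Period t1 + t2 implied by a band hb, from Eqs. (7), (8), (10) and (11)."""
    rise = (0.5 * v_dc - v_s) / inductance    # di+/dt, Eq. (7)
    fall = -(0.5 * v_dc + v_s) / inductance   # di-/dt, Eq. (8)
    t1 = 2.0 * hb / (rise - m)                # Eq. (10)
    t2 = -2.0 * hb / (fall - m)               # Eq. (11)
    return t1 + t2


if __name__ == "__main__":
    V_DC, L_F = 850.0, 4e-3   # values quoted in the paper
    F_C = 10e3                # assumed target switching frequency in Hz
    for v_s, m in [(0.0, 0.0), (100.0, 5000.0), (300.0, -2000.0)]:
        hb = adaptive_band(V_DC, F_C, L_F, v_s, m)
        period = switching_period(hb, V_DC, L_F, v_s, m)
        print(f"v_s={v_s:6.1f} V  m={m:8.1f} A/s  HB={hb:.4f} A  1/period={1.0 / period:.1f} Hz")
```

In this sketch the band is widest near the zero crossing of the supply voltage and narrows toward its peak, which is how the adaptive scheme keeps the switching frequency close to the chosen f_c.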
The variable HCC is implemented with an S-R flip-flop to produce the gate control pulses that operate the voltage source inverter. The above expressions show that the hysteresis bandwidth (HB) is a function of the switching frequency, supply voltage, dc capacitor voltage and slope of the reference compensator current wave. Thus, by modulating the band with the capacitor voltage and the reference compensator current wave, the switching frequency remains nearly constant. This improves the performance of the PWM and of the APF substantially.

#### IV. SIMULATION RESULT

A typical distribution feeder originating from a substation and supplying the load centers of the Walchand College of Engineering, Sangli campus has been considered for simulation. A three-phase 11 kV/433 V, Dy11 transformer is employed in the Institute to supply the local loads. Walchand College of Engineering, Sangli is supplied by MSEDL through an 11 kV feeder with the following ratings: 11 kV feeder of length 5 km. H.T. supply: 11 kV overhead feeder: Mink 7/3.66 mm ACSR conductor; resistance per km: 0.49 $\Omega$, reactance per km: 0.365 $\Omega$; transformer rating: 125 kVA, 11 kV/433 V, 50 Hz, Dy11; percentage impedance: 4.25%. All the system parameters are calculated and referred to the low voltage side. These are listed below (with 20 kA short circuit level): copper losses in the transformer 2000 W, equivalent resistance $R_p$ = 15.49 $\Omega$ and reactance $X_p$ = 38.069 $\Omega$. Total resistance $R_{HV}$ = 17.94 $\Omega$ and equivalent reactance $X_{HV}$ = 40.259 $\Omega$. The equivalent values referred to the L.V. side are $R_{LV}$ = 0.0287 $\Omega$, $X_{LV}$ = 0.06431 $\Omega$ ($L_{LV}$ = 0.2047 mH).

Fig. 6: Parametric diagram on L.V. side (433 V source VS, series impedance labelled 0.287+j0.06431, load $R_L$, $X_L$)

All the parameters are shown in the equivalent diagram of Fig. 6. The source voltage is taken as 440 V at a frequency of 50 Hz.
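The quoted equivalent-circuit values can be cross-checked with the usual per-unit relations. The Python sketch below is one plausible reconstruction of that arithmetic and not the authors' own worksheet: it assumes the impedance figure is a percentage (4.25%) on the transformer rating, derives the resistance from the copper loss, and refers the result to the low-voltage side with a 440 V base, an inferred choice that reproduces the printed resistance. The feeder reactance is left out because the printed total is not reproduced by 5 km at 0.365 ohm per km.

```python
import math

S_RATED = 125e3    # transformer rating in VA (from the text)
V_HV = 11e3        # high-voltage side line voltage in V
V_LV = 440.0       # assumed low-voltage base in V (the stated source voltage)
Z_PERCENT = 4.25   # percentage impedance (assumed to be in percent)
P_CU = 2000.0      # copper losses in W
FEEDER_KM = 5.0
R_PER_KM = 0.49    # ohm per km
FREQ = 50.0        # Hz


def transformer_equivalent(s_rated, v_hv, z_percent, p_cu):
    """Series resistance and reactance of the transformer seen from the HV side."""
    z = z_percent / 100.0 * v_hv ** 2 / s_rated
    i_rated = s_rated / (math.sqrt(3.0) * v_hv)
    r = p_cu / (3.0 * i_rated ** 2)
    x = math.sqrt(z ** 2 - r ** 2)
    return r, x


def refer_to_lv(value_hv, v_hv, v_lv):
    """Refer an HV-side impedance to the LV side through the squared voltage ratio."""
    return value_hv * (v_lv / v_hv) ** 2


if __name__ == "__main__":
    r_p, x_p = transformer_equivalent(S_RATED, V_HV, Z_PERCENT, P_CU)
    r_hv = r_p + FEEDER_KM * R_PER_KM
    r_lv = refer_to_lv(r_hv, V_HV, V_LV)
    l_lv_mh = 0.06431 / (2.0 * math.pi * FREQ) * 1e3  # printed X_LV converted to mH
    print(f"Rp = {r_p:.2f} ohm, Xp = {x_p:.2f} ohm")
    print(f"R_HV = {r_hv:.2f} ohm, R_LV = {r_lv:.4f} ohm, L_LV = {l_lv_mh:.4f} mH")
```

Under these assumptions the resistance figures and the inductance conversion agree with the printed values to the quoted precision, and the transformer reactance agrees to within rounding of the intermediate impedance.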
The filter parameters are selected as $L_f$ = 4 mH, $V_{dc}$ = 850 V, $C_{dc}$ = 1400 $\mu$F. The performance of the ramp comparison method, HCC and AHCC based shunt active power filters was evaluated through simulation in the MATLAB/SIMULINK environment. The load is a thyristor converter feeding an R-L load: from t = 0 to t = 0.1 s (steady state) the resistance is 100 $\Omega$ and the inductance 50 mH, and for the remaining transient period the resistance is 50 $\Omega$ and the inductance 25 mH. The simulation results in transient operation using the AHCC are presented in Fig. 7. The waveform of the source current without APF and its THD are shown in Fig. 7(a) and (b), respectively. These current waveforms are for one phase (phase a). The other phases are not shown, as they are only phase shifted by 120° and only a balanced load is considered. The waveform of the source current with APF and its THD are shown in Fig. 7(c) and (d), respectively. The APF supplies the compensating current to the PCC, which is shown in Fig. 7(e). The time domain response of the p-q theory controller is shown in Fig. 7(f), which clearly indicates that the controller output settles after a few cycles. The capacitor voltage superimposed on its reference is shown in Fig. 7(f). In order to evaluate the performance of the control, the total harmonic distortion (THD) of the source current is measured before and after compensation. The THD improves from 27.78% to 1.49% using the adaptive hysteresis current controller with the APF. The system is simulated at different operating conditions, namely a thyristor converter with firing angles of $0^{\circ}$ and $30^{\circ}$. The final values of the THD of the source current before and after compensation are listed in Table I.
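For readers who want to reproduce a THD figure outside Simulink, the measurement can be sketched with an FFT. The Python/NumPy function below is a generic illustration under stated assumptions, not the tool used by the authors: `thd_percent` is a hypothetical name, the record is assumed to contain a whole number of fundamental cycles, and the test waveform with 20% fifth and 10% seventh harmonic is invented for the example.

```python
import numpy as np


def thd_percent(signal, fs, f0, n_harmonics=40):
    """THD in percent: RMS of harmonics 2..n_harmonics divided by the fundamental.

    Assumes `signal` spans an integer number of cycles of f0 so that every
    harmonic falls on an FFT bin.
    """
    signal = np.asarray(signal, dtype=float)
    n = len(signal)
    spectrum = np.abs(np.fft.rfft(signal)) / n
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)

    def magnitude(freq):
        return spectrum[np.argmin(np.abs(freqs - freq))]

    fundamental = magnitude(f0)
    harmonics = [magnitude(k * f0) for k in range(2, n_harmonics + 1) if k * f0 < fs / 2.0]
    return 100.0 * np.sqrt(np.sum(np.square(harmonics))) / fundamental


if __name__ == "__main__":
    fs, f0 = 10_000.0, 50.0
    t = np.arange(2000) / fs  # 0.2 s, i.e. ten cycles at 50 Hz
    current = (np.sin(2 * np.pi * f0 * t)
               + 0.2 * np.sin(2 * np.pi * 5 * f0 * t)
               + 0.1 * np.sin(2 * np.pi * 7 * f0 * t))
    print(f"THD = {thd_percent(current, fs, f0):.2f} %")
```

Applied to exported source-current samples, the same function would give values comparable with those in Table I, provided the analysis window and the number of harmonics match the settings used in the simulation.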
The AHCC gives a better source current than the other controllers in most cases, and all source current THD values are below the limits recommended by the IEEE 519 standard on harmonic levels.

Fig. 7.a. Source current without APF

## TABLE I SOURCE CURRENT TOTAL HARMONIC DISTORTION: THD%

| Type of current controller | Load THD% | | | Source THD% | | | PQ before | PQ after | Vdc (V) |
|---|---|---|---|---|---|---|---|---|---|
| $\alpha = 0^{\circ}$ | | | | | | | | | |
| SPWM | 25.79 | 25.8 | 25.79 | 1.6 | 1.6 | 1.61 | 10.22 kW, 1.56 kVAR | 10.27 kW, 4.97 VAR | 855.3 |
| HCC (h=0.9) | 25.76 | 25.76 | 25.76 | 1.57 | 1.59 | 1.64 | 10.23 kW, 1.55 kVAR | 10.27 kW, 7.1 VAR | 851.3 |
| AHCC | 25.78 | 25.78 | 25.78 | 1.49 | 1.51 | 1.48 | 10.22 kW, 1.55 kVAR | 10.27 kW, 1.29 VAR | 851.6 |
| $\alpha = 30^{\circ}$ | | | | | | | | | |
| SPWM | 29.99 | 30.02 | 30.01 | 2.92 | 2.95 | 2.96 | 8.51 kW, 4.01 kVAR | 8.54 kW, 7.9 VAR | 856.7 |
| HCC (h=0.9) | 30.00 | 30.02 | 30.01 | 2.71 | 2.86 | 2.84 | 8.504 kW, 4.00 kVAR | 8.54 kW, 9.33 VAR | 852.3 |
| AHCC | 30.00 | 30.03 | 30.01 | 2.79 | 2.83 | 2.77 | 8.5 kW, 4.00 kVAR | 8.54 kW, 3.4 VAR | 853.2 |

Fig. 7.b. THD of source current without APF

Fig. 7.c. Source current with APF

Fig. 7.d. THD of source current with APF

Fig. 7.e. Compensating current

Fig. 7.f. Capacitor voltage superimposed on its reference

#### V. CONCLUSION

An AHCC has been implemented for a three-phase shunt active power filter. The instantaneous power theory is used to extract the reference currents from the distorted line currents. This enhances power quality through reactive power compensation and suppression of the harmonics caused by the nonlinear load. The results obtained indicate that the DC-capacitor voltage and the harmonic current can be controlled easily for various load conditions.
The performance of the shunt active power filter with the AHCC, fixed-band HCC and ramp controller techniques is verified and compared through the simulation results. The THD of the source current after compensation is 1.49%, which is less than 5%, the harmonic limit imposed by the IEEE-519 standard.

#### REFERENCES

- [1] H. Akagi, "New Trends in Active Filters for power conditioning," IEEE Trans. on Industry Applications, Vol. 32, No. 6, Nov./Dec. 1996.
- [2] Ali Emadi, Abdolhosein Nasiri and S. Bekiarov, Uninterruptible power supplies and Active filters, CRC Press.
- [3] H. Akagi, "Modern active filters and traditional passive filters," Bulletin of the Polish Academy of Science, Vol. 54, No. 3, 2006.
- [4] Z. Peng, H. Akagi and A. Nabae, "A Novel Harmonic Power Filter," IEEE Power Electronics Specialists Conference RECORD, pp. 1151-1159, 1988.
- [5] H. Akagi, Y. Kanazawa and A. Nabae, "Instantaneous Reactive Power Compensators Comprising Switching Devices without Energy Storage Components," IEEE Transactions on Industry Applications, Vol. IA-20, No. 3, pp. 625-630, 1984.
- [6] Bhattacharya S., Divan, and B. Benejee, "Synchronous Reference Frame Harmonic Isolator Using Series Active Filter," 4th European Power Electronic Conf., Florence, Vol. 3, (1991): pp. 30-35.
- [7] Saetieo S, Devaraj R, Tomey D. The design and implementation of a three phase active power filter based on sliding mode control. IEEE T Ind Appl 1995; 31:993-1000.
- [8] Singh B, Al-Haddad K, Chandra A. Active power filter with sliding mode control. In: Proc Inst Elect Eng, Generation Transm Distrib; 1997, vol. 144. p. 564-8.
- [9] Rastogi M, Mohan N, Edris A. Hybrid-active power filtering of harmonic currents in power systems. IEEE T Power Del 1995; 10(3):1994-2000.
- [10] Bhattacharya S, Veliman A, Divan A, Lorenz R. Flux based active power filter controller. In: Proc IEEE-IAS Annual Meeting Record; 1995. p. 2483-91.
- [11] Jou H. Performance comparison of the three-phase active power filter algorithms. In: Proc. Inst. Elect. Eng., Generation Transm. Distrib; 1995, vol. 142. p. 646-52.
- [12] Dixon J, Garcia J, Moran I. Control system for three-phase active power filters which simultaneously compensates power factor and unbalanced loads. IEEE T Ind Electron 1995; 42(6):636-41.
- [13] CUI Yu-long, LIU Hong, WANG Jing-qin, SUN Shu-guang, "Simulation and reliability analysis of shunt active power filter based on instantaneous reactive power theory," Journal of Zhejiang University SCIENCE A, 2007.
- [14] H. Akagi, A. Nabae and S. Atoh, "Control Strategy of Active Power Filters Using Multiple Voltage Source PWM Converters," IEEE Transactions on Industry Applications, Vol. IA-22, No. 3, pp. 460-465, 1986.
- [15] Kale M. and Ozdemir E., "A Novel Adaptive Hysteresis Band Current Controller for Shunt Active Power Filter," IEEE, 2003.
- [16] Bose B.K., "An Adaptive Hysteresis Band Current Control Technique of a Voltage-Fed PWM Inverter for Machine Drive System," IEEE Transactions on Industrial Electronics, vol. 37, no. 5, pp. 402-406, October 1990.

# An Improved Speech Enhancement Algorithm Based on Spectral Subtraction with Tapering Method for Non-Stationary Noise Environments

#### Riyaz.G, Rohini S.Hallikar & Dr.B.S.Kariyappa

R.V.College of Engineering, Bangalore, India

E-mail: mohammadriyaz.g@gmail.com, rohinish@rvce.edu.in

Abstract - One of the major noise types degrading the efficiency of most speech communication systems today is non-stationary noise. Conventional spectral subtraction algorithms perform well in stationary noise environments but not in non-stationary environments. In this paper we propose an algorithm that combines a spectral subtraction algorithm with tapering: the output of the Log Spectral Minimization (LSM) algorithm is passed to the tapering algorithm to further improve speech signal quality.
LSM accounts for non-stationary noise as well as stationary noise. In the LSM algorithm, autoregressive modeling is used for codebook construction, and the log spectral distortion is then calculated to search the codebook for speech and noise spectrum estimates. These estimated speech and noise spectra are subtracted from the noisy signal to get the enhanced speech signal. LSM can adapt to varying levels of noise even while speech is present. The resultant signal from LSM is then fed to the tapering method to get a clean enhanced signal. The tapering method has good variance properties for power spectrum estimation. The proposed algorithm therefore offers both noise adaptability and good variance properties for power spectrum estimation. Given these advantages of the LSM and tapering algorithms, a combination of LSM with tapering should in theory perform better than LSM or tapering alone. Simulation results show that the proposed algorithm is superior to conventional algorithms.

Keywords- non stationary noise, speech enhancement, auto regressive modeling, log spectral distortion, tapering, spectral subtraction.

#### I. INTRODUCTION

Speech is one of the most prominent and primary modes of human-to-human and human-to-machine communication in various fields. Present-day speech communication systems are severely degraded by various types of unwanted random sound, which make the listening task difficult for a direct listener and cause inaccurate transfer of information [1]. Two common types of noise that badly affect speech communication systems are stationary noise and non-stationary noise. The spectrum of stationary noise does not change much or suddenly over time, whereas non-stationary noise contains rapid or large changes in the spectrum over time. Speech enhancement is the method of removing or reducing noise to improve perceptual aspects of noisy speech (i.e. quality and intelligibility).
Current speech enhancement algorithms can estimate stationary noise but cannot estimate non-stationary noise. Most enhancement algorithms include a processing stage called noise estimation, which is essential for removing the noise from noisy speech [2]. Current algorithms fail in non-stationary noise because they lack proper background noise estimation. Spectral subtraction is one of the best-known speech enhancement methods; it is computationally simple, involving only a single forward and inverse transform, and can effectively remove background noise from noisy speech [3]. A number of improved methods for separating broadband signals based on spectral subtraction have been proposed. An improved version of the Spectral Subtraction (SS) algorithm was published in [4] to minimize the annoying noise. This method uses two additional parameters: the over-subtraction factor and the spectral floor parameter. The over-subtraction factor controls the amount of noise power spectrum subtracted from the noisy speech power spectrum. The spectral floor parameter prevents the spectral components of the resultant spectrum from falling below a preset minimum level, rather than setting them to zero. This implementation assumes that the noise affects the speech spectrum uniformly, and its performance is limited by the use of fixed subtraction parameters, which are difficult to set for other real-world noise. In real-world environments, the noise spectrum is non-uniform over the entire spectrum. To take into account the fact that colored noise affects the speech spectrum differently at different frequencies, a multiband approach with linear frequency spacing, built on the spectral over-subtraction of [4], was presented in [5]. In this algorithm, the noisy speech spectrum is divided into N non-overlapping bands, and spectral subtraction is performed separately in each band.
This algorithm re-adjusts the over-subtraction factor in each band. Because real-world noise is highly random in nature, an improvement of the multiband spectral subtraction algorithm for the reduction of White Gaussian Noise (WGN) is required.

In this paper, we propose a new spectral subtractive algorithm in which the speech and noise spectra are estimated based on log spectral minimization. In this algorithm, autoregressive (AR) modeling is used to obtain smooth spectra of speech and noise. Codebooks of speech and noise are generated using AR modeling. The noise and speech spectra are then reconstructed by searching the codebooks. The estimated spectra are subtracted from the noisy speech signal to obtain the enhanced speech. This enhanced speech is passed to the tapering method to obtain a clean enhanced speech signal.

The remaining part of the paper is organized as follows. Section 2 presents details of Log Spectral Minimization (LSM). Section 3 describes the tapering algorithm together with the LSM algorithm. Results comparing the proposed method with LSM and tapering are given in Section 4. Section 5 gives the conclusions drawn from this study.

### II. LOG SPECTRAL MINIMIZATION (LSM) ALGORITHM

#### 2.1 CONVENTIONAL ALGORITHM

The noisy signal can be modeled as the sum of the clean speech and the noise signal [6, 7]:

$$y(n) = s(n) + d(n), \quad n = 0, 1, \ldots, (N-1) \tag{1}$$

where n is the discrete time index and N is the number of samples in a frame. Here y(n), s(n) and d(n) are the $n^{th}$ samples of the discrete-time noisy speech, clean speech and random noise signals respectively. Because speech is non-stationary, with spectral properties that vary in time, the short-time Fourier transform (STFT) is usually used to divide the speech signal into short frames for further processing.
Now representing the STFT of the time-windowed signals by $Y_W(\omega)$, $D_W(\omega)$, and $S_W(\omega)$, (1) can be written as

$$Y_W(\omega) = S_W(\omega) + D_W(\omega) \tag{2}$$

where $\omega$ is the discrete frequency index of the frame. From expression (2), the spectral magnitude of noisy speech can be approximated as

$$|Y_W(\omega)| = |S_W(\omega)| + |D_W(\omega)| \tag{3}$$

The short-time power spectrum of the noisy speech is obtained by multiplying $Y_W(\omega)$ by its complex conjugate $Y_W^*(\omega)$. Neglecting the cross terms between speech and noise, which are assumed to be uncorrelated, equation (3) can then be written as

$$|Y_W(\omega)|^2 = |S_W(\omega)|^2 + |D_W(\omega)|^2 \tag{4}$$

It is desired to choose $|\hat{S}_W(\omega)|$ so as to minimize the error

$$E_W(\omega) = \left| \left| \hat{S}_W(\omega) \right|^2 - \left| S_W(\omega) \right|^2 \right| \tag{5}$$

$$E_{W}(\omega) = \left| \left| \hat{S}_{W}(\omega) \right|^{2} - |Y_{W}(\omega)|^{2} + E\{|D_{W}(\omega)|^{2}\} \right| \tag{6}$$

Expression (6) can be minimized by choosing

$$\left|\hat{S}_{W}(\omega)\right|^{2} = |Y_{W}(\omega)|^{2} - \left|\hat{D}_{W}(\omega)\right|^{2} \tag{7}$$

where $|\hat{S}_W(\omega)|^2$ is the short-time power spectrum of the estimated speech and $|\hat{D}_W(\omega)|^2$ is the average noise power, which is normally estimated and updated at every speech frame. In this method, the subtraction needs to be done carefully to avoid speech distortion. The spectrum obtained after subtraction may contain some negative values due to inaccurate estimation of the noise spectrum. A half-wave rectifier is frequently used in the spectral subtraction algorithm due to its superior noise suppression capability.
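As a minimal single-frame sketch of equation (7) with half-wave rectification, the NumPy function below subtracts a noise power estimate from the noisy power spectrum and clips negative results to zero. The function name `spectral_subtract_frame`, the use of a plain FFT on one frame, and the reuse of the noisy phase for resynthesis are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def spectral_subtract_frame(y, noise_psd):
    """Power spectral subtraction for one frame (illustrative sketch).

    y         : real-valued noisy frame
    noise_psd : estimated noise power per rfft bin (scalar or array)
    """
    Y = np.fft.rfft(y)
    # Equation (7): subtract the noise power estimate from |Y|^2,
    # then half-wave rectify so that no bin has negative power.
    clean_psd = np.maximum(np.abs(Y) ** 2 - noise_psd, 0.0)
    # Assumption: the phase of the noisy frame is kept for resynthesis.
    S = np.sqrt(clean_psd) * np.exp(1j * np.angle(Y))
    return np.fft.irfft(S, n=len(y))
```

With a zero noise estimate the frame is returned unchanged, and with a very large estimate every bin is clipped to zero, which is a quick sanity check on the rectification step.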
Thus, the complete algorithm is given by

$$|\hat{S}_{W}(\omega)|^{2} = \begin{cases} |Y_{W}(\omega)|^{2} - |\hat{D}_{W}(\omega)|^{2}, & \text{if } |Y_{W}(\omega)|^{2} > |\hat{D}_{W}(\omega)|^{2} \\ 0, & \text{otherwise} \end{cases} \tag{8}$$

The enhanced speech is reconstructed by taking the inverse short-time Fourier transform (ISTFT) of the enhanced spectrum, using the phase $\varphi_y(\omega)$ of the noisy speech:

$$\hat{s}_{W}(n) = ISTFT\left(|\hat{S}_{W}(\omega)| \cdot \exp(j\varphi_{y}(\omega))\right) \tag{9}$$

Expression (9) gives the enhanced speech signal. The block diagram of this conventional algorithm is shown in Figure 1.

Figure 1. Conventional Enhancement Algorithm

#### 2.2 LSM ALGORITHM

This algorithm consists of three stages: the codebook construction stage, the spectrum estimation stage and the spectral subtraction stage.

#### 2.2.1 Code Book Construction Stage

In this stage, clean speech corpus signals and noise corpus signals are considered. These signals are trained using AR modeling and the AR coefficients are stored in codebooks.

#### 2.2.1.1 AR modeling of speech and noise corpus

In statistics and signal processing, an autoregressive (AR) model is a type of random process which is often used to model and predict various types of natural phenomena. The autoregressive model is one of a group of linear prediction formulas that attempt to predict an output of a system based on the previous outputs. The training of the speech codebook and the noise codebook is based on the AR model, using the smooth spectrum to approximate the noisy signal. The model is defined as

$$x(n) = -m_1 x(n-1) - m_2 x(n-2) - \dots - m_N x(n-N) + \varepsilon(n) \tag{10}$$

where x(n) is the output signal, $\varepsilon(n)$ is white noise with zero mean and variance $\sigma_{\varepsilon}^2$, and $(m_1, m_2, \dots, m_N)$ are the parameters of the process. The coefficients in expression (10) can be evaluated using the Yule-Walker method.
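The Yule-Walker estimation of the coefficients in equation (10) can be sketched as follows. This is a NumPy illustration under assumed names (`yule_walker`, `ar_magnitude`, `simulate_ar`) and with biased autocorrelation estimates; the source does not specify these details. `ar_magnitude` evaluates the magnitude of the AR polynomial $1 + \sum_k m_k e^{-jk\omega}$ on a frequency grid, which is the kind of smooth spectral shape a codebook entry represents.

```python
import numpy as np

def yule_walker(x, order):
    """Estimate AR parameters m_1..m_p and the driving-noise variance.

    Sign convention follows equation (10): x(n) + sum_k m_k x(n-k) = eps(n).
    """
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    n = len(x)
    # Biased autocorrelation estimates r(0), ..., r(order).
    r = np.array([np.dot(x[:n - k], x[k:]) / n for k in range(order + 1)])
    # Toeplitz autocorrelation matrix R[i, j] = r(|i - j|).
    R = np.array([[r[abs(i - j)] for j in range(order)] for i in range(order)])
    m = np.linalg.solve(R, -r[1:])
    sigma2 = r[0] + np.dot(m, r[1:])
    return m, sigma2

def ar_magnitude(m, n_freq=256):
    """|1 + m_1 e^{-jw} + ... + m_p e^{-jpw}| for w in [0, pi]."""
    w = np.linspace(0.0, np.pi, n_freq)
    k = np.arange(1, len(m) + 1)
    return np.abs(1.0 + np.exp(-1j * np.outer(w, k)) @ np.asarray(m, dtype=float))

def simulate_ar(m, n_samples, seed=0):
    """Generate a test signal from equation (10) with unit-variance white noise."""
    rng = np.random.default_rng(seed)
    eps = rng.standard_normal(n_samples)
    x = np.zeros(n_samples)
    for n in range(n_samples):
        past = sum(m[k] * x[n - 1 - k] for k in range(len(m)) if n - 1 - k >= 0)
        x[n] = eps[n] - past
    return x
```

For example, `yule_walker(simulate_ar([-0.75, 0.5], 20000), 2)` should return coefficients close to the ones used for the simulation, since a long realization of a stable AR process gives reliable autocorrelation estimates.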
Fig 2 shows the clean speech spectrum before and after AR modeling with 3 AR coefficients. After estimating the coefficients, the codebook entries can be constructed using the expression

$$|m_{x}(\omega)| = |1 + m_1 e^{-j\omega} + m_2 e^{-2j\omega} + \dots + m_N e^{-Nj\omega}| \tag{11}$$

Fig 2. (a) FFT spectrum (b) AR modeled spectrum

#### 2.2.2 Spectrum Estimation Stage

The spectrum estimation stage can be further divided into two stages: the variance estimation stage and the codebook searching stage.

#### 2.2.2.1 Variance Estimation Stage

After the codebook generation stage, the variances of the clean speech signal and the noise signal are calculated for each combination of frames using

$$\begin{bmatrix} \sigma_{s}^{2} \\ \sigma_{d}^{2} \end{bmatrix} = \begin{bmatrix} \int \frac{|m_{y}(\omega)|^{4}}{\sigma_{y}^{2}|m_{s}(\omega)|^{4}} d\omega & \int \frac{|m_{y}(\omega)|^{4}}{\sigma_{y}^{2}|m_{s}(\omega)|^{2}|m_{d}(\omega)|^{2}} d\omega \\ \int \frac{|m_{y}(\omega)|^{4}}{\sigma_{y}^{2}|m_{s}(\omega)|^{2}|m_{d}(\omega)|^{2}} d\omega & \int \frac{|m_{y}(\omega)|^{4}}{\sigma_{y}^{2}|m_{d}(\omega)|^{4}} d\omega \end{bmatrix}^{-1} \begin{bmatrix} \int \frac{|m_{y}(\omega)|^{2}}{|m_{s}(\omega)|^{2}} d\omega \\ \int \frac{|m_{y}(\omega)|^{2}}{|m_{d}(\omega)|^{2}} d\omega \end{bmatrix} \tag{12}$$

where $\sigma_y^2$ and $m_y(\omega)$ are the variance and codebook entry of the noisy speech respectively.
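Equation (12) amounts to solving a 2×2 linear system for each pair of speech and noise codebook entries. The sketch below does this with NumPy on a uniform frequency grid. The function name `estimate_variances` and the use of a grid mean in place of the integral are assumptions made for illustration; the constant factor of the integral appears in both the matrix and the right-hand side and therefore cancels.

```python
import numpy as np

def estimate_variances(ms2, md2, my2, sigma_y2):
    """Solve equation (12) for (sigma_s^2, sigma_d^2).

    ms2, md2, my2 : |m_s(w)|^2, |m_d(w)|^2, |m_y(w)|^2 sampled on the
                    same uniform frequency grid
    sigma_y2      : variance (gain) of the noisy-speech AR model
    """
    ms2, md2, my2 = (np.asarray(a, dtype=float) for a in (ms2, md2, my2))
    integ = np.mean  # stands in for the integral over frequency
    a11 = integ(my2 ** 2 / (sigma_y2 * ms2 ** 2))
    a12 = integ(my2 ** 2 / (sigma_y2 * ms2 * md2))
    a22 = integ(my2 ** 2 / (sigma_y2 * md2 ** 2))
    A = np.array([[a11, a12], [a12, a22]])
    b = np.array([integ(my2 / ms2), integ(my2 / md2)])
    return np.linalg.solve(A, b)
```

When the noisy spectrum $\sigma_y^2/|m_y(\omega)|^2$ is exactly the sum $\sigma_s^2/|m_s(\omega)|^2 + \sigma_d^2/|m_d(\omega)|^2$, the system returns the true pair of variances, provided the speech and noise spectral shapes are not proportional to each other (otherwise the matrix is singular).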
#### 2.2.2.2 Code Book Searching Stage

After estimating the variances in the above stage, the log-spectral distance is calculated for every combination of clean speech and noise signal frames using

$$d_{LS} = \frac{1}{2\pi} \int \left| \ln \left( \frac{\sigma_s^2}{|m_s(\omega)|^2} + \frac{\sigma_d^2}{|m_d(\omega)|^2} \right) - \ln \left( \frac{\sigma_y^2}{|m_y(\omega)|^2} \right) \right|^2 d\omega \tag{13}$$

The combination which gives the minimum $d_{LS}$ is used to estimate the noise and speech spectra using the expressions

$$\hat{S}_{AR}(\omega) = \frac{\sigma_s}{|m_s(\omega)|} \qquad \hat{D}_{AR}(\omega) = \frac{\sigma_d}{|m_d(\omega)|} \tag{14}$$

The same method is repeated for every combination of variances obtained from the above stage, for each frame combination of speech and noise.

#### 2.2.3 Spectral Subtraction Stage

The speech and noise spectra estimated in the above stage are used in the subtraction from the noisy speech spectrum, which is obtained by applying the short-time Fourier transform (STFT). The spectral subtraction is carried out as follows:

$$\hat{S}(\omega) = \left(|Y(\omega)| - |\hat{D}(\omega)|\right)e^{j\phi_Y(\omega)} = \left(1 - \frac{|\hat{D}_{AR}(\omega)|}{|\hat{S}_{AR}(\omega)| + |\hat{D}_{AR}(\omega)|}\right) \cdot Y(\omega) = H(\omega)Y(\omega) \tag{15}$$

where

$$H(\omega) = 1 - \frac{|\hat{D}_{AR}(\omega)|}{|\hat{S}_{AR}(\omega)| + |\hat{D}_{AR}(\omega)|} \tag{16}$$

Here $Y(\omega)$ and $\hat{S}(\omega)$ are the noisy speech spectrum and the enhanced speech spectrum respectively, and $\hat{S}_{AR}$ and $\hat{D}_{AR}$ are the best speech and noise entries chosen from the codebooks by log spectral minimization. After obtaining $\hat{S}(\omega)$, the time-domain signal is obtained as

$$\hat{s}(n) = ISTFT\left(|\hat{S}(\omega)| \cdot \exp(j\varphi_{y}(\omega))\right) \tag{17}$$

where $\hat{s}(n)$ is the enhanced speech signal and $\varphi_y(\omega)$ is the phase of the noisy speech signal.

### III. LSM ALGORITHM WITH TAPERING ALGORITHM

The multitaper method shows good bias and variance properties for power spectrum estimation. Direct spectrum estimation based on Hamming windowing is the most often used power spectrum estimation method for speech enhancement [8]. Windowing reduces the bias but not the variance of the spectral estimate. The idea behind the multitaper spectrum estimator is to reduce this variance by computing a small number L of direct spectrum estimators, each with a different taper, and then averaging the L spectral estimates. The multitaper spectrum estimator is given by

$$s(\omega) = \frac{1}{L} \sum_{k=0}^{L-1} \left| \sum_{t=0}^{N-1} h_k(t) x(t) e^{-j\omega t} \right|^2 \tag{18}$$

where N is the taper length and $h_k(t)$ is the k-th data taper used for the spectral estimate. The tapers $h_k(t)$ are chosen to be orthonormal. One family of orthogonal tapers is the sine tapers, given by

$$h_k(m) = \sqrt{\frac{2}{N+1}} \sin\left(\frac{\pi k m}{N+1}\right), \qquad m = 1, \dots, N \tag{19}$$

Wavelet thresholding can be used for further smoothing of the spectrum. The underlying idea is to represent the estimated log spectrum as a signal plus noise, where the signal is the true log spectrum and the noise is the estimation error. If the noise is Gaussian, then standard wavelet denoising techniques can be used to eliminate the noise and obtain better spectral estimates. Speech enhancement is done using this thresholding technique.

The block diagram of the proposed algorithm is shown in Figure 3.

Fig 3. Block diagram of LSM with Tapering algorithm

The noisy signal given to the LSM algorithm is modeled with the AR modeling technique. The speech and noise estimates are determined using the log spectral distortion of the noisy entries with respect to the clean speech corpus entries and noise corpus entries, according to equation (13). These estimates are subtracted from the noisy speech spectrum to get the enhanced signal.
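The multitaper estimate of equation (18) with the sine tapers of equation (19) can be sketched as below. This is a NumPy illustration rather than the authors' code: the names `sine_tapers` and `multitaper_psd` are assumed, the taper index runs over k = 1, ..., L (k = 0 would give an all-zero sine taper), and the spectrum is evaluated on the FFT frequency grid.

```python
import numpy as np

def sine_tapers(N, L):
    """Return an (L, N) array whose rows are the sine tapers of equation (19)."""
    m = np.arange(1, N + 1)
    k = np.arange(1, L + 1)[:, None]
    return np.sqrt(2.0 / (N + 1)) * np.sin(np.pi * k * m / (N + 1))

def multitaper_psd(x, L=5):
    """Average of L tapered periodograms of one frame, as in equation (18)."""
    x = np.asarray(x, dtype=float)
    tapers = sine_tapers(len(x), L)
    # One direct spectrum estimate per taper (rows), then the average.
    spectra = np.abs(np.fft.rfft(tapers * x, axis=1)) ** 2
    return spectra.mean(axis=0)
```

The rows returned by `sine_tapers` are orthonormal, which is the property the estimator relies on. Increasing L lowers the variance of the estimate at the cost of frequency resolution.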
The enhanced signal obtained from the LSM algorithm is fed to the tapering technique, which further smooths the signal to improve the SNR of the speech signal.

#### IV. EXPERIMENTAL RESULTS

To evaluate the proposed algorithm, we considered a clean speech signal of the letters "y" and "b", shown in Figure 4.a. The noise signals babble, F16 and M109 were taken from the NOISEX-92 corpus. Speech and noise signals were sampled at 8 kHz, and the STFT was computed with a 30 ms rectangular window without overlap. We created databases of noisy signals with these noise types. Signals were processed using the LSM algorithm and the tapering algorithm separately, and also with the proposed method, which combines LSM with tapering by passing the output of the LSM algorithm to the tapering method. For objective evaluation of the algorithms we considered the overall SNR. The overall SNR of an enhanced signal is given by

$$SNR = 10 \log_{10}\left(\frac{\sum c(t)^2}{\sum (c(t) - e(t))^2}\right) \tag{20}$$

where c(t) is the clean speech and e(t) is the enhanced speech. This SNR is calculated for LSM, tapering, and LSM followed by tapering.

Table 1. SNR values (dB) for the above algorithms

| Algorithm | Noise | SNR = -5 dB | SNR = 0 dB | SNR = 5 dB | SNR = 10 dB |
|-------------------|--------|--------|--------|--------|--------|
| LSM | F16 | 1.9453 | 3.6366 | 7.6441 | 7.7450 |
| Tapering | F16 | 2.7052 | 4.3716 | 7.8334 | 7.9994 |
| LSM with tapering | F16 | 3.5139 | 5.7152 | 8.1715 | 9.4241 |
| LSM | M109 | 2.4329 | 3.8106 | 5.5749 | 7.4289 |
| Tapering | M109 | 2.8943 | 4.0165 | 6.7615 | 8.1965 |
| LSM with tapering | M109 | 3.5205 | 5.3217 | 7.3051 | 8.8608 |
| LSM | Babble | 2.4236 | 4.2717 | 6.3097 | 8.1068 |
| Tapering | Babble | 3.4114 | 5.4145 | 7.0126 | 8.4064 |
| LSM with tapering | Babble | 3.8683 | 6.1622 | 8.1519 | 9.4029 |

These overall SNR values are given in Table 1.
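The overall SNR of equation (20) is straightforward to compute; a small standard-library sketch is given below. The helper names are assumed for illustration. The second helper expresses the gain of one method over another as a percentage of the dB values in Table 1, which appears to be how the improvement figures quoted in the conclusion were obtained; that reading is an assumption.

```python
import math

def overall_snr_db(clean, enhanced):
    """Overall SNR of equation (20), in dB."""
    signal = sum(c * c for c in clean)
    error = sum((c - e) ** 2 for c, e in zip(clean, enhanced))
    return 10.0 * math.log10(signal / error)

def relative_gain_percent(snr_new, snr_ref):
    """Relative SNR improvement of one method over another, in percent."""
    return 100.0 * (snr_new - snr_ref) / snr_ref

# Table 1, F16 noise at -5 dB: LSM with tapering vs. tapering alone.
gain_f16 = relative_gain_percent(3.5139, 2.7052)
# Table 1, babble noise at 0 dB: LSM with tapering vs. LSM alone.
gain_babble = relative_gain_percent(6.1622, 4.2717)
```

Note that equation (20) is undefined when the enhanced signal equals the clean signal exactly, because the error energy in the denominator is then zero.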
The clean speech signal shown in Figure 4.a was mixed with the above noise types at the SNR levels listed in Table 1. Figure 4.b shows the signal with babble noise added at -5 dB SNR. The output signals of the LSM, tapering, and LSM with tapering methods are shown in Figures 4.c, 4.d and 4.e respectively, and the corresponding spectrograms are shown in Figure 5.

Fig 4.a Clean speech of letters "y", "b"
Fig 4.b Noisy speech of letters "y", "b"
Fig 4.c Enhanced speech after LSM algorithm alone
Fig 4.d Enhanced speech after Tapering algorithm alone
Fig 4.e Enhanced speech after LSM with tapering technique

Fig 5.a Clean speech
Fig 5.b Noisy signal
Fig 5.c Enhanced signal after LSM algorithm
Fig 5.d Enhanced speech after Tapering alone
Fig 5.e Enhanced speech after LSM with tapering technique

#### V. CONCLUSION

The proposed algorithm can be used in realistic environments where the background noise is non-stationary. As the results in Table 1 show, the proposed method performs better than the conventional methods. It shows a 29% SNR improvement over the tapering method for F16 noise at -5 dB SNR, and a 44% SNR improvement over the LSM algorithm for babble noise at 0 dB SNR.

#### **ACKNOWLEDGEMENT**

The authors wish to thank Dr. M. Uttara Kumari, HOD & Professor of the E.C.E Department, RV College of Engineering, Bangalore, and Dr. K. Padma Raju, Principal & Professor of JNTU College of Engineering, Kakinada, for stimulating this area of research and for their helpful discussions and encouragement throughout the course of this work.

#### REFERENCES

- [1] Y. Ephraim, "Statistical model based speech enhancement systems," in Proceedings of IEEE, vol. 80, no. 10, pp. 1526-1555, Oct. 1992.
- [2] Philipos C. Loizou, Gibak Kim, "Reasons why current speech enhancement algorithms do not improve speech intelligibility and suggested solutions," IEEE Transactions on Audio, Speech and Language Processing, vol. 19, no. 1, January 2011.
- [3] Navneet Upadhyay and Abhijit Karmakar, "The Spectral Subtractive-Type Algorithms for Enhancement of Noisy Speech: A Review," International Journal of Research and Reviews in Signal Acquisition and Processing, Vol. 1, No. 3, September 2011.
- [4] M. Berouti, R. Schwartz, and J. Makhoul, "Enhancement of speech corrupted by acoustic noise," in Proceedings of International Conference on Acoustic, Speech, and Signal Processing, pp. 208-211, April 1979.
- [5] S. Kamath, and P. Loizou, "A multi-band spectral subtraction method for enhancing speech corrupted by colored noise," in Proceedings of International Conference on Acoustic, Speech, and Signal Processing, Orlando, USA, vol. 4, pp. IV-4164, May 2002.
- [6] S. F. Boll, "Suppression of acoustic noise in speech using spectral subtraction," IEEE Transaction on Acoustic, Speech, and Signal Processing, vol. 27, no. 2, pp. 113–120, 1979.
- [7] S. F. Boll, "A spectral subtraction algorithm for suppression of acoustic noise in speech," in Proceedings of International Conference on Acoustic, Speech, and Signal Processing, pp. 200-203, 1979.
- [8] Yi Hu and Philipos C. Loizou, "A New Speech Enhancement Method based on wavelet thresholding the multi taper spectrum", in part by Grant No. R01 DC03421 from NIDCD/NIH, 2003.

### **Detection of Breast Cancer Using Microwave Absorption Loss**

#### Ponnuraj Kirthi Priya & S. Poonguzhali

Department of ECE, College of Engineering Guindy, Anna University, Chennai, India
E-Mail: pkirthipriya7@yahoo.co.in & poongs@annauniv.edu

Abstract - The search for alternative methods of breast cancer detection that are more accurate and less harmful has led to the possibility of using microwaves.
Electromagnetic (EM) simulations of normal and abnormal breasts indicate that the absorption loss for a cancerous breast is higher than for a normal breast, and this can be used to detect breast cancer. This paper presents a novel method for detecting a tumor by illuminating the breast with an ultra-wideband signal of 3.3 to 8.2 GHz and identifying the coordinates of the maximum value of the specific absorption rate (SAR). Results show that at 4 GHz these coordinates could detect a 5 mm tumor at five different positions within the breast, and the same was obtained on increasing the size of the breast. Also, as the mass of the normal breast increases, the absorption loss values increase.

Keywords - Specific absorption rate (SAR); ultra-wideband (UWB); microwave detection; breast cancer.

#### I. INTRODUCTION

Breast cancer is a matter of high concern to women today. Early detection is the best way to combat it and increase survival rates. The method most widely used for the detection of breast cancer is x-ray mammography. While it produces good quality images at low doses of radiation, it has been shown to have particular limitations, most significantly a number of false-negative [1] and false-positive [2] diagnoses. It can be used only for women above the age of 40. Further, it is uncomfortable, and the breast has to be repositioned for different views. Other modalities have been less successful in the detection of breast cancer. For example, ultrasound can only image dense breast tissue to determine whether a tumor is a fluid-filled cyst or a solid mass, but cannot image areas deep inside the breast. Magnetic resonance imaging (MRI) is excellent at imaging around breast implants but it is expensive. The search for an alternative low-cost technique to detect breast cancer at an early stage has led to the possibility of using microwaves, as the technique is noninvasive, avoids exposure to ionizing radiation and does not require compression of the breast.
This technology depends on the detectable intrinsic contrast in dielectric properties between a tumor and its surrounding normal breast tissue at microwave frequencies [3]. Since this contrast is present at the early stages of tumor development, it is highly suitable for diagnosing breast cancer at its starting stages. Microwave imaging aims to reconstruct an image of the breast, mainly using two different approaches: microwave tomography [4] and radar-based imaging [5]. Microwave detection deals with determining whether a tumor is present in a breast. Yusoff et al. [6] characterized the absorption loss of heterogeneous normal and tumor-affected tissues using ultra-wideband signals. Their results show that as the frequency increases, the absorption loss increases, and that this varies with changes in electrical properties. Elsewe [7] introduced the idea of using electromagnetic (EM) absorption loss for detecting breast cancer. The simulations in [7] show that at 915 MHz the absorption loss for an infected breast is higher than that of the normal breast and is least affected by tumor location. Also, this frequency had the best linear curve fit and resolution. The specific absorption rate (SAR) is the rate at which energy is absorbed in body tissue and has the unit W/kg [8].

In the following sections, we demonstrate the feasibility of using the coordinates of the maximum value of SAR for detecting the location of tumors inside breasts of different sizes. The absorption loss of a normal breast and of a tumor-infected breast is determined and compared. The absorption values for different tumor locations at different frequencies are also analyzed.

#### II. MODELS

#### A. System Configuration

The EM model is created using CST Microwave Studio, which is a specialized tool for the simulation of high-frequency problems. It consists of a breast phantom and an ultra-wideband antenna placed below the phantom, as illustrated in figure 1.
The breast is modeled in the prone position as a hemisphere of homogeneous breast tissue with a radius of either 50 mm or 60 mm. The skin layer is not included, to decrease the complexity of the model and to reduce the simulation run time. A 5 mm tumor is embedded into the breast model at 6 different locations to study its effect on absorption loss. The density of the breast is 928 kg/m³ and that of the tumor is 1041 kg/m³ [9]. The frequency dependence of the dielectric constant and conductivity over the entire bandwidth was incorporated in the model using first-order Debye dispersion [10]. The Debye parameters were selected according to the published data for normal ( $\epsilon_s$ =10, $\epsilon_\infty$ =54, $\sigma_s$ =7ps) and malignant ( $\epsilon_s$ =54, $\epsilon_\infty$ =4, $\sigma_s$ =7ps) tissue.

Figure 1. Breast Model.

#### B. Antenna Model

A wide slot antenna was placed 5 mm away from the breast to allow for good penetration. One side of the substrate consists of a rectangular slot set in a ground plane of size 56 mm by 57 mm. On the other side, a 50 Ω microstrip feed line with a fork-shaped tuning stub is placed symmetrically with respect to the center line of the wide slot. This fork feed increases the operational bandwidth [11]. The height of the dielectric substrate is 0.82 mm and its relative permittivity is 3.38. The antenna is excited with a Gaussian pulse, and the simulated return loss of the wide slot antenna is below -10 dB between 3.3 GHz and 8.2 GHz, as shown in figure 3. The input power to the antenna is 1 W rms.

Figure 2. Wide slot antenna geometry (not to scale).

Figure 3. Simulated S<sub>11</sub> characteristics of the wide slot antenna.

#### C. Specific Absorption Rate

The absorption loss is derived from the Specific Absorption Rate. The Specific Absorption Rate (SAR) is defined as the time derivative of the incremental energy (dW) absorbed by an incremental mass (dm) contained in a volume element (dV) of a given mass density $(\rho)$ .
It is expressed as [12]

$$SAR = \frac{d}{dt} \left( \frac{dW}{dm} \right) = \frac{d}{dt} \left( \frac{dW}{\rho \, dV} \right) \tag{1}$$

As specified by the IEEE C95.3 standard, the typical local SAR value is averaged over 1 g of tissue mass. The normal breast model of radius 50 mm has a tissue mass of 0.242952 kg, and the same breast model with a 5 mm tumor has a tissue mass of 0.243012 kg. For a 60 mm radius, the normal breast model has a tissue mass of 0.419822 kg and the abnormal breast has a tissue mass of 0.419881 kg. The total absorbed power in these models is calculated using the following equation:

$$P_{abs} = SAR_{total} \times \text{Tissue mass} \tag{2}$$

#### III. RESULTS AND DISCUSSIONS

Tables I and II summarize the values of total SAR and maximum SAR, together with the coordinates of the maximum, obtained using equation (1) in a normal and an abnormal breast respectively. Both the total and maximum SAR values are higher in the tumor-affected breast than in the normal breast. The coordinates of the maximum value of SAR point to the position of the tumor within the tumor-affected breast. This indicates that the maximum local SAR occurs in the tumor, whereas in the normal breast the coordinates of the maximum value of SAR point to locations close to the breast surface, indicating that the maximum local SAR occurs in the breast layer close to the antenna.
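The tissue masses quoted above and equation (2) can be checked with a few lines of standard-library Python. This is a sketch under stated assumptions: the breast is an ideal hemisphere of density 928 kg/m³, the tumor is treated as a sphere of radius 5 mm that replaces an equal volume of breast tissue (the quoted mass difference is consistent with that reading), and the dB figure is taken as the ratio of the two absorbed powers. The SAR inputs are the 4 GHz total SAR values of Tables I and II; the function names are hypothetical.

```python
import math

RHO_BREAST = 928.0   # kg/m^3, breast tissue density given in the text
RHO_TUMOR = 1041.0   # kg/m^3, tumor density given in the text

def hemisphere_mass(radius_m, rho=RHO_BREAST):
    """Mass of a homogeneous hemisphere of the given radius."""
    return rho * (2.0 / 3.0) * math.pi * radius_m ** 3

def mass_with_tumor(radius_m, tumor_radius_m):
    """Hemisphere mass when a spherical tumor replaces breast tissue (assumption)."""
    v_tumor = (4.0 / 3.0) * math.pi * tumor_radius_m ** 3
    return hemisphere_mass(radius_m) + (RHO_TUMOR - RHO_BREAST) * v_tumor

def absorbed_power(sar_total, mass_kg):
    """Equation (2): P_abs = SAR_total x tissue mass, in watts."""
    return sar_total * mass_kg

# 4 GHz total SAR values (Tables I and II) with the masses quoted in the text.
p_normal = absorbed_power(2.02216, 0.242952)
p_tumor = absorbed_power(2.17212, 0.243012)
# Assumed reading of the dB comparison: ratio of the two absorbed powers.
diff_db = 10.0 * math.log10(p_tumor / p_normal)
```

Small differences between `hemisphere_mass` and the masses quoted in the text are to be expected, since the simulator works on a discretized mesh rather than an ideal hemisphere.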
TABLE I. VALUES AND COORDINATES OF 1-G AVERAGED LOCAL SAR IN NORMAL BREAST MODEL

| Frequency (GHz) | Total SAR (W/kg) | Max SAR (W/kg) | Max at x,y,z (mm) |
|-----------------|------------------|----------------|----------------------|
| 4 | 2.02216 | 16.4851 | -0.46, -29.32, 1.22 |
| 5 | 1.90832 | 17.7831 | -0.46, -44.71, 8.88 |
| 6 | 1.78657 | 23.1156 | -0.465, -44.71, 8.88 |
| 7 | 1.67803 | 24.5803 | -0.465, -44.71, 6.94 |
| 8 | 1.55828 | 27.9862 | -0.465, -44.71, 5.98 |

TABLE II. VALUES AND COORDINATES OF 1-G AVERAGED LOCAL SAR IN BREAST MODEL WITH TUMOR

| Frequency (GHz) | Total SAR (W/kg) | Max SAR (W/kg) | Max at x,y,z (mm) |
|-----------------|------------------|----------------|----------------------|
| 4 | 2.17212 | 64.2304 | 0.23, -25.20, 0.203 |
| 5 | 1.9835 | 39.2782 | 0.23, -25.20, 0.20 |
| 6 | 1.80413 | 38.1918 | -0.23, -25.20, 0.61 |
| 7 | 1.68272 | 26.4515 | 0.23, -25.20, 0.61 |
| 8 | 1.56872 | 27.9589 | -0.23, -44.69, 5.33 |

From the total SAR values, the absorption loss is calculated using equation (2). Figure 4 compares the absorption loss of the normal and abnormal breast models. As the frequency increases, the penetration depth decreases and thus less power is absorbed. This is indicated by the decrease in SAR values and absorption loss in both models. At 4 GHz the maximum variation between the two models is 0.31 dB, due to the presence of a 5 mm tumor. If an accepted range of values for absorption loss is determined for a normal breast, any deviation from these values will help in determining that a tumor is present in the breast. No attempt to determine this range is made here.

Figure 4. Comparison of absorption loss in normal and abnormal breast.

Table III summarizes the coordinates of the maximum value of SAR when the tumor is placed inside the breast at six different locations.
Of the six locations, five were detected successfully. On changing the radius of the breast to 60 mm, the coordinates of the maximum value of SAR were still able to indicate the location of the tumor. This is shown in Table IV. Thus, by first establishing that an abnormality is present, we can then try to identify the location of the tumor.

TABLE III. DETECTION OF A 5 MM TUMOR LOCATED IN A 50 MM BREAST MODEL AT 4 GHZ

| Actual position of tumor at x,y,z (mm) | Detected position of tumor at x,y,z (mm) |
|----------------------------------------|------------------------------------------|
| 0,-25,0 | 0.2325, -25.2083, 0.20375 |
| 0,-10,0 | 0.2325, -8.54167, -0.208333 |
| -5,-40,5 | -4.73611, -38.9583, 4.78938 |
| 20,-15,-15 | 19.7917, -14.375, -14.375 |
| -15,-35,20 | -14.375, -35.2083, 18.9583 |
| -20,-30,0 | -0.465, -28.5417, 1.42625 |

TABLE IV. DETECTION OF A 5 MM TUMOR LOCATED IN A 60 MM BREAST MODEL AT 4 GHZ

| Actual position of tumor at x,y,z (mm) | Detected position of tumor at x,y,z (mm) |
|----------------------------------------|------------------------------------------|
| 0,-30,0 | -0.2325, -29.7917, 0.20375 |
| 0,-10,0 | 0.2325, -8.54167, -0.208333 |
| -5,-50,5 | -5.1875, -50.2083, 4.36812 |
| 20,-15,-15 | -0.465, -50.891, 9.84667 |
| -15,-35,20 | -14.7917, -33.9583, 19.375 |
| -20,-30,0 | -19.7917, -29.7917, -0.208333 |

Figure 5 compares the absorption loss of breasts of 50 mm and 60 mm radius. When the size of the breast increases, the absorption loss also increases at all the frequencies considered. This shows that a different range of values has to be established for different breast masses.

Figure 5. Comparison of absorption loss in breasts of 50 mm and 60 mm radius.

Table V summarizes the absorption loss values for six different tumor locations at five different frequencies. The standard deviation of absorption loss was found to be lowest at 8 GHz, indicating that it was the least affected by tumor location.
However, the standard deviation of absorption loss was highest at 4 GHz, indicating that the absorption loss at this frequency depended on the tumor location. Thus, this was the best of the five frequencies for detecting the tumor.

TABLE V. ABSORPTION LOSS FOR DIFFERENT TUMOR LOCATIONS IN THE 50 MM BREAST MODEL AT DIFFERENT FREQUENCIES

| Tumor location at x,y,z (mm) | 4 GHz | 5 GHz | 6 GHz | 7 GHz | 8 GHz |
|--------------------|--------|--------|--------|--------|--------|
| 0,-25,0 | 0.5278 | 0.4820 | 0.4384 | 0.4089 | 0.3812 |
| 0,-10,0 | 0.5117 | 0.4854 | 0.4375 | 0.4125 | 0.3806 |
| -5,-40,5 | 0.4964 | 0.5003 | 0.4516 | 0.4193 | 0.3758 |
| 20,-15,-15 | 0.4847 | 0.4623 | 0.4322 | 0.4057 | 0.3765 |
| -15,-35,20 | 0.4963 | 0.4738 | 0.4399 | 0.4097 | 0.3782 |
| -20,-30,0 | 0.4924 | 0.4659 | 0.4336 | 0.4070 | 0.3757 |
| Standard deviation | 0.0156 | 0.0140 | 0.0069 | 0.0049 | 0.0024 |

#### **CONCLUSION**

The simulated data show that the SAR and absorption loss values are higher in a cancerous breast and that these values increase as the mass of the breast increases. At 4 GHz, the penetration depth was the maximum and the absorption loss was location dependent. The coordinates of the maximum value of SAR can be used as an indication for locating a tumor, as they identified 5 out of the 6 positions when the tumor location was changed. The same was found on increasing the breast size. All of this indicates the possibility of using EM absorption loss for detecting breast tumors. Future work is needed to develop realistic breast models and to extend this method by determining an acceptable range of absorption values for normal breast tissue.

#### REFERENCES

- 1) P. T. Huynh, A. M. Jarolimek, and S. Daye, "The false-negative mammogram," Radiograph., vol. 18, no. 5, pp. 1137–1154, 1998.
- 2) J. G. Elmore, M. B. Barton, V. M. Moceri, S. Polk, P. J. Arena, and S. W. Fletcher, "Ten-year risk of false positive screening mammograms and clinical breast examinations," New Eng. J. Med., vol. 338, no. 16, pp. 1089–1096, 1998.
- 3) A. J. Surowiec, S. S. Stuchly, J. R. Barr, and A. Swarup, "Dielectric properties of breast carcinoma and the surrounding tissues," IEEE Trans. Biomed. Eng., vol. BME-35, pp. 257–263, Apr. 1988.
- 4) P. M. Meaney, M. W. Fanning, D. Li, S. P. Poplack, and K. D. Paulsen, "A clinical prototype for active microwave imaging of the breast," IEEE Trans. Microwave Theory Tech., vol. 48, pp. 1841–1853, Nov. 2000.
- 5) S. C. Hagness, A. Taflove, and J. E. Bridges, "Two-dimensional FDTD analysis of a pulsed microwave confocal system for breast cancer detection: Fixed-focus and antenna-array sensors," IEEE Trans. Biomed. Eng., vol. 45, no. 12, pp. 1470–1479, Dec. 1998.
- 6) N.I.M. Yusoff, S. Khatun, S.A. AlShehri, "Characterization of Absorption Loss for UWB Body Tissue Propagation Model," in *International Conference on Communications (MICC)*, pp. 254-258, 2009.
- 7) Mohammed M. Elsewe, "Evaluation of EM Absorption Loss over Breast Mass for Breast Cancer Diagnosis," in *International Conference on Engineering in Medicine and Biological Society (EMBS)*, pp. 3897-3900, 2011.
- 8) International Commission on Non-Ionizing Protection (ICNIRP), "Guidelines for limited exposure to time varying electric, magnetic and electromagnetic fields (up to 300 GHz)," Health Physics, vol. 74, no. 4, pp. 494-522, Apr. 1998.
- 9) A. J. Fenn, *Breast Cancer Treatment by Focused Microwave Thermotherapy*. Sudbury, MA: Jones and Bartlett Publishers, 2007, pp. 39-42.
- 10) E. Zastrow, S. K. Davis, and S. C. Hagness, "Safety assessment of breast cancer detection via ultrawideband microwave radar operating in pulsed-radiation mode," Microwave and Optical Technology Letters, vol. 49, no. 1, pp. 221-225, Jan. 2007.
- 11) Begaud Xavier, "Ultra wideband wide slot antenna with band-rejection characteristics," in *European Conference on Antennas and Propagation (EUCAP)*, pp. 1-6, 2006.
- 12) Recommended Practice for Measurements and Computations of Radio Frequency Electromagnetic Fields With Respect to Human Exposure to Such Fields, 100 kHz-300 GHz, IEEE Standard C95.3, 2002.

## Design and Implementation of Embedded Control Warning System for Vehicle Reversing

#### #R. Rani & \*M. Sudhakara Reddy

#Dept. of ECE, SVCET, Chittoor. \*Dean, School of Electrical Sciences. Email: ranikps30@gmail.com & msreddy36@gmail.com

**Abstract** - Most car drivers use a reverse radar or a reverse camera to check the road behind the vehicle when reverse gear is engaged. Pedestrians, however, can tell whether the vehicle is backing up only by seeing the steadily lit reverse lamps. Because the appearance of the reverse lamps changes little, their warning function for pedestrians remains insufficient. Moreover, their effectiveness depends on where they are installed. We therefore propose a new technique to overcome this issue.

#### I. INTRODUCTION

Several car back-up warning systems are currently on the market, such as reverse radar, reverse camera and reverse audio alarm systems. The reverse radar reminds the driver of the distance to an obstacle behind the vehicle, while the reverse camera lets the driver see the situation behind the vehicle without turning their head. Both serve the driver only and do nothing for pedestrians. Pedestrians can tell that the vehicle is in reverse gear only by seeing the steadily lit reverse lamps.
It is therefore difficult for them to judge the actual situation. This research designs an embedded intelligent car back-up warning system to improve the safety of pedestrians and other drivers on the road. A microcontroller processes the signals from the ultrasonic sensor and the LDR sensors, and the angle of the LED reverse lamp bracket is adjusted and driven automatically according to the result of this logic. The system was tested on a mobile frame of the same height as a real automobile. The installation angle changes with the distance between the test frame and the obstacle, declining automatically from 90 degrees to 0 degrees. The test results show that the system achieves an automatically controlled car back-up warning function.

#### **Block diagram**

#### Working

In this circuit the LDR senses whether it is day or night. If it is night and the car is put into reverse, the ultrasonic sensor detects obstacles behind the vehicle and a motor rotates the LED lamp according to the measured distance. When the distance is short, the lamp is rotated by the corresponding angle so that the light flashes on people behind the vehicle, alerting pedestrians that the vehicle is backing up toward them. The system also alerts the driver with a buzzer, and the obstacle information is shown on the LCD display. If the obstacle is very near the car, the brake is applied automatically so that the speed of the car decreases.

#### Results

Output obtained while executing the program developed using Keil software.

#### **Advantages**

#### > Automatic vehicle control

The vehicle stops automatically when it detects that an obstacle is very near.
#### > Fast response

The sensors detect the obstacle very quickly.

#### > High reliability

#### > System will be stable for a long time

The sensors sense the obstacle continuously and report the distance between the vehicle and the obstacle.

#### **Applications**

#### **Future Scope**

A future extension of our project is an autonomous car, which is currently a research topic. An **autonomous car**, also known as **robotic** or informally as **driverless**, is an autonomous vehicle capable of fulfilling the human transportation capabilities of a traditional car. As an autonomous vehicle, it is capable of sensing its environment and navigating on its own. A human may choose a destination, but is not required to perform any mechanical operation of the vehicle. Autonomous cars are not in widespread use, but their introduction could produce several direct advantages:

- Fewer crashes, due to the autonomous system's increased reliability compared to human drivers.
- Increased roadway capacity, due to the reduced need for safety gaps and the ability to better manage traffic flow.
- Relief of vehicle occupants from driving and navigation chores.
- Removal of constraints on the occupants' state: it would not matter if the occupants were too young, too old or in a frame of mind unsuitable for driving a traditional car. Furthermore, disabilities would no longer matter.
- Elimination of redundant passengers: humans are not required to take the car anywhere, as the robotic car can drive empty to wherever it is required.
- Alleviation of parking scarcity, as cars could drop off passengers, park far away where space is not scarce, and return as needed to pick up passengers.

Indirect advantages are anticipated as well. Adoption of robotic cars could reduce the number of vehicles worldwide, reduce the amount of space required for vehicle parking, and reduce the need for traffic police and vehicle insurance.
Autonomous vehicles sense the world with such techniques as laser, radar, lidar, GPS and computer vision. Advanced control systems interpret the information to identify appropriate navigation paths, as well as obstacles and relevant signage. Autonomous vehicles typically update their maps based on sensory input, so that they can navigate through uncharted environments.

#### REFERENCES

- [1] A. Shalom Hakkert, Victoria Gitelman, Eliah Ben-Shabat, "An evaluation of crosswalk warning systems: effects on pedestrian and vehicle behavior," Transportation Research Institute, Technion - Israel Institute of Technology, Technion City, Haifa 32000, Israel, Transportation Research Part F 5 (2002) 275–292.
- [2] Hitoshi Miyata, Makoto Ohki, Yasuyuki Yokouchi, Masaaki Ohkita, "Control of the autonomous mobile robot DREAM-1 for a parallel parking," Department of Electrical and Electronic Engineering, Faculty of Engineering, Tottori University, 4-101, Koyama-Minami, Tottori 680, Japan, Mathematics and Computers in Simulation 41 (1996) 129–138.
- [3] Nikolaj Zimic, Miha Mraz, "Decomposition of a complex fuzzy controller for the truck-and-trailer reverse parking problem," University of Ljubljana, Faculty of Computer and Information Science, Trzaska cesta 25, SI-1000 Ljubljana, Slovenia, Mathematical and Computer Modelling 43 (2006) 632–645.
- [4] Massaki Wada, Kang Sup Yoon, and Hideki Hashimoto, "Development of Advanced Parking Assistance System," IEEE Transactions on Industrial Electronics, vol. 50, no. 1, Feb. 2003.
- [5] Tsung-hua Hsu, Jing-Fu Liu, Pen-Ning Yu, Wang-Shuan Lee and Jia-Sing Hsu, "Development of an Automatic Parking System for Vehicle," Automotive Research and Testing Center, Changhua County, Taiwan, R.O.C., IEEE Vehicle Power and Propulsion Conference (VPPC), September 3-5, 2008, Harbin, China.
- [6] F. Gomez-Bravo, F. Cuesta, A.
Ollero, "Parallel and diagonal parking in nonholonomic autonomous vehicles," Departamento Ingeniería de Sistemas y Automatica, Escuela Superior de Ingenieros, Universidad de Sevilla, Camino de los Descubrimientos, E-41092 Seville, Spain, Engineering Applications of Artificial Intelligence 14 (2001) 419–434.
- [7] Yanan Zhao, Emmanuel G. Collins Jr., "Robust automatic parallel parking in tight spaces via fuzzy logic," Department of Mechanical Engineering, Florida A&M University–Florida State University, Tallahassee, FL, USA, Robotics and Autonomous Systems 51 (2005) 111–127.
- [8] Michael Sivak, Michael J. Flannagan, Toshio Miyokawa, "The use of parking and auxiliary lamps for traffic sign illumination," University of Michigan Transportation Research Institute, 2901 Baxter Road, Ann Arbor, MI 48109-2150, USA, Journal of Safety Research 32 (2001) 133–147.
- [9] G. Bruzzone, M. Caccia, G. Ravera, A. Bertone, "Standard Linux for embedded real-time robotics and manufacturing control systems," Robotics and Computer-Integrated Manufacturing 25, pp. 178–190, 2009.
- [10] R. E. Haber, J. R. Alique, A. Alique, J. Hernández, R. Uribe-Etxebarria, "Embedded fuzzy-control system for machining processes: Results of a case study," Computers in Industry 50, pp. 353–366, 2003.

### Software Design and Development of Online Monitoring System in SOC Encounter

#### A. Anasuyamma & T. Sanath Kumar

ASCET, Gudur, Andhra Pradesh, India, E-mail: anasuya404@gmail.com

Abstract - This paper presents the design and development of an online monitoring system that collects data over time or distance with built-in instruments or sensors. Currently, an online data collection system (vehicle data logger) for general vehicles is not available on the market. Only vehicle manufacturers have the tools to access the engine ECU and monitor the vehicle's data. Currently available automotive meters display estimated but inaccurate speed, engine revolution, fuel and temperature data.
The system described in this paper records the vehicle speed, engine revolution (rpm), engine temperature, fuel volume and distance. The ASIC software design implements sophisticated data viewing methods. The program uses a new framework based on track segmenting to better organize data, instantly provide summaries of segment data and better display driving performance. The characteristics of the fuel sender (to determine fuel volume in litres) and the temperature sender (to determine engine temperature in Celsius) are downloaded to the FPGA. All recorded information is saved in the application IC chip (8051 IC), which can be reset after the information is loaded to the personal computer (PC) using UART communication. All measurements were carried out on a selected road track as a field test, with a total trip length of about 3 kilometers.

Keywords – V-models, Data Logger, Bench Test, Vehicle Test, ASIC model.

#### I. INTRODUCTION

Larimore et al. (2009) [2] present the identification and monitoring of automotive engines. The main objective is to extend and refine nonlinear canonical variate analysis (NLCVA). Additional refinements are developed using general bases of nonlinear functions. The method of Leaps and Bounds with the Akaike information criterion (AIC) is chosen. Delay estimation procedures are employed to account for the state order of the identified engine model and to reduce the number of estimated parameters, which affects the accuracy of the identified model. Linear Gaussian system methods were applied to a 5.3 L V8 engine to verify appropriate nonlinear basis functions and engine delays and to decrease the model state order from 16 to 7.

Herpel et al. (2009) [3] present a straightforward methodology for performing prototype measurements on automotive CAN ECU communication and for deriving valuable information about controller-specific startup behavior.
Important aspects of data transmission are logged and accessed both in the early phase of system design, by means of simulation or analytical evaluation of the in-car communication system, and in measurements on real test cars, by logging and analyzing communication data in a prototype installation. The measurement data analysis compares the CAN communication startup durations of the ECUs available in the prototype car (a German Audi A6 Limousine 3.2 TDI). The advantage of these methods is high-speed CAN with real-time data logging.

William Swihart and Jerry Woll (1997) [6] developed an integrated collision and vehicle information system for heavy machines. They proposed a possible system integration path that would combine the capability of existing collision warning system (CWS) onboard computers (OBC) with next-generation vehicle radar technology and emerging drowsy-driver systems. The challenge for the systems integrator is to combine relevant data that is useful to both the fleet operator and the driver without adding unnecessary system complexity and cost.

#### II. METHODOLOGY

The objective of this study is to measure and analyze parameters from the developed online monitoring system. The automotive online monitoring system is microcontroller based, is designed for data logging and requires no input from the driver of the vehicle. It records a set of data for each journey made in the vehicle, which starts when the ignition is switched on and ends when the ignition is switched off. The V-model is applied to the software development process [8],[10]. After the coding step the process steps bend upwards, forming the characteristic V shape.

#### A. Software Design and Development

The V-model demonstrates the relationships between each phase of the development life cycle and its associated phase of testing.
The horizontal and vertical axes represent time or project completeness (left to right) and level of abstraction (coarsest-grain abstraction uppermost), respectively. The model is illustrated in Figure 1. The left side of the V-model contains the verification phases and the right side the validation phases listed in Table 1.

Fig. 1 V-model

#### B. Software Testing Technique

Software testing is divided into two parts: the bench test, which covers the module and system tests, and the vehicle test, which is the real application test. The module and system tests are performed so that errors can be detected early and eliminated by the time software development has finished. The module test is performed at module-level development and the system test at system-level development. The equipment used during the module and system tests comprises an oscilloscope, a DC power supply, a function generator and a decade box. Test cases with expected results are drafted before the module and system tests are performed. Before the vehicle test is conducted, all sensor inputs on the wire harness are verified to ensure that each sensor input is correctly connected to the online monitoring box. An oscilloscope and a multimeter are used for the sensor measurements. Data from the input sensors of the online monitoring box are collected in real time, and all collected data are stored in PC memory.

#### C. Hardware and Interfacing

Earlier vehicles implemented a conventional network topology. However, the disadvantage of this system is that a sensor failure goes undetected during logging. The data on the CAN bus remain available in the network even though the sensor has failed to provide an input. Integrating the CAN network topology with the conventional network topology is also valuable for mid-range vehicles. Most current Malaysian Proton cars, such as the new Exora, Gen2, Persona and Satria Neo, use this combined network topology.

Fig. 2 System architecture of the data logger
The conventional network topology was well matched to the development of a data logger for the entry-level Malaysian Perodua Kancil 660 cc and 850 cc cars (manufactured after 2002). The Perodua Kancil was selected as the target vehicle because its engine already uses an electronic ECU with the instrument cluster. All sensors are integrated with an ADC so that their outputs are in digital form. Cars that still use mechanical sensors to log the speed, rpm, fuel and temperature are not practical for a simple and efficient electronic data logger. A 12 V supply is regulated to provide the DC supply for the microcontroller. The clock supplies a frequency of 8.00 MHz to the microcontroller. Four inputs come from different sensors: speed, tacho, fuel and temperature. All sensors used are available in the target vehicle. The speed and tacho sensors supply a frequency signal that represents the speed and the revolutions per minute (r.p.m.), respectively. The fuel and temperature sensors provide resistance values that indicate the fuel remaining in the tank and the current engine temperature.

#### III. PROPOSED METHODOLOGY

This paper covers the two major parts of the online monitoring system: hardware and software development. The hardware development describes the FPGA circuit and the circuit interface from the vehicle to the online monitoring system. The software development explains the use of the ASIC in the software design through to the verification and validation process. The online monitoring system records the vehicle speed, engine revolution (rpm), engine temperature, fuel volume and distance. The characteristics of the fuel sender (to determine fuel volume in litres) and the temperature sender (to determine engine temperature in Celsius) are loaded into the FPGA, which can be reset after the information is transferred to the PC using UART communication with the Keyword Protocol 2000 (KWP2000).
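As a rough illustration of how a sender characteristic might be applied once it is loaded, the sketch below estimates fuel volume from the fuel sender resistance using the bench values reported in Table 2. Linear interpolation between calibration points, clamping outside the calibrated range, and the names `FUEL_CURVE` and `fuel_litres` are assumptions made for illustration; the paper does not state how the characteristic is evaluated on the FPGA.

```python
from bisect import bisect_left

# Bench calibration points from Table 2: (sender resistance in ohms, fuel in litres),
# sorted by increasing resistance. Resistance falls as the tank fills.
FUEL_CURVE = [(77.0, 45.0), (124.0, 37.5), (188.0, 25.0),
              (252.0, 12.5), (284.0, 8.0), (304.0, 5.0)]


def fuel_litres(resistance_ohm: float) -> float:
    """Estimate fuel volume by linear interpolation (an assumed method)."""
    resistances = [r for r, _ in FUEL_CURVE]
    # Clamp readings outside the calibrated range to the nearest end point.
    if resistance_ohm <= resistances[0]:
        return FUEL_CURVE[0][1]
    if resistance_ohm >= resistances[-1]:
        return FUEL_CURVE[-1][1]
    # Locate the two calibration points that bracket the reading.
    i = bisect_left(resistances, resistance_ohm)
    (r0, v0), (r1, v1) = FUEL_CURVE[i - 1], FUEL_CURVE[i]
    return v0 + (v1 - v0) * (resistance_ohm - r0) / (r1 - r0)


if __name__ == "__main__":
    for r in (304.0, 220.0, 124.0):
        print(f"{r:6.1f} ohm -> {fuel_litres(r):5.2f} L")
```

The same table-lookup approach would apply to the temperature sender, using the temperature and resistance columns of Table 2 instead.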
Fig. 3 Design flow for UART IP

#### ASIC DESIGN FLOW:

This section briefly explains the use of the Cadence SOC Encounter Electronic Design Automation (EDA) tool in the Application Specific Integrated Circuit (ASIC) design flow. Any IC other than a general-purpose IC that contains the functionality of thousands of gates is usually called an ASIC. ASICs are designed to fit a certain application. An ASIC is a digital or mixed-signal circuit designed to meet specifications set by a specific project. The basic ASIC design flow, from code to GDS II format, is carried out in SOC Encounter. The Cadence flow is intended for the hierarchical design of large integrated circuits (ICs) and systems-on-chip.

| Phase | Activities |
|--------------|-------------------------|
| Verification | Requirement Analysis |
| Verification | System Design |
| Verification | Architecture Design |
| Verification | Module Design |
| Validation | Module Test |
| Validation | Integration Test |
| Validation | System Test (Bench Test and Vehicle Test) |

Table 1 Software Development

Hierarchical design methodologies are typically adopted to handle very large designs or to support the concurrent design of a complex chip by a design team. Blast Plan Pro meets both of these requirements and delivers the additional benefit of predictable design closure.

A UART is the microchip with programming that controls a computer's interface to its attached serial devices. Specifically, it provides the computer with the RS-232C Data Terminal Equipment (DTE) interface so that it can "talk" to and exchange data with modems and other serial devices.

Fig. 4 UART Block Diagram

As part of this interface, the UART also converts the bytes it receives from the computer along parallel circuits into a single serial bit stream for outbound transmission.
On inbound transmission, it converts the serial bit stream into the bytes that the computer handles. It adds a parity bit (if selected) on outbound transmissions, checks the parity of incoming bytes (if selected) and discards the parity bit. It adds start and stop delineators on outbound transmissions and strips them from inbound transmissions. It handles interrupts from the keyboard and mouse (which are serial devices with special ports). It may also handle other kinds of interrupt and device management that require coordinating the computer's speed of operation with device speeds.

Serial transmission is commonly used with modems and for non-networked communication between computers, terminals and other devices. At first sight it would seem that a serial link must be inferior to a parallel one, because it can transmit less data on each clock tick. However, serial links can often be clocked considerably faster than parallel links and so achieve a higher data rate.

| Speedometer input freq (Hz) | Speed (km/h) | Tacho input freq (Hz) | Revolution (r.p.m) | Fuel (L) | Fuel sender resistance (Ω) | Coolant temp (°C) | Temp sender resistance (Ω) |
|--------|-----|-----|------|------|-----|-----|-------|
| 0 | 0 | 0 | 0 | 5 | 304 | 45 | 222.3 |
| 14.16 | 20 | 25 | 1000 | 8 | 284 | 50 | 181.1 |
| 28.31 | 40 | 50 | 2000 | 12.5 | 252 | 112 | 23.6 |
| 42.47 | 60 | 75 | 3000 | 25 | 188 | 117 | 20.6 |
| 56.62 | 80 | 100 | 4000 | 37.5 | 124 | - | - |
| 70.78 | 100 | 125 | 5000 | 45 | 77 | - | - |
| 84.93 | 120 | 150 | 6000 | - | - | - | - |
| 113.28 | 160 | 200 | 8000 | - | - | - | - |

Table 2 Measured Parameters (Bench Mark)

Clock skew between different channels is not an issue (for unclocked serial links). A serial connection requires fewer interconnecting cables (e.g. wires/fibres) and hence occupies less space. The extra space allows for better isolation of the channel from its surroundings. Crosstalk is less of an issue, because there are fewer conductors in proximity. In many cases, serial is a better option because it is cheaper to implement. Many ICs have serial interfaces, as opposed to parallel ones, so that they have fewer pins and are therefore cheaper.

In telecommunications and computer science, serial communication is the process of sending data one bit at a time, sequentially, over a communications channel or computer bus. This is in contrast to parallel communication, where all the bits of each symbol are sent together. Serial communication is used for all long-haul communication and most computer networks.

For the speedometer gauge, the speed is calculated from the input frequency and the programmed pulses/km number (k-factor) as (1): Speed = (f × 3600) / k-factor, where the k-factor is 2548 pulses/km and f is the input frequency in Hz. For the tacho gauge, the microcontroller measures the frequency of the pulses received on the input and drives the stepper motor to a position dependent on the frequency. There shall be no visible 'step' movement of the tacho gauge. The tacho revolution is calculated as (2): Tacho = (f × 60) / m-factor.

#### Results and Discussion

The speed versus time data in Figure 4 show that the target vehicle started moving 10 seconds after the engine was started, but the accelerator pedal was pressed before that. The RPM versus time data in Figure 5 show more clearly at which second the driver started to press the accelerator pedal: from second 4 to second 9 the engine speed rose sharply from 1153 RPM to 2058 RPM. The normal engine speed without pressing the accelerator pedal is about 900 RPM to 1100 RPM with the air conditioning on. The minimum speed, 2.8 km/h at 1225 RPM, was recorded at second 91, by which time the trip distance was about 0.6 km.
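The two gauge conversions can be sketched as follows. This is a minimal illustration rather than the authors' firmware: the function names are hypothetical, the speed conversion uses 3600 seconds per hour (which reproduces the bench values in Table 2, e.g. 14.16 Hz for 20 km/h), and the m-factor of 1.5 pulses per revolution is not stated in the text but inferred from Table 2, where 25 Hz corresponds to 1000 r.p.m.

```python
K_FACTOR = 2548.0  # speed-sensor pulses per km, as given in the text
M_FACTOR = 1.5     # tacho pulses per revolution; assumed, inferred from Table 2


def speed_kmh(freq_hz: float, k_factor: float = K_FACTOR) -> float:
    """Speed in km/h: (pulses/s) * (3600 s/h) / (pulses/km)."""
    return freq_hz * 3600.0 / k_factor


def tacho_rpm(freq_hz: float, m_factor: float = M_FACTOR) -> float:
    """Engine speed in r.p.m.: (pulses/s) * (60 s/min) / (pulses/rev)."""
    return freq_hz * 60.0 / m_factor


if __name__ == "__main__":
    # Bench frequencies taken from Table 2.
    for f in (14.16, 56.62, 113.28):
        print(f"speed input {f:7.2f} Hz -> {speed_kmh(f):6.1f} km/h")
    for f in (25.0, 100.0, 200.0):
        print(f"tacho input {f:7.2f} Hz -> {tacho_rpm(f):7.1f} r.p.m.")
```

Keeping the two factors as parameters makes it easy to re-run the conversion for a vehicle with a different sensor calibration.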
The maximum speed, 57.9 km/h at 3034 RPM, was reached at second 237, by which time the trip distance was about 2.1 km. The fuel volume fluctuated because the tank sender float moved up and down on the bumpy road. The car consumed around 0.02 litre over 3 km. The engine coolant temperature remained stable at an ADC value of 153. The coolant fan is activated when the engine temperature starts to increase. This analysis was done manually, drawing on driving experience and technical knowledge of the vehicle engine. The data were also validated against an external GPS navigator.

Fig. 7 Fuel volume (litres) versus time (seconds)

Fig. 8 Revolution versus time (seconds)

Fig. 9 The wave of UART function simulation

#### IV. CONCLUSION

The ASIC model technique is very helpful for software development. Bugs can be minimized or eliminated by performing the module and bench tests, which can also shorten the software development time. The FPGA can support other applications such as UART and serial communication schemes such as Manchester code. The FPGA has reserved memory space for expanding the application. UART communication can still be applied for data logging. The online monitoring data can save a lot of money and time in measuring data from the sensors. The behavior of the sensor inputs can be monitored using the online monitoring data. The online monitoring system benefits vehicle users, since it can retrieve accurate signals: the fuel volume with a resolution of 0.01 litre, the travelling distance with a resolution of 100 m, the actual speed with a resolution of 0.1 km/h and the actual engine revolution. Using the online monitoring data the driver can also detect a defective sensor when the sensor input remains at a fixed value or at zero while the car is moving. Drivers can see the fuel consumption within seconds and optimize their driving manoeuvres.
The data retrieved from the sensors can be studied and analyzed for further improvement of the vehicle. It is proposed to upgrade the online monitoring data transfer instead of using UART communication. The UART speed is 19.2 kBaud achieved until 1 expensive tool. The primary benefit of a 16850 UART is a 128-byte FIFO buffer that prevents data loss in high-speed serial communications. The optional 16950 UART provides the following benefits: 8x more baud rates due to a 1/8th clock prescaler, a 128-byte FIFO buffer, 9-bit protocol support and isochronous mode.

#### REFERENCES

- [1] Perez, A., Garcia, M.I., Nieto, M., Pedraza, J.L., Rodriguez, S., Zamorano, J., "Argos: An Advanced In-Vehicle Data Recorder on a Massively Sensorized Vehicle for Car Driver Behavior Experimentation", IEEE Transactions on Intelligent Transportation Systems, pp. 463-473, 2010.
- [2] Larimore, W.E., "Maximum likelihood subspace identification for linear, nonlinear, and closed-loop systems", American Control Conference, pp. 2305-2319, 2005.
- [3] Herpel, T., Kloiber, B., German, R., Fey, S., "Assessing the CAN communication Startup Behavior of Automotive CAN ECUs by Prototype Measurement", in *Int. Instrumentation and Measurement Technology Conference (I2MTC)*, Singapore, 2005.
- [4] Zuraidi Saad, Noor Hafizi Norma, Azhar Efendi Md Saad, Fairul Nazmie Osman, Khairul Azman Ahmad, "Design And Development of Universal Data Logger For Testing Vehicle Performance", Conference On Scientific And Social Research (CSSR), Digest no: 570555, 2009.

\*\*\*

### Comparison of Energy Detection, Cyclic Prefix and Cooperative Detection Spectrum Sensing Techniques in Cognitive Radio

#### Simar Buttar

ECE, Lovely Professional University, Phagwara, Punjab, India E-mail: Simar buttar@yahoo.com

Abstract - With the advance of wireless communications, the problem of bandwidth scarcity has become more prominent.
Cognitive radio technology has emerged as a way to solve this problem by allowing unlicensed users to use the licensed bands opportunistically. To sense the presence of licensed users, many spectrum sensing techniques have been devised. In this paper, energy detection and the cyclic prefix are used for spectrum sensing. ROC curves are compared for various wireless fading channels using squaring and cubing operations; the improvement reaches up to 0.6 times for the AWGN channel and 0.4 times for the Rayleigh channel when going from the squaring to the cubing operation in an energy detector. In cyclic prefix detection, a special feature embedded in OFDM signals is used for spectrum sensing. Closed-form expressions for the probability of detection over AWGN and Rayleigh channels are described. Cooperation among users is a valuable tool in implementing spectrum sensing, and it improves the performance of cyclic prefix based spectrum sensing by up to 3.9 times compared with a single user.

Keywords - Spectrum Sensing, Cognitive Radio, Probability of detection, Cooperative Detection.

#### I. INTRODUCTION

With the unprecedented growth of wireless applications, the problem of spectrum scarcity is becoming more and more apparent. Most of the spectrum has been allocated to specific users, while the spectrum bands that haven't been assigned are overcrowded because of overuse. However, most of the allocated spectrum is idle at some times and locations. The Federal Communications Commission (FCC) research report [1] reveals that seventy percent of the allocated spectrum is underutilized. A technique is therefore needed to deal with spectrum underutilization, which led to the birth of cognitive radio. Cognitive radio [2][3] can sense the external radio environment and learn from past experience. It can access unused spectrum bands dynamically without affecting the primary users, thereby improving spectrum efficiency.
Sensing the external radio environment quickly and accurately plays a key role in cognitive radio. Spectrum sensing includes the detection of primary users and of secondary users in other cognitive networks in the same region, but most papers on spectrum sensing consider only the detection of primary users. In this paper, we consider the cyclic prefix, a special feature embedded in OFDM (Orthogonal Frequency Division Multiplexing) signals, which is used to detect the presence of the primary user's signal and is considered to be better than energy detection and matched filter detection because it performs well even in fading channels. In addition, cooperative detection is used among the secondary users to improve the performance of spectrum sensing.

The energy detector based approach, also known as radiometry or periodogram, is one of the popular methods for spectrum sensing because it is non-coherent and has low implementation complexity. In addition, it is more generic, as receivers do not require any prior knowledge of the primary user's signal [4]. In this method, the received signal's energy is measured and compared against a predefined threshold to determine the presence or absence of the primary user's signal. Moreover, the energy detector is widely used in ultra wideband (UWB) communications to borrow an idle channel from a licensed user. Detection probability ($P_d$), false alarm probability ($P_f$) and missed detection probability ($P_m$) are the key metrics used to analyze the performance of an energy detector. The performance of an energy detector is illustrated by the receiver operating characteristic (ROC) curve, which is a plot of $P_d$ versus $P_f$ or $P_m$ versus $P_f$ [5].

This paper is organized as follows: Section 2 describes the OFDM (Orthogonal Frequency Division Multiplexing) system model. Sections 3 and 4 describe the expressions for the probability of detection for AWGN (Additive White Gaussian Noise) and Rayleigh channels respectively.
Simulation results for cyclic prefix and energy detection based spectrum sensing over AWGN (Additive White Gaussian Noise) and Rayleigh channels, and the improvement obtained with cooperative detection, are presented in Section 5, followed by conclusions in Section 6.

#### II. OFDM SYSTEM MODEL

Fig. 1. Simplified Block Diagram of OFDM Transmitter

Consider a block of data symbols mapped onto the subcarriers, represented by:

$$\{s(0), s(1), s(2), \dots, s(T_d - 1)\}$$

The IFFT (Inverse Fast Fourier Transform) operation converts these frequency-domain signals into time-domain signals, which are represented by:

$$\{x(0), x(1), x(2), \dots, x(T_d - 1)\}$$

where the IFFT block size is assumed to be $T_d$. The last $T_c$ symbols of each block are added to the beginning of the block as a cyclic prefix, and the transmitted signal becomes:

$$\{x(-T_c), \dots, x(-1), x(0), x(1), \dots, x(T_d - T_c), \dots, x(T_d - 1)\}$$

where the block of symbols $\{x(-T_c), \dots, x(-1)\}$ is an exact copy of $\{x(T_d - T_c), \dots, x(T_d - 1)\}$, i.e. $x(t) = x(T_d + t)$ where $t \in [-T_c, -1]$.

The relation between the signals before and after the IFFT block can be expressed as:

$$x(t) = \frac{1}{\sqrt{T_d}} \sum_{n=0}^{T_d-1} s(n)\, e^{\frac{j 2\pi (t-T_c) n}{T_d}}, \quad t = 0, 1, \dots, T_d - 1 \tag{1}$$

A transmitted OFDM frame may contain several such blocks. Let $y(t)$ denote the symbols of the transmitted OFDM frame. Detection is based on two hypotheses [5]:

$$H_0: \quad r(t) = n(t) \tag{2}$$

and

$$H_1: \quad r(t) = y(t) + n(t) \tag{3}$$

where $r(t)$ is the received signal and $n(t)$ is the additive white Gaussian noise [6]. $H_0$ represents the hypothesis that the signal is absent and only noise is present. $H_1$ represents the hypothesis that both signal and noise are present. Let $\chi$ be a measure of the correlation between two samples a distance $T_d$ apart [10]:
$$\chi = \sum_{t=1}^{W} \frac{r(t)\, r^{*}(t+T_d)}{E[|r(t)|^{2}]} \tag{4}$$

For a CP (cyclic prefix) OFDM signal, the statistic $\chi$ under the above two hypotheses can be expressed as [10]:

$$H_0: \quad \chi = \sum_{t=1}^{W} \frac{n(t)\, n^{*}(t+T_d)}{E[|n(t)|^{2}]} \tag{5}$$

and

$$H_1: \quad \chi = \sum_{t=1}^{W} \frac{(y(t)+n(t))\,(y^{*}(t+T_d)+n^{*}(t+T_d))}{E[|y(t)+n(t)|^{2}]} \tag{6}$$

#### III. PROBABILITY OF DETECTION

#### A) Probability of Detection in Cyclic Prefix for AWGN Channel

The probability of detection for the cyclic prefix based spectrum sensing method over an AWGN channel can be expressed as [10], [5]:

$$P_D = \frac{1}{2}\, erfc\left( \frac{\tau' - \mu_1}{\sqrt{W}} \right) \tag{7}$$

where $P_D$ is the probability of detection for the AWGN channel, $\mu_1$ is the mean under hypothesis $H_1$, $W$ is the observation window size (number of samples), and $\tau'$ is the threshold, given by:

$$\tau' = \sqrt{\frac{W}{2}}\, erfc^{-1}(2P_F) \tag{8}$$

where $P_F$ is the false alarm probability for the AWGN channel. Under hypothesis $H_1$, the mean $\mu_1$ can be calculated as

$$\mu_1 = E[\chi | H_1] \tag{9}$$

$$= E \left[ \sum_{t=1}^{W} \frac{(y(t)+n(t))\,(y^{*}(t+T_d)+n^{*}(t+T_d))}{E[|y(t)+n(t)|^2]} \right] \tag{10}$$

$$= \sum_{t=1}^{W} \frac{E[y(t)\, y^{*}(t+T_d)]}{E[|y(t)+n(t)|^2]} \tag{11}$$

Since $y(t) = y(t+T_d)$ only when $y(t)$ falls into the cyclic prefix period, equation (11) can be expressed as:

$$\mu_1 = \sum_{t=1}^{W} \frac{P(y(t) \in CP)\, E[|y(t)|^2]}{\sigma_y^2 + \sigma_n^2} \tag{12}$$

$$= \sum_{t=1}^{W} \frac{T_c}{T_d + T_c}\, \frac{\sigma_y^2}{\sigma_y^2 + \sigma_n^2} \tag{13}$$

$$= \sum_{t=1}^{W} K\, \frac{\sigma_y^2}{\sigma_y^2 + \sigma_n^2} \tag{14}$$

$$= \frac{KW\gamma}{1+\gamma} \tag{15}$$

where $K = \frac{T_c}{T_d + T_c}$ and $\gamma = \frac{\sigma_y^2}{\sigma_n^2}$ is the signal-to-noise ratio.
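
Equations (7), (8) and (15) can be evaluated numerically with a few lines of code. The following is a minimal sketch that implements them exactly as written above; the function names and the example values of $W$, $T_c$, $T_d$ and $P_F$ are illustrative assumptions and are not taken from the paper's simulations. The inverse complementary error function is obtained from the standard normal quantile function, so only the Python standard library is needed.

```python
import math
from statistics import NormalDist


def erfc_inv(x: float) -> float:
    """Inverse of math.erfc for 0 < x < 2, using only the standard library."""
    # erfc(z) = 2 * (1 - Phi(z * sqrt(2))), hence z = Phi^{-1}(1 - x / 2) / sqrt(2).
    return NormalDist().inv_cdf(1.0 - x / 2.0) / math.sqrt(2.0)


def cp_threshold(window: int, p_false_alarm: float) -> float:
    """Threshold tau' of equation (8)."""
    return math.sqrt(window / 2.0) * erfc_inv(2.0 * p_false_alarm)


def cp_mean_h1(window: int, t_c: int, t_d: int, snr: float) -> float:
    """Mean mu_1 of equation (15); snr is the linear ratio gamma, not dB."""
    k = t_c / (t_d + t_c)
    return k * window * snr / (1.0 + snr)


def cp_detection_probability(window, t_c, t_d, snr, p_false_alarm):
    """P_D of equation (7) for the AWGN channel."""
    tau = cp_threshold(window, p_false_alarm)
    mu_1 = cp_mean_h1(window, t_c, t_d, snr)
    return 0.5 * math.erfc((tau - mu_1) / math.sqrt(window))


if __name__ == "__main__":
    # Illustrative values only (assumed): W = 1000, T_c = 8, T_d = 32, P_F = 0.05.
    for snr_db in (-20, -15, -10, -5):
        gamma = 10.0 ** (snr_db / 10.0)
        p_d = cp_detection_probability(1000, 8, 32, gamma, 0.05)
        print(f"SNR {snr_db:4d} dB -> P_D = {p_d:.3f}")
```

Because $\mu_1$ grows with $\gamma$ while $\tau'$ depends only on $W$ and $P_F$, the computed $P_D$ increases monotonically with SNR for a fixed false alarm probability.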
#### B) Probability of Detection in Cyclic Prefix for Rayleigh Channel

The probability of detection for the cyclic prefix based spectrum sensing method over a Rayleigh channel can be expressed as [11]:

$$P_{D,Ray} = \int_{0}^{\infty} P_{D}\, f(\gamma)\, d\gamma \tag{16}$$

where $P_D$ is the probability of detection for the AWGN channel and $f(\gamma)$ is the probability density function for the Rayleigh channel [12, Eq. (4-44)]:

$$f(\gamma) = \frac{1}{\bar{\gamma}} \exp \left( \frac{-\gamma}{\bar{\gamma}} \right) \tag{17}$$

Using (7), (15) and (17), equation (16) can be rewritten as:

$$P_{D,Ray} = \frac{1}{2\bar{\gamma}} \int_{0}^{\infty} erfc \left( \frac{\tau' - \frac{KW\gamma}{1+\gamma}}{\sqrt{W}} \right) \exp\left( \frac{-\gamma}{\bar{\gamma}} \right) d\gamma \tag{18}$$

Now consider the following notation:

$$\gamma = \frac{t}{1-t}, \quad d\gamma = \frac{dt}{(1-t)^2}, \quad \frac{\tau'}{\sqrt{W}} = a, \quad K\sqrt{W} = b$$

Using this notation, equation (18) can be rewritten as:

$$P_{D,Ray} = \frac{1}{2\bar{\gamma}} \int_{0}^{1} erfc(a - bt) \exp\left(\frac{-t}{\bar{\gamma}(1-t)}\right) \frac{dt}{(1-t)^{2}} \tag{19}$$

Assuming $t \ll 1$ and applying the binomial approximation, we have:

$$P_{D,Ray} = \frac{1}{2\bar{\gamma}} \int_0^1 erfc(a - bt) \exp\left(\frac{-t}{\bar{\gamma}(1-t)}\right) (1 + 2t)\, dt \tag{20}$$

or,

$$P_{D,Ray} = \frac{1}{2\bar{\gamma}} \int_0^1 erfc(a - bt) \exp\left(\frac{1}{\bar{\gamma}} - \frac{1}{\bar{\gamma}(1-t)}\right) (1 + 2t)\, dt \tag{21}$$

or,

$$P_{D,Ray} = \frac{1}{2\bar{\gamma}} \int_0^1 erfc(a - bt) \exp\left(\frac{1}{\bar{\gamma}} - \frac{1}{\bar{\gamma}} (1 + t)\right) (1 + 2t)\, dt \tag{22}$$

or,

$$P_{D,Ray} = \frac{1}{2\bar{\gamma}} \int_{0}^{1} erfc(a - bt) \exp\left(\frac{-t}{\bar{\gamma}}\right) (1 + 2t)\, dt \tag{23}$$

Solving it using Mathematica, we get the approximate expression for the probability of detection for the Rayleigh channel as:

$$\begin{split}
P_{D,Ray} \cong \frac{1}{2\bar{\gamma}b^2} \Bigg[ e^{-\frac{a}{b\bar{\gamma}}} \Bigg[ & \left( -1 + 2ab\bar{\gamma} + b^2\bar{\gamma}(1 + 2\bar{\gamma}) \right) e^{-\frac{1}{4b^2\bar{\gamma}^2}}\, erf\left(a - \frac{1}{2b\bar{\gamma}}\right) \\
& + \frac{1}{\sqrt{\pi}} \Bigg[ e^{-\frac{1+a^2\bar{\gamma} + b^2\bar{\gamma}}{\bar{\gamma}}} \Big[ -\left( -1 + 2ab\bar{\gamma} + b^2\bar{\gamma}(1 + 2\bar{\gamma}) \right) e^{a^2 + b^2 + \frac{1}{4b^2\bar{\gamma}^2} + \frac{1}{\bar{\gamma}}} \sqrt{\pi}\, erf\left(a - b - \frac{1}{2b\bar{\gamma}}\right) \\
& + b\bar{\gamma}\, e^{\frac{a}{b\bar{\gamma}}} \Big[ -2e^{2ab} + 2e^{b^2 + \frac{1}{\bar{\gamma}}} + b(1 + 2\bar{\gamma})\, e^{a^{2}+b^{2}+\frac{1}{\bar{\gamma}}} \sqrt{\pi}\, erfc(a) \\
& - b(3+2\bar{\gamma})\, e^{a^{2}+b^{2}} \sqrt{\pi}\, erfc(a - b) \Big] \Big] \Bigg] \Bigg] \Bigg]
\end{split} \tag{24}$$

#### IV. PROBABILITY OF DETECTION AND FALSE ALARM IN ENERGY DETECTION

#### A) In AWGN Channel

The probability of detection $P_d$ and of false alarm $P_f$ can be evaluated respectively by [11]:

$$P_d = P(Y' > \Lambda | H_1)$$

$$P_f = P(Y' > \Lambda | H_0)$$

where $\Lambda$ is the decision threshold. $P_f$ can also be written in terms of the probability density function as

$$P_f = \int_{\Lambda}^{\infty} f_{Y'}(y)\, dy$$

$$P_f = \frac{1}{2^{d}\, \Gamma(d)} \int_{\Lambda}^{\infty} y^{d-1} e^{-\frac{y}{2}}\, dy$$

Dividing and multiplying the R.H.S. of the above equation by $2^{d-1}$, we get

$$P_f = \frac{1}{2\Gamma(d)} \int_{\Lambda}^{\infty} \left(\frac{y}{2}\right)^{d-1} e^{-\frac{y}{2}}\, dy$$

Substituting $\frac{y}{2} = t$, $\frac{dy}{2} = dt$ and changing the limits of integration to $[\Lambda/2, \infty)$, we get

$$P_f = \frac{1}{\Gamma(d)} \int_{\Lambda/2}^{\infty} t^{d-1} e^{-t}\, dt$$

$$P_f = \frac{\Gamma(d,\Lambda/2)}{\Gamma(d)}$$

where $\Gamma(\cdot,\cdot)$ is the upper incomplete gamma function [13].
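
The expression $P_f = \Gamma(d, \Lambda/2)/\Gamma(d)$ is straightforward to evaluate when $d$ (the time-bandwidth product) is a positive integer, because the regularized upper incomplete gamma function then reduces to a finite sum, $e^{-x}\sum_{n=0}^{d-1} x^n/n!$. The sketch below uses that identity and also inverts it by bisection to obtain the threshold for a target false alarm rate. The function names are hypothetical and the bisection step is an illustrative addition, not something described in the paper.

```python
import math


def upper_gamma_regularized(d: int, x: float) -> float:
    """Gamma(d, x) / Gamma(d) for a positive integer d and x >= 0."""
    # For integer d this equals exp(-x) * sum_{n=0}^{d-1} x^n / n!.
    term = 1.0
    total = 1.0
    for n in range(1, d):
        term *= x / n
        total += term
    return math.exp(-x) * total


def false_alarm_probability(d: int, threshold: float) -> float:
    """P_f = Gamma(d, Lambda / 2) / Gamma(d)."""
    return upper_gamma_regularized(d, threshold / 2.0)


def threshold_for_false_alarm(d: int, p_f: float) -> float:
    """Find Lambda such that P_f(Lambda) = p_f (P_f decreases with Lambda)."""
    low, high = 0.0, 1.0
    while false_alarm_probability(d, high) > p_f:
        high *= 2.0
    for _ in range(200):
        mid = 0.5 * (low + high)
        if false_alarm_probability(d, mid) > p_f:
            low = mid
        else:
            high = mid
    return 0.5 * (low + high)


if __name__ == "__main__":
    d = 5  # time-bandwidth product, as assumed in the simulations
    for target in (0.01, 0.05, 0.1):
        lam = threshold_for_false_alarm(d, target)
        print(f"P_f = {target:.2f} -> Lambda = {lam:.3f}")
```

Sweeping the threshold in this way gives the horizontal axis of an ROC curve; the corresponding $P_d$ values require the Marcum Q-function and are not computed here.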
Now, the probability of detection can be written using the cumulative distribution function of $Y'$:

$$P_d = 1 - F_{Y'}(\Lambda)$$

The cumulative distribution function (CDF) of $Y'$ can be obtained (for an even number of degrees of freedom, which is our case) as

$$F_{Y'}(y) = 1 - Q_d(\sqrt{\lambda}, \sqrt{y})$$

$$P_d = Q_d(\sqrt{\lambda}, \sqrt{\Lambda})$$

$$P_d = Q_d(\sqrt{2\gamma}, \sqrt{\Lambda})$$

where $Q_d(\cdot,\cdot)$ is the generalized Marcum Q-function and $\lambda = 2\gamma$.

#### B) In Rayleigh Channel

The probability density function for the Rayleigh channel is

$$f(\gamma) = \frac{1}{\bar{\gamma}} \exp\left(\frac{-\gamma}{\bar{\gamma}}\right), \qquad \gamma \ge 0$$

The probability of detection for the Rayleigh channel is obtained by averaging the probability of detection for the AWGN channel over this probability density function:

$$P_{d,R} = \int_0^\infty P_d\, f(\gamma)\, d\gamma$$

where $P_{d,R}$ is the probability of detection for the Rayleigh channel.

$$P_{d,R} = \frac{1}{\bar{\gamma}} \int_{0}^{\infty} Q_{d}(\sqrt{2\gamma}, \sqrt{\Lambda}) \exp\left(\frac{-\gamma}{\bar{\gamma}}\right) d\gamma$$

Now, substituting $\sqrt{\gamma} = x$, $\gamma = x^2$, $d\gamma = 2x\,dx$:

$$P_{d,R} = \frac{2}{\bar{\gamma}} \int_{0}^{\infty} x\, Q_{d}(\sqrt{2}x, \sqrt{\Lambda}) \exp\left(\frac{-x^{2}}{\bar{\gamma}}\right) dx$$

The probability of detection for the Rayleigh channel can then be expressed as

$$P_{d,R} = e^{-\frac{\Lambda}{2}} \sum_{n=0}^{d-2} \frac{1}{n!} \left( \frac{\Lambda}{2} \right)^n + \left( \frac{1+\bar{\gamma}}{\bar{\gamma}} \right)^{d-1} \left[ \exp\left( -\frac{\Lambda}{2(1+\bar{\gamma})} \right) - \exp\left( -\frac{\Lambda}{2} \right) \sum_{n=0}^{d-2} \frac{1}{n!} \left( \frac{\Lambda\bar{\gamma}}{2(1+\bar{\gamma})} \right)^n \right]$$

#### V. SIMULATION RESULTS

The performance of the energy detector is analysed using ROC (Receiver Operating Characteristics) curves for fading channels. The Monte-Carlo method is used for simulation. It can be seen in the following figures that the performance of energy detection improves as the SNR (Signal to Noise Ratio) increases. FIGURE 2 and FIGURE 4 illustrate the ROC curves using the squaring operation for AWGN and Rayleigh channels respectively.
FIGURE 3 and FIGURE 5 depict the improvement in the performance of the energy detector using the cubing operation over AWGN and Rayleigh channels respectively. We assume a time-bandwidth product of 5.

FIGURE 2: Complementary ROC Curves for AWGN using Squaring operation

FIGURE 4: Complementary ROC for Rayleigh using Squaring operation

Fig. 4. Probability of detection versus Probability of false alarm for AWGN Channel

Fig. 5. Comparison of plots for Probability of detection versus signal to noise ratio (SNR) over AWGN and Rayleigh Channel.

#### VI. CONCLUSION

In the present work, the performance of cyclic prefix and energy detection based spectrum sensing is analysed. Closed-form expressions for the probability of detection for AWGN and Rayleigh channels are described. Using the ROC (Receiver Operating Characteristics) curve, it has been shown that cooperative detection improves the performance of the cyclic prefix based spectrum sensing method by up to 3.99 times compared to single user detection.

#### ACKNOWLEDGEMENT

Words are often too few to reveal one's deep regards. A work like this is never the outcome of the efforts of a single person. I take this opportunity to express my profound sense of gratitude and respect to all those who helped me during this dissertation. First of all, I would like to thank the supreme power, the almighty God, and my parents, who have always guided me to work on the right path of life. Without their grace this would never have turned into reality. I would like to express my deep sense of gratitude towards my guide Ms. Komal Arora, Assistant Professor, Lovely Professional University, Phagwara, who provided me all the facilities and resources required for this work. I would also like to thank all the faculty of the department and my few best friends who helped me in this work directly or indirectly.

#### REFERENCES

- [1] Ma, G. Y. Li, B.H. Juang. "Signal Processing in Cognitive Radio." Proceedings of the IEEE, vol. 97, pp.
805-823, May 2009.
- [2] H. A. Mahmoud, T. Yucek and H. Arslan. "OFDM For Cognitive Radio- Merits and Challenges." IEEE Wireless Communications, vol. 16, pp. 6-15, April 2009.
- [3] I. F. Akyildiz, W. Y. Lee, M. C. Vuran, S. Mohanty. "Next Generation/Dynamic Spectrum Access/Cognitive Radio Wireless Networks: A Survey." Comp. Net. J., vol. 50, pp. 2127–59, Sept. 2006.
- [4] H. Urkowitz. "Energy detection of unknown deterministic signals." Proc. IEEE, vol. 55, pp. 523–531, April 1967.
- [5] S. Atapattu, C. Tellambura, and H. Jiang. "Energy detection of primary signals over η-μ fading channels." in Proc. Fourth International Conference on Industrial and Information Systems, ICIIS, 2009, pp. 118-122.
- [6] L. Yu, L. B. Milstein, J. G. Proakis, B. D. Rao, S. P. Bingulac, "Performance Degradation Due to MAI in OFDMA Based Cognitive Radio," IEEE International Conference on Communications (ICC), pp. 1-5, May 2010, doi:10.1109/ICC.2010.5501830.
- [7] Z. L. Chin, F, "OFDM Signal Sensing for Cognitive Radios," Proc. IEEE Symp. Personal, Indoor and Mobile Radio Communications (PIMRC '08), pp. 1-5, Sept. 2008, doi:10.1109/PIMRC.2008.4699404.
- [8] J. Mitola and G. Q. Maguire, "Cognitive Radio: Making Software Radios More Personal," IEEE Personal Communications, vol. 6, no. 4, pp. 13-18, Aug 1999, doi:10.1109/98.788210.
- [9] Goh, L. P. Lei, Z. Chin, Francois, "Feature Detector for DVB-T Signal in Multipath Fading Channel," Proc. Second International Conference on Cognitive Radio Oriented Wireless Networks and Communications CrownCom200, pp. 234-240, 2007, doi:10.1109/CROWNCOM.2007.4549802.
### Mathematical modeling and Simulation analysis of PEM Fuel Cell

#### Saurabh Srivastava & A Shaija

Department of Mechanical Engineering, National Institute of Technology Calicut

E-mail: ssaurabhs11@gmail.com, shaija@nitc.ac.in

Abstract - In this paper we study the basic principle of the PEM fuel cell, the basic equations involved and the types of fuel cell. The paper then deals mainly with the mathematical modeling of the PEM fuel cell and with its voltage performance characteristics, obtained with the help of MATLAB (SIMULINK). A steady-state model of the fuel cell is studied. The results show the polarization curve of the PEM fuel cell and the effect of the various losses on its performance.

Keywords- Proton exchange membrane, Modeling, Fuel Cell.

#### I. INTRODUCTION

A fuel cell is an electrochemical device that continuously and directly converts the chemical energy of an externally supplied fuel and oxidant to electrical energy. The idea of the gaseous fuel cell can be traced back to Sir William Grove, a Welsh judge, inventor, and physicist, who is recognized as "the father of the fuel cell." He believed that if $H_2$ and $O_2$ can be made by electrolysis of water, the reverse must also be possible. In 1842, Grove developed a stack of 50 fuel cells, which he called a "gaseous voltaic battery".

#### II. PRINCIPLE OF FUEL CELLS

A fuel cell converts the chemical energy of a fuel and an oxidant into electricity. The key parts of a PEM fuel cell (the electrodes, catalyst and membrane) together form the Membrane Electrode Assembly (MEA). Hydrogen ions (protons) pass through the membrane, while the electrons pass through the external circuit or load. Protons from the membrane, electrons from the external circuit and oxygen from the air react at the cathode to form water.

#### A. Basic Reaction in fuel cell

A PEM fuel cell consists of an electrolyte sandwiched between two electrodes.
At the surfaces of the two electrodes, two electrochemical reactions take place.

Anode reaction: $H_2 \rightarrow 2H^+ + 2e^-$

Cathode reaction: $\tfrac{1}{2}O_2 + 2H^+ + 2e^- \rightarrow H_2O$

Overall reaction: $H_2 + \tfrac{1}{2}O_2 \rightarrow H_2O$

#### B. Types of fuel cells

Fuel cells are customarily classified according to the electrolyte employed.

- **Phosphoric Acid Fuel Cell (PAFC)**: the electrolyte is phosphoric acid, the mobile ion is H<sup>+</sup> and the operating temperature is 160-220°C.
- **Solid Oxide Fuel Cell (SOFC)**: the electrolyte is zirconium/yttrium oxide, the mobile ion is O<sup>2-</sup> and the operating temperature is 800-1200°C.
- **Molten Carbonate Fuel Cell (MCFC)**: the electrolyte is lithium/sodium carbonate, the mobile ion is CO<sub>3</sub><sup>2-</sup> and the operating temperature is 600-850°C.
- **Alkaline Fuel Cell (AFC)**: the electrolyte is potassium hydroxide (KOH), the mobile ion is OH<sup>-</sup> and the operating temperature is 60-95°C.
- **Proton Exchange Membrane Fuel Cell (PEMFC)**: the electrolyte is a proton exchange membrane (PEM), the mobile ion is H<sup>+</sup> and the operating temperature is 60-80°C.

The popularity of PEMFCs, a relatively new type of fuel cell, is rapidly outpacing that of the others. We also use the PEM fuel cell for our study, which is based on the NEXA BALLARD 1.2 kW PEM fuel cell.

#### III PEM FUEL CELL SYSTEM

In addition to the fuel cell stack, the system consists of three major subsystems:

- The hydrogen supply subsystem
- The air compression subsystem
- The cooling subsystem

#### A. Hydrogen supply subsystem

The hydrogen supply to the fuel cell comprises the following components:

- A compressed hydrogen cylinder with a maximum pressure capacity of 200 bar.
- Three metal hydride cylinders, each with a capacity of 760 standard litres.
- A pressure transducer, which monitors fuel delivery conditions to ensure adequate operation of the fuel supply system.
- A pressure relief valve, which protects downstream components from over-pressure conditions. A solenoid valve provides isolation from the fuel supply during shutdown.
- A pressure regulator maintains appropriate hydrogen supply pressure to the fuel cells.
- A hydrogen leak detector monitors hydrogen levels near the fuel delivery subassembly.

#### B. Air compression subsystem

The air compression subsystem provides oxygen, in the form of air, to the fuel cell. It comprises the following components:

- Air compressor
- Humidifier/heat exchanger
- A downstream sensor used to measure the air mass flow rate

#### C. Cooling subsystem

- The purpose of the cooling subsystem is to remove the heat generated by the exothermic reaction of hydrogen and oxygen.
- The cooling subsystem consists of a fan located at the base of the fuel cell.

#### IV MATHEMATICAL MODELING

Mathematical models of the four basic parts of the fuel cell (humidifier, anode, cathode and membrane) are shown in the figure.

#### A. Modeling of Humidifier

The humidifier changes the air humidity by injecting additional water. The temperature of the flow is assumed constant, $T_{humidifier} = T_{cooler}$, and $p_{v,humidifier}$ is used to find the relative humidity of the exit flow.

Assuming that the air entering the cooler is at the stack temperature and that the pressure drop in the cooler is negligible, the model evaluates the relative humidity of the gas leaving the cooler, the total pressure, the humidifier exit flow rate, the total pressure of the humid air and, by definition, the specific humidity.

#### B. Modeling of anode

The hydrogen partial pressure and the anode flow humidity are determined by balancing the mass of hydrogen and water in the anode. By the mass conservation law, the hydrogen mass flow rate, the total mass flow rate and the vapour flow rate are obtained. With these, the partial pressure of hydrogen and the anode flow humidity can be found.

The mass flow rate of dry air remains the same at the inlet and outlet of the humidifier. The vapour flow rate increases by the amount of water injected, and the vapour pressure changes accordingly.

#### C. Modeling of cathode

All gases are assumed to obey the ideal gas law. The temperature of the air inside the cathode is equal to the stack temperature.
The properties of the flow exiting the cathode, such as temperature and pressure, are assumed to be the same as those inside the cathode. The humidity ratio, involving M<sub>a</sub>, is calculated first, followed by the flow rates of dry air and vapour and then the oxygen and nitrogen flow rates. By the mass conservation law, the oxygen mass flow rate (using the oxygen mole fraction $x_{O2}$), the nitrogen mass flow rate and the vapour flow rate are obtained. With these equations the oxygen and nitrogen mass flow rates can be found.

#### D. Membrane Hydration model

This model captures the water transport across the membrane. Water content and mass flow are assumed to be uniform over the surface area of the membrane. The water transport across the membrane is achieved through two distinct phenomena:

1. Electro-osmotic drag
2. Back diffusion of water from cathode to anode

The amount of water transported by electro-osmotic drag is proportional to the electro-osmotic drag coefficient n<sub>d</sub>, which is defined as the number of water molecules carried by each proton. The gradient of water concentration across the membrane results in back diffusion of water. The water flow across the membrane is obtained by combining the two transport mechanisms.

By electrochemistry principles, the mass of oxygen reacted and the water produced can be found from the stack current $I_{\rm st}$ by applying Faraday's law. The water flow rate across the membrane $(m_{v,membr})$ is calculated from the membrane hydration model. Using thermodynamic properties, the mass flow rates of the individual species can then be calculated, together with the vapour pressure (treating the vapour as an ideal gas) and the dry air pressure.

The coefficients n<sub>d</sub> and D<sub>w</sub> vary with the membrane water content λ<sub>m</sub>, which is calculated as the average of the water content at the anode (λ<sub>an</sub>) and at the cathode (λ<sub>ca</sub>). λ<sub>an</sub> and λ<sub>ca</sub> can be calculated from the membrane water activity.

#### B.
OHMIC LOSS $V_{ohm}$

The ohmic voltage arises from the resistance of the polymer membrane to the transfer of protons and the resistance of the electrodes and collector plates to the transfer of electrons.

#### C. CONCENTRATION LOSS $V_{conc}$

The concentration over-voltage is caused by the limited transport velocity of reactants to the electrode.

$$D_{\lambda} = \begin{cases} 10^{-6}, & \lambda_m < 2 \\ 10^{-6}\,(1 + 2(\lambda_m - 2)), & 2 \leq \lambda_m < 3 \\ 10^{-6}\,(3 - 1.67(\lambda_m - 3)), & 3 \leq \lambda_m < 4.5 \\ 1.25 \times 10^{-6}, & \lambda_m \geq 4.5 \end{cases}$$

where B = 0.016, $I$ is the current, $i_{lim}$ is the limiting current and $\eta_{conc}$ is the concentration loss expressed as a voltage gain. The fuel cell voltage is calculated using a combination of physical and empirical relationships:

$$V = E - V_{act} - V_{ohm} - V_{conc}$$

which can also be written as

$$V = E + \eta_{act} + \eta_{ohm} + \eta_{conc}$$

Hence, $m_{v,membr}$ is calculated, which can be used in the cathode flow model for the vapour flow rate calculation.

#### V FUEL CELL PERFORMANCE

- The cell voltage of a PEM fuel cell can be represented by a polarization curve, which is obtained from the open circuit voltage E by subtracting the polarization losses.
- The three types of polarization loss are:
  1. Activation loss, V<sub>act</sub>
  2. Ohmic loss, V<sub>ohm</sub>
  3. Concentration loss, V<sub>conc</sub>

#### A. ACTIVATION LOSS $V_{act}$

- On the catalyst surfaces, the bonds in the hydrogen and oxygen molecules are broken and new bonds are formed to produce water.
- This breaking and forming of bonds needs energy and causes a voltage drop.

#### VI. SIMULINK MODEL OF FUEL CELL MODEL EQUATION

$\eta_{act}$ is the activation loss expressed as a voltage gain, and $C_{O_2}$ is the oxygen concentration.

#### VII. RESULTS AND DISCUSSION

Values used for the Simulink model for NEXA BALLARD System.
- Current = 0 to 35 A
- Temperature (constant) = 310 K
- Oxygen at atmospheric conditions
- Area of fuel cell = $122 \text{ cm}^2$
- Limiting current = 75 A
- Number of cells = 50

The voltage decreases as the current increases: initially the voltage drop is larger owing to the activation losses, then the voltage decreases slightly owing to the ohmic losses, and finally it drops slightly owing to the concentration losses.

#### A. Polarization curve (using MATLAB)

#### B. Power Vs current

Power increases with current up to a limit, after which it starts to decrease slightly. Fig. C and D show the steady-state Simulink model outputs for voltage and power with respect to time; as time changes the voltage decreases, reflecting all the losses.

#### C. Steady state voltage of fuel cell (using MATLAB)

#### D. Steady state power of fuel cell (using MATLAB)

#### VIII. CONCLUSION

In this paper we studied the basics of the fuel cell and the types of fuel cell, as our work depends on the PEM fuel cell. We carried out the mathematical modeling of the various parts of the fuel cell, mainly the humidifier, anode, cathode and the membrane hydration model. We then studied the fuel cell performance model, which includes the various losses of the fuel cell, using MATLAB (SIMULINK), and drew the polarization curve (voltage vs current) and the power vs current curve.

#### NOMENCLATURE

- ohm - Ohmic
- act - Activation
- conc - Concentration
- gen - Generated
- int - Internal

#### **SUBSCRIPTS**

- a - air
- v - Vapour
- fc - Fuel cell
- an - anode
- ca - cathode
- M - Molar mass
- i - Current
- V - Voltage
- T - Temperature
- m - Mass flow rate
- x - mole fraction
- p - pressure
- F - Faraday constant
- n - Number of moles
- $\lambda$ - water content
- $\Phi$ - Relative humidity
- c<sub>v</sub> - water concentration at membrane surface
- t - thickness
- ρ - density

#### REFERENCES

- [1]. Larminie J, Dicks A. Fuel Cell Systems Explained. Second Ed. John Wiley & Sons Ltd; 2003.
- [2]. Xiao-Zi Yuan and Haijiang Wang, PEM Fuel Cell Fundamentals.
- [3].
Xianguo Li, "Principles of Fuel Cells," Taylor and Francis Group, 2006.
- [4]. Mench M. Fuel Cell Engines. John Wiley & Sons Inc; 2008. ISBN 978-0-471-68958-4.
- [5]. Yancheng Xiao and Kodjo Agbossou, Interface Design and Software Development for PEM Fuel Cell Modelling Based on Matlab/Simulink Environment, 2009.
- [6]. Jay T. Pukrushpan et al, Control-Oriented Modeling and Analysis for Automotive Fuel Cell Systems, University of Michigan, Ann Arbor, Michigan.
- [7]. 1.2 kW NEXA BALLARD PEM FUEL CELL MANUAL.
- [8]. T.E. Springer, T.A. Zawodzinski and S. Gottesfeld, Polymer Electrolyte Fuel Cell Model, Journal of Electrochemical Society, v.138, n.8, pp.2334-2342, 1991.
- [9]. W. Turner, M. Parten, D. Vines, J. Jones and T. Maxwell, Modeling a PEM fuel cell for use in a hybrid electric vehicle, Proceedings of the 1999 IEEE 49th Vehicular Technology Conference, v.2, pp.1385-1388, 1999.
- [10]. Chiara Boccaletti, Gerardo Duni, Gianluca Fabbri, Ezio Santini, Simulation Models of Fuel Cell Systems july 2011 2006.
- [11]. J.J. Baschuk, Xianguo Li, Modelling of polymer electrolyte membrane fuel cells with variable degrees of water flooding.

# Optimization of DFCW Codes for MIMO Radar using Digitized Ambiguity Function

#### Savani Tarang Kantilal, M. Uttara Kumari & B. Roja Reddy

R. V. College of Engineering, Bangalore, India

E-mail: tarangsavani@gmail.com, uttarakumari@rvce.edu.in & rojareddyb@rvce.edu.in

Abstract - Multiple-Input Multiple-Output (MIMO) radar is an emerging technology that is attracting the attention of researchers. Unlike the traditional Single-Input Multiple-Output (SIMO) radar, which emits coherent waveforms to form a focused beam, the MIMO radar can transmit via its antennas multiple probing signals which are orthogonal (or incoherent) waveforms. These waveforms can be used to increase the spatial resolution of the MIMO radar system. The waveforms also affect the range and Doppler resolution, which can be characterized by the Ambiguity Function (AF).
In research on MIMO radar, optimal orthogonal waveform design is a crucial problem. In this paper, we propose a new method to obtain a near-optimal frequency hopping waveform set with low side lobes in autocorrelations and cross-correlations by optimizing a cost function constructed from the digitized ambiguity function. Some of the designed results are presented, and it has been observed that their correlation properties are better than others known in the literature. The simulation results and comparisons show that the proposed algorithm is more effective for the design of DFCWs with superior aperiodic correlation.

Keywords- MIMO Radar, Ambiguity Function, Waveform design, Cost Function, DFCW.

#### I. INTRODUCTION

Multiple Input Multiple Output (MIMO) operation has advanced communication significantly in the past two decades. Recently the idea has been introduced to radar to improve system performance through diversity. MIMO radar is supposed to transmit multiple independent waveforms at the transmit end so that receivers can separate them and thereby achieve more degrees of freedom in signal processing. In this kind of MIMO radar, waveforms are required to have good side lobe performance in both Autocorrelation Functions (ACFs) and Cross-correlation Functions (CCFs). Orthogonal waveform design is therefore an important topic in MIMO radar study. The main idea in these papers is to reduce the side lobe levels in both the ACFs and CCFs by an optimization method.

The choice of transmitter waveforms plays an important role in the resolution characteristics of radar. In the traditional SIMO radar system, the radar receiver uses a matched filter to extract the target signal from thermal noise. Consequently, the resolution of the radar system is determined by the response to a point target in the matched filter output. Such a response can be characterized by a function called the ambiguity function [6]. San Antonio, et al.
[2] have extended the radar ambiguity function to the MIMO radar case. It turns out that the radar waveforms affect not only the range and Doppler resolution but also the angular resolution. The MIMO radar waveform design problem is to choose a set of waveforms which provides a desirable MIMO ambiguity function.

The pulse waveforms generated by frequency hopping codes are considered in this paper. These pulses have the advantage of constant modulus. Recently, Chen and Vaidyanathan [4] have dealt with the design of MIMO frequency-hopping codes based on the optimization of a newly formulated MIMO ambiguity function (derived from [2]). Their design approach is to first parameterize these waveforms and then apply simulated annealing to find a near-optimal set of parameters using a 'cost function' that allows the comparison of different parameter sets. The hit-array was first studied in [5] as an analysis tool for frequency-hopping codes in the SIMO radar context. The hit-array [5] corresponds to a digitized version of the ambiguity function which is relatively simple to compute.

In this paper, we generalize the hit-array to the hit-matrix, which is applicable to frequency-hopping codes for MIMO radar. We propose using numerical optimization techniques to search for the best frequency-hopping codes to obtain good system resolutions, using a measure of hit-matrix quality as the cost function to be minimized. In Section II, we present our MIMO radar model and describe frequency-hopping waveforms. We discuss the different DFCWs in Section III. In Section IV, we formulate the hit-matrix, calculate a cost function based on it and describe how we apply simulated annealing to find good codes. Simulation results are provided in Section V and Section VI concludes the paper.

#### II. SYSTEM MODEL

#### A. MIMO Radar Model

Consider a monostatic MIMO radar that contains M transmitters and N receivers with their antennas configured as uniform linear arrays, as shown in Fig.
1. We assume a point target and also that the target, transmitters and receivers lie in the same 2-D plane. Let $d_T$ and $d_R$ represent the spacing between consecutive transmitters and receivers respectively, and let $\gamma = d_T / d_R$. We define the spatial frequency of the target as

$$f = \frac{d_R \sin(\theta)}{\lambda} \tag{1}$$

where $\theta$ is the target angle with respect to the broadside direction and $\lambda$ is the wavelength of the RF carrier of the transmitted waveforms. Let $\tau$ and $\nu$ be the target delay (which is a measure of target range) and Doppler frequency (a measure of target velocity), respectively. The spatial frequency f corresponds to the angular location of the target with respect to the arrays of the radar. Let $\{u_m(t)\}$, $m \in \{0, \ldots, M-1\}$, represent the M transmitter waveforms. Then, the waveform received at the $n^{th}$ receiver antenna can be expressed as [5]

$$y_n(t)|_{\tau,\nu,f} \approx \sum_{m=0}^{M-1} u_m(t-\tau)\, e^{j2\pi\nu t}\, e^{j2\pi f(\gamma m+n)} \tag{2}$$

Fig. 1: Transmitters and Receivers in MIMO radar

#### B. Frequency-hopping waveforms

Frequency-hopping signals are good candidates for the radar waveforms because they are easily generated and have constant modulus. The waveforms can be represented as (see Fig. 2)

$$u_m(t) = \sum_{l=0}^{L-1} \phi_m(t - T_l) \tag{3}$$

where

$$\phi_m(t) = \sum_{q=0}^{Q-1} e^{j2\pi c_{m,q}\Delta f t}\, s(t - q\Delta t) \tag{4}$$

$$s(t) = \begin{cases} 1 & \text{if } 0 < t < \Delta t \\ 0 & \text{otherwise} \end{cases} \tag{5}$$

Here, $c_{m,q}$ is the $(m, q)^{th}$ element of the matrix $[C]_{M\times Q}$ and it can assume values from the set $\{1,\ldots,K\}$, where K is the total number of frequency hops available.

Fig. 2: The structure of frequency-hopping waveforms

As shown in Fig. 2, each transmitter waveform $u_m(t)$ consists of a stream of L identical pulses $\phi_m(t)$.
Each pulse in turn contains Q constant-amplitude frequency subpulses, each having width $\Delta t$ and frequency $c_{m,q}\Delta f$. Additionally, we impose the following conditions for orthogonality [4]:

$$\Delta f \Delta t = 1 \tag{6}$$

$$c_{m,q} \neq c_{m',q}, \quad \forall m \neq m', \ \forall q$$

Orthogonal waveforms result in a uniform beam pattern in all directions, which is a key aspect of detection using MIMO radars. For fixed $\Delta t$ and $\Delta f$ and pulse spacing $(T_0, T_1, \ldots, T_{L-1})$, these waveforms can be completely described by the code matrix

$$C = [c_{m,q}]_{M \times Q} \tag{7}$$

#### III. DISCRETE FREQUENCY CODING WAVEFORMS

In this section, the different Discrete Frequency Coding Waveforms are illustrated.

#### A. DFCW-FF

The DFCW-FF set consisting of L frequency-hopping waveforms can be represented as

$$s_{l-FF}(t) = \sum_{n=0}^{N-1} p_{n-FF}^{l}(t - nT), \quad l = 1, 2, \ldots, L \tag{8}$$

where

$$p_{n-FF}^{l}(t) = \begin{cases} e^{j2\pi f_n^l t}, & 0 \le t \le T \\ 0, & \text{elsewhere} \end{cases} \tag{9}$$

T is the time duration of a subpulse, N is the number of contiguous subpulses, $f_n^l = n\Delta f$ is the coding frequency of subpulse n of waveform l in the DFCW-FF set, and $\Delta f$ is the frequency step. A coding frequency sequence $\{f_n\} = \{n_1\Delta f, n_2\Delta f, n_3\Delta f, \dots, n_N\Delta f\}$ can be represented by the coefficient sequence $\{n_1, n_2, n_3, \dots, n_N\}$, which gives the firing order of the frequencies and is a unique permutation of the sequence $\{0, 1, 2, \dots, N-1\}$.

#### B. DFCW-LFM

By adding LFM to the DFCW-FF set, we get the DFCW-LFM set as

$$s_{l-LFM}(t) = \sum_{n=0}^{N-1} p_{n-FF}^{l}(t-nT) \cdot e^{jk\pi t^{2}}, \quad l = 1, 2, \ldots, L \tag{10}$$

where k is the frequency slope, related to the bandwidth of the signal pulse by B = kT.
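
To make equations (8) to (10) concrete, the following sketch samples one DFCW waveform from its firing-order sequence. It is only an illustration under stated assumptions: the function name, the sampling grid and the example parameters are hypothetical, and the LFM factor $e^{jk\pi t^2}$ is applied with the absolute time $t$, which is one literal reading of equation (10) (applying the chirp with the time measured inside each subpulse would be another).

```python
import cmath
import math


def dfcw_samples(order, delta_f, subpulse_duration, samples_per_subpulse, slope=0.0):
    """Sample one DFCW waveform.

    order: firing order {n_1, ..., n_N}, a permutation of 0..N-1.
    slope: frequency slope k; 0 gives DFCW-FF, nonzero adds the LFM term of (10).
    """
    if sorted(order) != list(range(len(order))):
        raise ValueError("order must be a permutation of 0..N-1")
    dt = subpulse_duration / samples_per_subpulse
    samples = []
    for n, coefficient in enumerate(order):
        f_n = coefficient * delta_f  # coding frequency of subpulse n
        for i in range(samples_per_subpulse):
            local_t = i * dt  # argument of p_n(t - nT) in equation (9)
            t = n * subpulse_duration + local_t  # absolute time
            phase = 2.0 * math.pi * f_n * local_t + slope * math.pi * t * t
            samples.append(cmath.exp(1j * phase))
    return samples


if __name__ == "__main__":
    # Assumed example: N = 4 subpulses, T = 1, delta_f = 1 / T.
    waveform = dfcw_samples([0, 2, 1, 3], 1.0, 1.0, 8, slope=0.5)
    print(len(waveform), max(abs(abs(x) - 1.0) for x in waveform))
```

Every sample is a unit-magnitude complex exponential, which is the constant-modulus property mentioned for frequency-hopping waveforms.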
The delay-Doppler AF for waveform $l$ in the DFCW-FF set is defined as

$$\chi_{l-FF}(\tau,\xi) = \frac{1}{NT} \int_{-\infty}^{\infty} s_{l-FF}^{*}(t) s_{l-FF}(t-\tau) e^{j2\pi\xi t} dt \tag{11}$$

The cross-AF of $p_{n-FF}^l(t)$ and $p_{m-FF}^l(t)$ for waveform $l$ can be defined as

$$\phi_{nm-FF}^{l}(\tau,\xi) = \frac{1}{T} \int_{-\infty}^{\infty} p_{n-FF}^{l*}(t) p_{m-FF}^{l}(t-\tau) e^{j2\pi\xi t} dt \tag{12}$$

In DFCW\_LFM, adding LFM to a DFCW\_FF signal modifies its AF according to a simple rule [6]

$$\phi_{nm-LFM}(\tau,\xi) = \phi_{nm-FF}(\tau,\xi + k\tau) \tag{13}$$

#### IV. OPTIMIZATION OF DFCW

First we derive the hit-matrix for the code matrix $C$ and calculate the cost function $g_p(C)$ based on that hit-matrix. Then we apply simulated annealing using $g_p(C)$ to find well-optimized frequency-hopping codes.

#### A. The Hit-Matrix

The hit-array has been introduced as a tool to analyze frequency-hopping waveforms in [5]. The central concept in this formulation is that of a "hit", which occurs when the received pattern has been shifted in the time-frequency space in such a way that it overlaps with the original pattern at exactly one time-frequency position. In this section, we extend the hit-array to the hit-matrix, which is applicable to frequency-hopping codes for MIMO radar under the large Doppler scenario. We define the hit-matrix [1] for the code matrix $C$ as

$$H = [h_{k,l}]_{(2Q-1)\times(2K-1)}, \quad -Q < k < Q, \quad -K < l < K, \quad k, l \in \mathbb{Z} \tag{14}$$

where

$$h_{k,l} = \sum_{m=0}^{M-1} \sum_{m'=0}^{M-1} \sum_{q=0}^{Q-|k|-1} \delta[c_{m,q} - c_{m',(q+|k|)} + l \cdot \operatorname{sgn}(k)] \tag{15}$$

Here $\delta[\cdot]$
refers to the Kronecker delta function and

$$\operatorname{sgn}(k) = \begin{cases} 1 & k \ge 0 \\ -1 & \text{otherwise} \end{cases} \tag{16}$$

It is intuitively simpler to express the hit-matrix as follows

$$H = \sum_{m=0}^{M-1} \sum_{m'=0}^{M-1} \widehat{H}(m, m') \tag{17}$$

where

$$\widehat{H}(m, m') = [\widehat{h}_{k,l}(m, m')]_{(2Q-1)\times(2K-1)} \tag{18}$$

and

$$\widehat{h}_{k,l}(m,m') = \sum_{q=0}^{Q-|k|-1} \delta[c_{m,q} - c_{m',(q+|k|)} + l \cdot \operatorname{sgn}(k)] \tag{19}$$

$\widehat{H}(m, m')$ is called the cross-hit array and it contains information about the hits occurring between the waveforms $u_m(t)$ and $u_{m'}(t)$. We shall now describe how the hit-matrix of a frequency-hopping code relates to its ambiguity function. The MIMO radar ambiguity function can also be expressed as

$$\chi(\tau, \nu, f, f') = \Omega(\tau, \nu, f, f') \left[ \sum_{l=0}^{L-1} e^{j2\pi \nu T_l} \right] \tag{20}$$

where

$$\Omega(\tau, \nu, f, f') = \sum_{m=0}^{M-1} \sum_{m'=0}^{M-1} \sum_{q=0}^{Q-1} \sum_{q'=0}^{Q-1} G_{m,m',q,q'}(\tau, \nu) e^{j2\pi(fm-f'm')\gamma} \tag{21}$$

and

$$G_{m,m',q,q'}(\tau,\nu) = \chi_{rect}(\tau + (q - q')\Delta t, \nu + (c_{m,q} - c_{m',q'})\Delta f) e^{j2\pi(\nu + (c_{m,q} - c_{m',q'})\Delta f)q\Delta t} e^{-j2\pi c_{m',q}\Delta f\tau} \tag{22}$$

#### B. Cost Function

We now describe how frequency-hopping codes can be optimized under the large Doppler scenario to yield a desirable ambiguity function. Since the second product term in the MIMO radar ambiguity function does not depend on the choice of code matrix $C$, we only concern ourselves with the optimization of the first term $\Omega(\tau, \nu, f, f')$. To apply heuristic search algorithms like simulated annealing, we require a cost function that allows the desirability of different codes to be compared. The cost function based on the hit-matrix is defined as

$$g_p(C) = \sum_{k=-Q+1}^{Q-1} \sum_{l=-K+1}^{K-1} (h_{k,l})^p \tag{23}$$

The hit-matrix contains a significant amount of information about the nature of the ambiguity function.
This allows the heuristic search algorithms using this cost function to rapidly traverse the code space, thereby allowing good codes to be found faster. Increasing the value of $p$ increases the penalty on higher side lobes.

#### C. Optimization Algorithm

We now describe how we apply simulated annealing using $g_p(C)$. We use a slightly modified form of simulated annealing called Quantum Simulated Annealing (QSA), which allows faster convergence. The steps of the algorithm are as follows.

- 1) Randomly draw a code matrix $C$ from $\{0, 1, \ldots, K-1\}^{MQ}$ such that the code is orthogonal, that is, $c_{m,q} \neq c_{m',q}$ for $m \neq m'$.
- 2) Randomly draw $j$ from $\{1, 2, \ldots, J\}$.
- 3) Set $C' = C$, and repeat steps 3(a) to 3(c) $j$ times.
  - a) Randomly draw $m$ from $\{0, \ldots, M-1\}$ and $q$ from $\{0, \ldots, Q-1\}$.
  - b) Select $k$ from $\{0, \ldots, K-1\}$ such that $k \notin \{c'_{m',q}, \forall m'\}$.
  - c) Set $c'_{m,q} = k$.
- 4) Randomly draw $U$ from $[0, 1]$.
- 5) If $U < \exp((g_p(C) - g_p(C'))/T)$, then set $C = C'$.
- 6) Set $T = \alpha T$ and $J = \beta J$.
- 7) If a sufficiently small value of $g_p(C)$ has been obtained, terminate the algorithm. Otherwise, return to step 2.

The parameters of the above algorithm are the temperature ($T$), the rate of decrease of temperature ($\alpha$), the jump size ($J$) and the rate of decrease of jump size ($\beta$). The algorithm is initialized with $T > 0$ and $J > 0$, with $\alpha$ and $\beta$ chosen from $(0, 1)$.

#### V. SIMULATION RESULTS

Based on the optimization algorithm described in Section IV, frequency-coding waveform sets with arbitrary waveform lengths and various numbers of waveforms can be designed. We have designed sequences of many different lengths using the Quantum Simulated Annealing (QSA) method, but present only the correlation properties of sequences of length 32 for comparison with Liu's sequences [3].
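The hit-matrix of Eq. (15), the cost of Eq. (23) and the annealing steps of Section IV can be sketched together as below. This is a minimal illustration under stated assumptions and not the authors' implementation: the function names, the default parameter values, the fixed iteration budget used in place of the stopping rule of step 7, and the best-so-far bookkeeping are all hypothetical additions. Codes are taken from $\{0, \ldots, K-1\}$ as in step 1, and $K > M$ is assumed so that step 3(b) always has a free hop to choose.

```python
import math
import random

def hit_matrix(C, K):
    """Hit-matrix of Eq. (15); entry h[k][l] is stored at H[k + Q - 1][l + K - 1]."""
    M, Q = len(C), len(C[0])
    H = [[0] * (2 * K - 1) for _ in range(2 * Q - 1)]
    for k in range(-Q + 1, Q):
        s = 1 if k >= 0 else -1                      # sgn(k) of Eq. (16)
        for m in range(M):
            for mp in range(M):
                for q in range(Q - abs(k)):
                    # The delta fires for the single l with c - c' + l*sgn(k) = 0.
                    l = -s * (C[m][q] - C[mp][q + abs(k)])
                    H[k + Q - 1][l + K - 1] += 1
    return H

def cost(C, K, p=2):
    """Cost g_p(C) of Eq. (23)."""
    return sum(h ** p for row in hit_matrix(C, K) for h in row)

def random_code(M, Q, K, rng):
    """Step 1: each column holds M distinct hops, so the code is orthogonal."""
    cols = [rng.sample(range(K), M) for _ in range(Q)]
    return [[cols[q][m] for q in range(Q)] for m in range(M)]

def anneal(C, K, p=2, T=10.0, alpha=0.95, J=4.0, beta=0.99, iters=200, seed=0):
    """Steps 2 to 6, repeated for a fixed budget (hypothetical stopping rule)."""
    rng = random.Random(seed)
    M, Q = len(C), len(C[0])
    g = cost(C, K, p)
    best, best_g = [r[:] for r in C], g              # bookkeeping, not in the paper
    for _ in range(iters):
        j = rng.randint(1, max(1, int(J)))           # step 2
        Cp = [r[:] for r in C]                       # step 3
        for _ in range(j):
            m, q = rng.randrange(M), rng.randrange(Q)
            used = {Cp[i][q] for i in range(M)}      # hops already in column q
            Cp[m][q] = rng.choice([k for k in range(K) if k not in used])
        gp = cost(Cp, K, p)
        # Steps 4-5: Metropolis test; downhill moves are always accepted.
        if gp <= g or rng.random() < math.exp((g - gp) / T):
            C, g = Cp, gp
            if g < best_g:
                best, best_g = [r[:] for r in C], g
        T, J = alpha * T, beta * J                   # step 6
    return best, best_g
```

As a small check of the indexing, the single-waveform code `[[0, 1]]` with `K = 2` gives the hit-matrix `[[1, 0, 0], [0, 2, 0], [0, 0, 1]]`, where the centre entry is the trivial main-peak term. Recomputing the full cost at every iteration is the simplest choice; an incremental update of the affected hit-matrix entries would be faster for codes of the size used in the tables.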
In this paper all the Autocorrelation Sidelobe Peak (ASP) and Cross-correlation Peak (CP) values are normalized with respect to the sequence length. Table I lists the three code sequences for DFCW\_FF with length N = 32 and L = 3. The ASPs and CPs for these sequences are shown in Table II. Table III shows the comparison between our results and Liu's results. From Table III, it is observed that our designed max CP and mean CP are lower than those of Liu's sequences of the same length [3]. Table IV lists three coding sequences for DFCW\_LFM with length N = 32, L = 3, $T\Delta f = 3$, and TB = 72. The decrease in the cost with the number of simulated annealing iterations for DFCW\_FF and DFCW\_LFM is shown in Fig. 3 and Fig. 5, and the aperiodic autocorrelation and cross-correlation functions for the three sequences of DFCW\_FF and DFCW\_LFM are displayed in Fig. 4 and Fig. 6 respectively. The ASPs and CPs of DFCW\_LFM are listed in Table V. Table VI shows the comparison between DFCW\_LFM and DFCW\_FF. From Table VI, it is observed that DFCW\_LFM has a much lower ASP than DFCW\_FF of the same length [3]. The mean ASP of the DFCW\_LFM is about -24.31 dB, which is about 13 dB lower than that of the DFCW\_FF.

#### VI. CONCLUSION

This paper presented the hit-matrix as an analysis tool for frequency-hopping waveforms for MIMO radar under the large Doppler scenario. Cost functions based on the digitized ambiguity function have been presented and used with quantum simulated annealing to obtain well-optimized frequency-hopping waveforms for orthogonal MIMO radar. In the comparison above, this optimization approach yields lower cross-correlation peaks than the existing sequences of [3]. The approach provides an alternative and powerful tool for the design of multiple orthogonal discrete frequency-coding sequences with good aperiodic correlation.
Comparing the simulated results of DFCW\_FF and DFCW\_LFM shows that replacing the fixed-frequency individual pulses with LFM pulses can mitigate the ASP, as well as nullify the grating-lobe problem. Imposing two specific relationships on the two signal parameters ($T\Delta f$ and TB) results in mitigation of the ASP. As the sequence length grows, the ASP and CP decrease. A larger code length allows more degrees of freedom in the optimization of the cost function and thus can result in an improved DFCW\_LFM.

TABLE I DISCRETE FREQUENCY CODING SEQUENCES WITH FIXED FREQUENCY PULSES WITH N = 32, L = 3 AND K = 288

| Code | Sequences | | | | | | | |
|------|-----|-----|-----|-----|-----|-----|-----|-----|
| 1 | 41 | 231 | 145 | 75 | 262 | 109 | 109 | 175 |
| | 62 | 99 | 149 | 269 | 281 | 160 | 272 | 26 |
| | 52 | 28 | 85 | 32 | 58 | 35 | 104 | 86 |
| | 36 | 169 | 15 | 228 | 286 | 43 | 54 | 108 |
| 2 | 134 | 101 | 186 | 61 | 280 | 7 | 193 | 266 |
| | 238 | 231 | 258 | 15 | 125 | 126 | 191 | 141 |
| | 49 | 127 | 26 | 90 | 19 | 221 | 1 | 282 |
| | 210 | 216 | 31 | 270 | 145 | 39 | 257 | 258 |
| 3 | 143 | 258 | 194 | 9 | 31 | 218 | 155 | 276 |
| | 61 | 170 | 58 | 129 | 177 | 145 | 227 | 215 |
| | 264 | 184 | 278 | 281 | 276 | 66 | 29 | 174 |
| | 285 | 94 | 20 | 287 | 141 | 257 | 122 | 264 |

Fig. 3: Decrease in cost function versus iterations of simulated annealing for DFCW\_FF

Fig.
4: Autocorrelation and cross-correlation functions of DFCW\_FF sequences with code length N = 32 and set size L = 3 with K = 288

TABLE II ASP AND CP OF THE DESIGNED DFCW\_FF SET WITH N = 32, L = 3 AND K = 288

| | Code 1 | Code 2 | Code 3 |
|--------|--------|--------|--------|
| Code 1 | 0.2989 | 0.0512 | 0.0575 |
| Code 2 | 0.0512 | 0.2628 | 0.0441 |
| Code 3 | 0.0575 | 0.0441 | 0.2389 |

TABLE III COMPARISON BETWEEN VALUES IN [3] AND OUR DESIGNED VALUES WITH N = 32, L = 3 AND K = 288

| | Max ASP (dB) | Mean ASP (dB) | Max CP (dB) | Mean CP (dB) |
|---------------------|--------|--------|--------|--------|
| Literature values | -13.86 | -13.90 | -23.02 | -23.25 |
| Our designed values | -10.48 | -11.47 | -24.80 | -25.86 |

TABLE IV DISCRETE FREQUENCY CODING SEQUENCES WITH LFM PULSES WITH N = 32, L = 3, K = 288, $T\Delta f = 3$ AND TB = 72

| Code | Sequences | | | | | | | |
|------|-----|-----|-----|-----|-----|-----|-----|-----|
| 1 | 263 | 201 | 154 | 105 | 271 | 45 | 264 | 62 |
| | 280 | 223 | 12 | 62 | 34 | 48 | 272 | 102 |
| | 274 | 93 | 175 | 134 | 11 | 52 | 14 | 281 |
| | 28 | 264 | 2 | 213 | 286 | 270 | 154 | 43 |
| 2 | 106 | 260 | 261 | 1 | 76 | 211 | 115 | 141 |
| | 175 | 56 | 269 | 12 | 256 | 272 | 26 | 263 |
| | 143 | 101 | 59 | 288 | 117 | 239 | 3 | 204 |
| | 173 | 266 | 45 | 144 | 264 | 152 | 176 | 104 |
| 3 | 65 | 261 | 91 | 128 | 4 | 10 | 87 | 250 |
| | 86 | 284 | 147 | 52 | 190 | 139 | 179 | 43 |
| | 221 | 70 | 76 | 2 | 156 | 180 | 118 | 195 |
| | 284 | 285 | 56 | 287 | 86 | 98 | 188 | 118 |

Fig. 5: Decrease in cost function versus iterations of simulated annealing for DFCW\_LFM

Fig.
6: Autocorrelation and cross-correlation functions of DFCW\_LFM sequences with code length N = 32 and set size L = 3 with K = 288

TABLE V ASP AND CP OF THE DESIGNED DFCW\_LFM SET WITH N = 32, L = 3, K = 288, $T\Delta f = 3$ AND TB = 72

| | Code 1 | Code 2 | Code 3 |
|--------|--------|--------|--------|
| Code 1 | 0.0591 | 0.0625 | 0.0581 |
| Code 2 | 0.0625 | 0.0612 | 0.0734 |
| Code 3 | 0.0581 | 0.0734 | 0.0622 |

TABLE VI COMPARISON BETWEEN DFCW\_FF [3] AND DFCW\_LFM WITH N = 32, L = 3, K = 288, $T\Delta f = 3$ AND TB = 72

| | Max ASP (dB) | Mean ASP (dB) | Max CP (dB) | Mean CP (dB) |
|-----------|--------|--------|--------|--------|
| DFCW\_FF | -10.48 | -11.47 | -24.80 | -25.86 |
| DFCW\_LFM | -24.12 | -24.31 | -22.68 | -23.78 |

#### REFERENCES

- [1] S. Badrinath, Anand Srinivas and V. U. Reddy, "Low complexity design of frequency hopping codes for MIMO radar for arbitrary Doppler," submitted to *EURASIP Journal on Advances in Signal Processing*, Feb. 2010.
- [2] G. San Antonio, D. R. Fuhrmann, and F. C. Robey, "MIMO Radar Ambiguity Functions," *IEEE Journal of Selected Topics in Signal Processing*, vol. 1, pp. 167-177, Jun. 2007.
- [3] Bo Liu and Zishu He, "Optimization of discrete frequency coding waveform for MIMO Radar," *IEEE Int. Conf. Commun., Circuits and Systems (ICCCAS'07)*, Kokura, Fukuoka, Japan, June 2007, 2:966-970.
- [4] Chun-Yang Chen and P. P. Vaidyanathan, "MIMO Radar Ambiguity Properties and Optimization Using Frequency-Hopping Waveforms," *IEEE Trans. on Signal Processing*, pp. 5926-5936, Dec. 2008.
- [5] J. R. Bellegarda, S. V. Maric and E. L. Titlebaum, "The hit array: a synthesis tool for multiple access frequency hop signals," *IEEE Trans. on Aerospace and Electronic Systems*, vol. 29, no. 3, pp. 624-635, Jul. 1993.
- [6] N. Levanon and E.
Mozeson, *Radar Signals*, Wiley-IEEE Press, 2004

♦♦♦

# Status Monitoring and Controlling of Doppler Weather Radar using RC Software

#### Mahalakshmi N.C

The Oxford College Of Engineering, Bommanahalli, Bangalore-68

E-mail: Mahalakshmi.nbc@gmail.com

Abstract - Doppler Weather Radar is a long-range surveillance system that measures rainfall intensity during severe weather conditions and gives information about the severity of cyclones. This paper presents the design and development of Radar Controller (RC) software for Doppler Weather Radar (DWR). It helps to control DWR operation remotely using a web server and the JNLP protocol. RC software supports the operator in smooth operation, radar calibration and easy maintenance of the radar. The operating mode, scan commands and parameters for the major subsystems can be selected from the RC. RC generates a volume header packet and communicates it to all subsystems through Ethernet to configure them for the selected mode of operation. During the scan operation, RC monitors the status of the subsystems continuously. If the status is not OK, the scan operation is terminated. RC initiates the archival of the products and Near Real Time product generation at the end of the scan operation. RC facilitates round-the-clock continuous weather monitoring through the Scheduler mode of radar operation.

Keywords: Real-time archiving, Doppler Weather Radars, Networking, Data archival and retrieval techniques, Radar Controller.

#### I. INTRODUCTION

The Doppler Weather Radar system is the result of the indigenous design and development activity carried out at RDA/ISTRAC and is installed at Sriharikota for IMD. Radar [1] operates by radiating electromagnetic energy and detecting the echo returned from reflecting objects (targets). Radar can detect relatively small targets at near or far distances and can measure their range with precision in all weather, which is its chief advantage when compared with other sensors.
Radar has also seen significant civil applications: the safe travel of aircraft, ships, and spacecraft; the remote sensing of the environment, especially the weather; and many others. A simple diagram depicting the working of radar is shown below.

Fig 1: Simple working of Radar

Doppler Principle: When the source of a signal and the observer are in relative motion, the observer sees a change in frequency. If the observer and the source are moving closer, the frequency increases, and vice versa. This effect was first described by the Austrian physicist Christian Doppler and is named after him. Weather radars working on this phenomenon are called Doppler Weather Radars.

Doppler Weather Radars are coherent pulsed radars and provide information about the intensity and internal velocity of the different hydrometeors in a severe weather system. The difference between the Doppler Weather Radar (DWR) and other types of Pulse Doppler Radars emanates basically from the nature of the target. The target for weather radar, being hydrometeors, is distributed in nature and generally fills the radar beam. The dynamic parameters of the target are measured by sampling the process at the PRF (Pulse Repetition Frequency) of the radar. The return signal echoes from the target, sampled at the PRF, thus form a time series containing useful information about the dynamic properties of the target. The DWR basically estimates the three base products, viz., Reflectivity (Z), Radial Velocity (V), and spectrum width $(\sigma)$, as a function of range. From these three parameters, advanced data products are generated to meet the different hydrometeorological applications. The received power is a measure of precipitation [2][4].

Radar subsystem configuration plays an important role in any radar operation. The various subsystems of a DWR include the Transmitter, Servo, Coherent Signal Generator, Signal Processor etc.
Configuring and monitoring each subsystem individually takes considerable time. To avoid this, the RC application software lets the operator configure and monitor all the subsystems together and provides the overall status information, which speeds up operation. Our work primarily focuses on developing software for the DWR. In this paper, we discuss the importance of RC software, its applicability in Doppler Weather Radars, its availability to the users and also the level of authentication provided according to the needs of the users.

#### II. BACKGROUND DETAILS

Based on the requirements of target detection, radars are classified to work on various bands such as S-Band, C-Band, X-Band, Ka-Band etc. Our paper focuses on a Doppler Weather Radar which operates at a frequency of 2-4 GHz, covering a range of 400 km, with a wavelength of 8-15 cm, thus making it useful for deployment in the coastal regions of India. Primarily, the radar consists of a transmitter to generate the microwave signal, an antenna to send the signal out to space and to receive the energy scattered (echoes) by targets around it, a receiver to detect and process the received signals by means of processors, and a display to graphically present the signal in usable form.

In the radar with klystron transmitter shown in Figure 2, an RF carrier signal is generated in the receiver and fed to the klystron amplifier. The transmitter generates a microwave pulse by means of a pulse-modulated amplification of the carrier signal by the klystron. The microwave pulse is routed through the duplexer and radiated by the antenna. During this transmit phase the receiver is blocked by the T/R limiter of the duplexer, which prevents leakage from the circulator of the duplexer to the highly sensitive receiver input stages. The antenna emits the transmitter pulse in a symmetrical pencil beam [5].
The atmosphere around the radar is scanned by moving the antenna in azimuth and elevation following meteorological scanning strategies. After the transmit pulse is terminated, the T/R limiter extinguishes and thus connects the receiver via the circulator to the antenna. The receive phase starts and the receiver starts acquiring the signals scattered by the targets. This phase lasts until the next pulse is transmitted. Due to its high sensitivity and its large dynamic range, the receiver is capable of detecting distant clear-air echoes as well as strong signals from close thunderstorms. After receipt of the echoes, the processing units convert those signals to the required products for presentation on indicators/displays. The block diagram for weather radar is shown in the figure below.

Fig 2: Basic weather radar Block diagram

## III. RADAR CONTROLLER AND SIGNAL PROCESSING

**A. Radar controller:** In modern radars, controlling the function of each sub-unit and processing the signals are done by dedicated computers incorporated within the system. The radar control processor is responsible for the control and supervision of the radar system. The states of a large number of subsystem parameters are monitored, and if a fault is detected the control processor acts according to its severity. RC supports the operator in smooth operation, quick calibration and easy maintenance of the radar. RC helps the operator to select the desired operating mode. For the selected mode, RC guides the operator in selecting the system and subsystem parameters. RC generates a volume header, providing information about the chosen mode of operation and parameters to the other related subsystems such as the Digital Signal Processor and workstations. It interfaces with the major subsystems of the radar, such as the Transmitter, Receiver, Angle Servo and Simulator. It communicates with all these subsystems and sets their required parameters for the chosen mode.

**B.
Signal Processing:** Weather radars employ a high dynamic-range linear receiver and DSPs (digital signal processors) to extract information from the received echo power. The linear receiver output, at intermediate frequency (IF) and in analog form, is converted to digital form in the analog-to-digital converter and fed to digital filters to split the power into in-phase (I) and quadrature (Q) components. The DSPs process the raw I/Q data, perform phase and amplitude correction, clutter filtering and covariance computation, and produce normalized results. These normalized results are tagged with angle information and headers and given out as a data set. Covariance computation is based on pulse pair processing. Intensity estimation consists simply of integrating the power in the linear channel ($I^2 + Q^2$) over range and azimuth. The resulting power estimate is corrected for system noise, atmospheric attenuation and transmitter power variations. The signal processing of the linear channel ends with the estimation of reflectivity, mean radial velocity and velocity spectrum width.

Fig 3: Signal processing and Product Generation

**C. Meta Data:** A number of products are available from weather radars. From non-Doppler radars, information on the reflectivity factor alone is available, whereas from DWRs, in addition to reflectivity, radial velocity and spectrum width information are also available as base data. These data can be used directly for base-product display and also for deriving further products based on standard algorithms. A few products which are commonly used in operational meteorology are briefly described here.

#### Base products

Reflectivity factor (*Z*), radial velocity (*V*) and velocity spectrum width (*W*) are the three base data directly observed/measured by the radar.
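The pulse pair (covariance) processing named in the Signal Processing subsection can be sketched as follows. This is a textbook-style illustration rather than the signal processor's actual code: the function names are hypothetical, a Gaussian Doppler spectrum is assumed for the width estimate, the sign convention (a positive phase advance between pulses maps to a positive velocity) is a choice, and noise correction is left out.

```python
import cmath
import math

def autocorr(x, lag):
    """Lag-`lag` autocorrelation estimate of a complex I/Q time series."""
    n = len(x) - lag
    return sum(x[i].conjugate() * x[i + lag] for i in range(n)) / n

def pulse_pair(x, wavelength, prf):
    """Return (power, mean radial velocity, spectrum width) from I/Q samples.

    x          : complex samples I + jQ of one range gate, one per pulse
    wavelength : radar wavelength in metres
    prf        : pulse repetition frequency in Hz
    """
    ts = 1.0 / prf                                   # pulse repetition time
    r0 = autocorr(x, 0).real                         # mean of I^2 + Q^2
    r1 = autocorr(x, 1)
    r2 = autocorr(x, 2)
    # Mean velocity from the phase of the first-lag autocorrelation.
    velocity = wavelength / (4 * math.pi * ts) * cmath.phase(r1)
    # Width from |R1| / |R2| under the Gaussian-spectrum assumption;
    # the max() guards against a slightly negative logarithm from rounding.
    log_ratio = max(math.log(abs(r1) / abs(r2)), 0.0)
    width = wavelength / (2 * math.pi * ts * math.sqrt(6)) * math.sqrt(log_ratio)
    return r0, velocity, width
```

With a wavelength of 0.1 m and a PRF of 1000 Hz, a noise-free echo with a 100 Hz Doppler shift should come out at a radial velocity of 5 m/s with a spectrum width close to zero, since the unambiguous velocity at these settings is $\lambda \cdot \mathrm{PRF} / 4 = 25$ m/s.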
The radar reflectivity factor is defined as

$$Z = 10\log_{10}\left[\frac{\sum_i N_i D_i^6}{1\,\mathrm{mm^6/m^3}}\right] \tag{1}$$

where $N_i$ is the number of droplets with diameters between $D_i$ and $D_i + \delta D$ present in a unit volume of the sample being probed, $\delta D$ being the diameter interval used in making the measurement. For the conditions prevailing in most weather systems and for the wavelengths used, the scattered power received back is directly proportional to Z (derived by Lord Rayleigh in the 1870s). Hence the weather signal power available at the receiver output is a direct measure of Z. Autocorrelation of the time series formed by the received power spectrum is the basis for deriving Z and the other Doppler moments. The zeroth-lag autocorrelation R0 of the time series is proportional to the weather signal power, and hence Z is computed from it. The mean radial velocity of the hydrometeors inside the sample volume is given by the first-lag autocorrelation R1, and the velocity spread inside the sample volume is obtained from the first- and second-lag autocorrelations R1 and R2 together, assuming a Gaussian distribution. The base data available from DWRs are generally displayed in the following formats.

PPI: The PPI(Z) is quite similar to the conventional radar scope display of Z for a given elevation at all azimuth values, with colour-coded schemes for display and storage in digital form. This display is possible for all elevations at which data are collected. This product is available for display immediately on completion of the scan. This is the most widely used form of weather radar display. A typical PPI(Z) display is depicted in the figure below.

Fig: PPI (Z) data product plot

#### IV. PROPOSED SYSTEM

RC can be considered the heart of the radar system. In the first stage of the working model, RC sends a Status Request packet to all the subsystems in Broadcast mode. Various health parameters of each subsystem are then received by the RC. Each subsystem sends data with its specific IP address and port number.
This information is combined to make the respective Packet Headers of the subsystems. RC combines all the status information along with the respective Packet Headers to form a Status Header and sends it to the operator. If the overall status of each subsystem is OK, the RC software creates the Volume Header; otherwise it raises an alert to stop the scan operation. RC reads the mode of operation selected by the operator and sends the corresponding Volume Header data in Broadcast mode to all subsystems, i.e. Transmitter, STALO & Corx, Simulator, SERVO, RSP and Workstations. RC sends the START command in Broadcast mode to all the subsystems, including the workstations, to configure them, and checks the elevation values to track the number of elevation scans completed. It then sends the STOP command to the subsystems in Broadcast mode at the end of the scan. After issuing the STOP command, RC sends the ARCHIVAL START command to the workstations in Broadcast mode and waits for the ARCHIVAL COMPLETION FLAG from the workstation. The archival workstation sends this flag with its specific IP address. If the ARCHIVAL COMPLETION FLAG has not been received within the specified time, RC gives a warning message [3]. The working of RC using the different headers is depicted in Fig. 4.

Fig 4. Working of RC with different Headers

An architecture for archiving the weather radar data set has been implemented. The current state of weather radar data archival is insufficient for researchers who want real-time access to the data. In this architecture, files are distributed over web servers and retrieved using HTTP URLs, so standard network protocols are used. Further, the system is extensible, as the files can be distributed across multiple servers (which need not all belong to the same organization). By isolating researchers from the burdens of data handling, the architecture lets them spend more time on developing experiments and using the large data set to its full potential.
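The RC scan sequence described at the start of this section (status check, Volume Header, START, STOP, archival with a completion timeout) can be sketched as a small function. Everything in the sketch is hypothetical: the function name, the message strings and the boolean that stands in for waiting on the ARCHIVAL COMPLETION FLAG are illustrative only, and the real packet formats, the Ethernet broadcast and the elevation tracking are not modelled.

```python
def run_scan(statuses, mode, archival_flag_received):
    """Return the ordered list of messages the RC would issue for one scan.

    statuses               : dict of subsystem name -> True if its status is OK
    mode                   : operating mode selected by the operator
    archival_flag_received : True if the archival workstation answered in time
                             (stand-in for a real timeout on the network link)
    """
    log = ["STATUS_REQUEST"]                         # broadcast status request
    faulty = sorted(name for name, ok in statuses.items() if not ok)
    if faulty:
        # Any subsystem that is not OK blocks the scan.
        log.append("ALERT: stop scan, faulty subsystems: " + ", ".join(faulty))
        return log
    log.append("VOLUME_HEADER " + mode)              # configure all subsystems
    log += ["START", "STOP", "ARCHIVAL_START"]       # scan, then archive
    if not archival_flag_received:
        log.append("WARNING: archival completion flag not received")
    return log
```

For a healthy system, `run_scan({"Transmitter": True, "SERVO": True, "RSP": True}, "volume-scan", True)` yields the sequence `STATUS_REQUEST`, `VOLUME_HEADER volume-scan`, `START`, `STOP`, `ARCHIVAL_START`; marking any subsystem as not OK ends the sequence with an alert instead of a Volume Header.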
We demonstrated the feasibility of the architecture by implementing it to archive weather radar data generated by the Radar Development Area Lab system. The valuable data is archived and can further be used by researchers and students. The base data will be archived onto the NAS system, which is a RAID-based system with hot standby. In case of failure of the primary, the other system will automatically take over the archival. The system will cater for a minimum of three months of base data storage. At the end of three months, data will be archived onto the tape archive (DAT/DLT). This is shown in Fig. 5.

Fig. 5 Architecture Diagram

#### V. EXPERIMENTAL RESULTS

The software described above has been successfully implemented in the RDA labs of ISTRAC, Bangalore, and has been tested using a simulator developed at the RDA labs. The results of the software were successfully observed using the user-friendly simulator and the GUIs developed for it.

#### VI. CONCLUSIONS

This paper is an initiative to enable radar operation over the internet. The initiative has two primary parts: the first is web monitoring and control of the Doppler Weather Radar, and the second is making the archived data available to the remote user. The radar parameters estimated at the radar site are delivered to any location that has internet connectivity and displayed either in real time or in archive mode. The radar console is also virtually transferred to the remote site. The graphical user interface allows remote users to control the radar system as well as the display options. The RC software helps the user to select, through a menu, operating parameters such as scan functions, basic radar waveform parameters such as the pulse width and PRF, and signal processing and product generation parameters. It communicates in real time with the DSP PCs, workstations and a server through the Ethernet interface, and sends commands to and receives status information from all subsystems through the RS-422 interface.
The Auxiliary RC generates the SH and DH parameters, which are transmitted to RC through the RSP. This project helps in controlling the functions of the DWR. Where remote diagnosis of the radar subsystems does not exist, the designer needs to go to the site and carry out the maintenance whenever subsystem problems occur. This consumes a considerable amount of time, during which the DWR comes to a halt and radar operation is affected.

#### VII. BIBLIOGRAPHY

1. Merrill I. Skolnik, "Radar Handbook", 3<sup>rd</sup> Edition, McGraw Hill Publications
2. Baldini, L., E. Gorgucci, and V. Chandrasekar, "Hydrometeor classification methodology for C-band polarimetric radars", Proceedings of ERAD, pp. 62-66, 2004
3. J. Shanmuga Sundari, *Documentation for Radar Controller*, ISTRAC, ISRO
4. Bringi, V.N., T.D. Keenan, and V. Chandrasekar, "Correcting C-band radar reflectivity and differential reflectivity data for rain attenuation: a self-consistent method with constraints", IEEE Trans. Geosci. Remote Sens., vol. 39, pp. 1906–1915, 2001
5. Doviak, R., and D. Zrnic, *Doppler Radar and Weather Observations*, 2nd edition, San Diego: Academic Press, 1993
6. Anagnostou, E. N., M. N. Anagnostou, W. F. Krajewski, A. Kruger, and B. J. Miriovsky, "High-resolution rainfall estimation from X-band polarimetric radar measurements", J. Hydrometeor., vol. 5, pp. 110–128, 2004

## ESTIMATION OF DEPTH MAP USING MOTION VECTOR FOR 2D TO 3D CONVERSION

#### Kiran P. More, Saurabh P. Supe, Prof. V. N. More

Department of Electronics and Telecommunications, Government College of Engineering, Pune - 411005, India

Email: morekp10.extc@coep.ac.in, supesp10.extc@coep.ac.in, vnm.extc@coep.ac.in

Abstract - The next major advancement in television is stereoscopic three-dimensional television (3D-TV). This paper presents a method for 2D to 3D video conversion based on motion information. The method uses the different pixel-matching relations obtained after motion estimation to generate the depth map of the 2D video.
At the edge of a moving object, the degree of pixel matching is used to judge whether a pixel belongs to the foreground or the background, so that the contour of the moving object becomes more distinct. Furthermore, for pixels that have not been matched correctly, a corresponding approach is given according to the object motion. Results show that the proposed method improves the quality of the 3D video.

Keywords - 3D video, Depth map.

#### I. INTRODUCTION

Three-dimensional television (3D-TV) is the next step in the advancement of television. Stereoscopic images displayed on 3D-TV increase the visual impact and heighten the sense of presence for viewers. A 3D-TV display can also provide multiple stereoscopic views, offering motion parallax as well as stereoscopic information. Two kinds of cues are used for 2D to 3D conversion: one is the motion vector and the second is the edge information. Here, motion vector computing algorithms are used to find the depth of the image; they find the minimum-error block and the corresponding motion vector for each block in the reference frame. In this paper we discuss some of these algorithms, their computational complexity and their efficiency.

Video compression is vital for efficient storage and transmission of digital signals. The hybrid video coding techniques based on predictive and transform coding are adopted by many video coding standards such as ISO MPEG-1/2 and ITU-T H.261/263 [9, 4]. Motion estimation is a predictive technique for exploiting the temporal redundancy between successive frames of a video sequence [6, 7]. Block matching techniques are a widely used motion estimation method.

Section I gives a brief introduction to motion estimation and depth map techniques, and Section II is devoted to the block matching technique and criteria; it discusses some notions and parameters related to motion estimation. Section III contains block matching algorithms and Section IV discusses the method of depth map generation.
It also summarizes the 3D synthesis and results, followed by the conclusion in the next section.

#### II. MOTION ESTIMATION (ME)

The goal of motion estimation is to reduce the total number of bits required for transmission or storage of the frames of a video sequence [1, 2]. The basic premise of motion estimation is that in most cases, consecutive video frames will be similar except for small changes induced by objects moving within the frames. An MPEG encoder exploits temporal and spatial redundancies in consecutive frames. Because two successive frames of a video sequence often have small differences (except at scene changes), the MPEG standard offers a way of reducing this temporal redundancy, as shown in Fig. 1. The figure shows the motion estimation block and the derivation of the motion vector, which helps to reduce the bandwidth of the channel.

Fig 1. MPEG encoder block Diagram

#### A. Block Matching Criteria

Block-matching motion estimation (BMME) is the most widely used motion estimation method for video coding. The current frame is first divided into blocks of $M \times N$ pels. The algorithm then assumes that all the pels within a block undergo the same translational movement. Thus the same motion vector, $d = [d_x, d_y]^T$, is assigned to all the pels in the block. This motion vector is estimated by searching for the best-matching block in a larger search window of $(M + 2d_{mx}) \times (N + 2d_{my})$ pels centered at the same location in a reference frame, where $d_{mx}$ and $d_{my}$ are the maximum allowed motion displacements in the horizontal and vertical directions, respectively.

Inter-frame predictive coding is used to eliminate the large amount of temporal and spatial redundancy that exists in video sequences and helps in compressing them. In conventional predictive coding the difference between the current frame and the predicted frame is coded and transmitted.
The better the prediction, the smaller the error and hence the transmission bit rate. There are a number of criteria to evaluate the "goodness" of a match. The popular matching criteria used for block-based motion estimation are:

a) Mean of squared error (MSE): Considering (k - l) as the past reference frame, where l > 0 for backward motion estimation, the mean square error of a block of N×N pixels computed at a displacement (i, j) in the reference frame is given by:

$$\text{MSE}(i,j) = \frac{1}{N^2} \sum_{n_1=0}^{N-1} \sum_{n_2=0}^{N-1} \left| S(n_1, n_2, k) - S(n_1 + i, n_2 + j, k - l) \right|^2 \qquad (1)$$

where i and j are integers giving the candidate block position. The MSE is computed for each displacement position (i, j). The displacement that minimizes the MSE is the displacement vector, more commonly known as the motion vector, and is given by:

$$[d_1, d_2] = \arg\min_{(i,j)}\{\text{MSE}(i, j)\} \qquad (2)$$

b) SAD criterion: Like the MSE criterion, the sum of absolute differences (SAD) makes the error values positive, but instead of summing the squared differences, the absolute differences are summed. The SAD measure at displacement (i, j) is defined as:

$$\text{SAD}(i,j) = \frac{1}{N^2} \sum_{n_1=0}^{N-1} \sum_{n_2=0}^{N-1} \left| S(n_1, n_2, k) - S(n_1 + i, n_2 + j, k - l) \right| \qquad (3)$$

#### B. Block Size

Another important parameter of the block matching algorithm (BMA) is the block size.

1) A smaller block size achieves better prediction quality and reduces the effect of the accuracy problem. In other words, with a smaller block size, there is less possibility that the block will contain different objects moving in different directions.

2) A larger block size introduces prediction errors.

Considering prediction quality and accuracy together, the block size is chosen to be $8\times8$.
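The two matching criteria defined above can be illustrated with a minimal sketch. It assumes frames are plain lists of rows of luminance values, and that the $1/N^2$ factor multiplies the double sum in both criteria; the function name `block_errors` and its argument names are hypothetical, chosen only for this illustration.

```python
def block_errors(cur, ref, top, left, i, j, n):
    """Return (MSE, SAD) for the n x n block of `cur` whose top-left pel
    is at (top, left), compared with the block of `ref` displaced by
    i rows and j columns. Both values are normalised by n*n, following
    the definitions in the text (an assumption of this sketch)."""
    squared = 0
    absolute = 0
    for n1 in range(n):
        for n2 in range(n):
            diff = cur[top + n1][left + n2] - ref[top + n1 + i][left + n2 + j]
            squared += diff * diff
            absolute += abs(diff)
    return squared / (n * n), absolute / (n * n)


# Tiny example: a 2x2 block compared with a reference block at offset (1, 1).
cur = [[1, 2],
       [3, 4]]
ref = [[0, 0, 0],
       [0, 1, 2],
       [0, 3, 5]]
mse, sad = block_errors(cur, ref, 0, 0, 1, 1, 2)
```

Only one pel differs (4 versus 5), so both measures equal 1/4 here. SAD avoids the multiplication per pel, which is one reason it is often preferred when computation is limited.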
H.263 and MPEG standards allow adaptive switching between block sizes of $16\times16$ and $8\times8$ on a macroblock (MB) basis [4].

#### C. Search Range

The maximum allowed motion displacement ($d_m$), also known as the search range, has a direct impact on both the computational complexity and the prediction quality of the BMA. A small $d_m$ results in poor compensation for fast-moving areas and consequently poor prediction quality. A large $d_m$, on the other hand, results in better prediction quality but leads to an increase in the computational complexity (since there are $(2d_m+1)^2$ possible blocks to be matched in the search window). A larger $d_m$ can also result in longer motion vectors and consequently a slight increase in motion overhead. In general, a maximum allowed displacement of $d_m=\pm 15$ pels is sufficient for low-bit-rate applications.

#### D. Block Matching Techniques

Motion estimation and motion compensation form a predictive technique for exploiting the temporal redundancy between successive frames of a video sequence. Block matching techniques are a widely used motion estimation method to obtain the motion compensated prediction. To represent the motion of each block, a motion vector is defined as the relative displacement between the current candidate block and the best matching block within the search window in the reference frame. By splitting each frame into a number of macroblocks, the motion vector of each macroblock is obtained using the block matching algorithms discussed below.

#### III. MOTION ESTIMATION TECHNIQUE

#### A. Exhaustive Search (ES)

This is the simplest search algorithm [11]. It searches the full search window, so the number of search points is $(2W+1)^2$. Hence this search algorithm is not computationally efficient. It is simple to implement, but it places an unnecessary load on the processor by searching all the blocks within the search window [7]. Hence it is not used in finding the motion vectors.
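The exhaustive search described above can be sketched as follows. This is an illustrative sketch, not the authors' MATLAB implementation: frames are assumed to be lists of rows, the cost is an unnormalised SAD, and candidates falling outside the reference frame are simply skipped. The names `sad`, `full_search`, and the parameter `w` (the maximum displacement W) are assumptions made for this example.

```python
def sad(cur, ref, top, left, dy, dx, n):
    """Unnormalised sum of absolute differences for one candidate."""
    total = 0
    for r in range(n):
        for c in range(n):
            total += abs(cur[top + r][left + c] - ref[top + dy + r][left + dx + c])
    return total


def full_search(cur, ref, top, left, n, w):
    """Test every displacement (dy, dx) with |dy|, |dx| <= w.

    Returns ((dy, dx), cost, checked), where `checked` counts the
    candidates actually evaluated (at most (2*w + 1) ** 2)."""
    height, width = len(ref), len(ref[0])
    best = None
    checked = 0
    for dy in range(-w, w + 1):
        for dx in range(-w, w + 1):
            r0, c0 = top + dy, left + dx
            # Skip candidates that fall outside the reference frame.
            if r0 < 0 or c0 < 0 or r0 + n > height or c0 + n > width:
                continue
            checked += 1
            cost = sad(cur, ref, top, left, dy, dx, n)
            if best is None or cost < best[0]:
                best = (cost, dy, dx)
    return (best[1], best[2]), best[0], checked


# Reference frame: an 8x8 ramp. The current 2x2 block at (2, 2) is a copy
# of the reference block at (3, 4), i.e. a displacement of (1, 2).
ref = [[r * 8 + c for c in range(8)] for r in range(8)]
cur = [[0] * 8 for _ in range(8)]
for r in range(2):
    for c in range(2):
        cur[2 + r][2 + c] = ref[3 + r][4 + c]

mv, cost, checked = full_search(cur, ref, 2, 2, 2, 2)
```

With `w = 2` every one of the $(2W+1)^2 = 25$ candidates lies inside the frame, so all of them are evaluated; this quadratic growth in the number of candidates is the cost that the fast algorithms try to avoid.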
The exhaustive search algorithm, also known as Full Search (FS), is the most computationally expensive block matching algorithm of all. It calculates the cost function at each possible location in the search window. As a result it finds the best possible match and gives the highest PSNR of any block matching algorithm. Fast block matching algorithms try to achieve the same PSNR with as little computation as possible. The obvious disadvantage of FS is that the larger the search window gets, the more computations it requires.

#### B. Three Step Search (TSS)

This search algorithm uses the logarithmic method (Fig. 2). It searches the centre point and the surrounding 8 points [10]. Initially the step distance is W/2. The centre point and the 8 surrounding points at this distance are searched; these are denoted by point 1 in Fig. 2. The best match point is then found and used as the centre for the next search, with the distance reduced to half the previous value. The surrounding 8 points are searched again; these are denoted by point 2 in Fig. 2. The best match among them becomes the centre for the next search, and the distance is again halved, i.e. it becomes 1 in this case. The surrounding 8 points are searched once more, and the best match among them gives the desired minimum error block; the corresponding vector is the motion vector for this block.

#### C. New Three Step Search (NTSS)

The new three step search algorithm (NTSS, Fig. 3) was proposed by Li, Zeng and Liou in 1994 [10]. It is a modified version of the three step search algorithm aimed at video sequences with small motion. To this end, 8 additional neighboring checking points are searched in the first step of NTSS. Fig. 3 shows two search paths with d=7. The center path shows the case of searching small motion.
In this case, the minimum matching point of the first step is one of the 8 neighboring checking points. The search is stopped halfway after matching three more checking points neighboring the first step's minimum matching point. The number of checking points required is (17+3)=20. The upper right path shows the case of searching large motion. In this case, the minimum matching point of the first step is one of the outer eight checking points. The searching procedure then proceeds in the same way as in the TSS algorithm. The number of checking points required in this case is (17+8+8)=33.

#### D. Four Step Search (4SS)

The four step search algorithm (FSS, Fig. 4) was proposed by L. M. Po and W. C. Ma in 1996 [12,15]. This algorithm also exploits the center-biased characteristics of real world video sequences by using a smaller initial step size than TSS. The initial step size is one fourth of the maximum motion displacement d (i.e. d/4). Due to the smaller initial step size, the FSS algorithm needs four searching steps to reach the boundary of a search window with d=7. As in the small motion case of the NTSS algorithm, the FSS algorithm also uses a halfway stop technique in its second and third search steps. Fig. 4 shows two search paths of FSS for searching large motion. The lower left path requires (9+5+3+8)=25 checking points. The upper right path requires (9+5+5+8)=27 checking points, which is the worst case of the algorithm for d=7. Fig. 4 also shows two search paths of FSS for searching small motion. The left path requires (9+8)=17 checking points. The right path requires (9+3+8)=20 checking points. Either three or five checking points are required in the second or third step. Moreover, if the minimum BDM checking point of that step is the center one, the step size is reduced by half and the algorithm directly jumps to the fourth step. If the step size of the fourth step is greater than one, then another four step search is performed with its first step equal to the last step of the previous search. The number of checking points required for the worst case is $[18\log_2((d+1)/4)+9]$.

Fig. 2. Three Step Search diagram. For d = 7, the number of checking points required is (9+8+8)=25. In general, the number of checking points required equals $[1+8\log_2(d+1)]$.

Fig. 3. New three step search algorithm.

#### E. Diamond Search (DS)

The diamond search (Fig. 5) is based on the motion vector (MV) distribution of real world video sequences [16, 17]. This method employs two search patterns. The first, called the Large Diamond Search Pattern (LDSP), comprises nine checking points and forms a diamond shape. The second consists of five checking points that make up a Small Diamond Search Pattern (SDSP). The search starts with the LDSP, which is used repeatedly until the minimum BDM point lies at the search center. The search pattern is then switched to SDSP. The position yielding the minimum error is taken as the final MV. The search process is shown in Fig. 5. DS was adopted by the MPEG-4 verification model (VM) owing to its superiority over other methods in the class of fixed search pattern algorithms.

#### F. Adaptive Rood Pattern Search (ARPS)

The ARPS algorithm [13] (Fig. 6) makes use of the fact that the general motion in a frame is usually coherent, i.e. if the macroblocks around the current macroblock moved in a particular direction, then there is a high probability that the current macroblock will also have a similar motion vector. This algorithm uses the motion vector of the macroblock to its immediate left to predict its own motion vector. In addition to checking the location pointed to by the predicted motion vector, it also checks rood-pattern distributed points at a step size of S = max(|X|, |Y|), where X and Y are the x-coordinate and y-coordinate of the predicted motion vector. This rood pattern search is always the first step.
It directly puts the search in an area where there is a high probability of finding a good matching block. The point that has the least weight becomes the origin for subsequent search steps, and the search pattern is changed to SDSP. The procedure keeps applying SDSP until the least weighted point is found to be at the center of the SDSP. A further small improvement to the algorithm is to check for Zero Motion Prejudgment, whereby the search is stopped halfway if the least weighted point is already at the center of the rood pattern.

The main advantage of this algorithm over DS is that if the predicted motion vector is (0, 0), it does not waste computational time on LDSP; it directly starts using SDSP. Furthermore, if the predicted motion vector is far away from the center, ARPS again saves computation by jumping directly to that vicinity and using SDSP, whereas DS spends time on LDSP. Care has to be taken not to repeat the computations at points that were checked earlier. Care also needs to be taken when the predicted motion vector coincides with one of the rood pattern locations, to avoid double computation at that point. For macroblocks in the first column of the frame, the rood pattern step size is fixed at 2 pixels.

#### IV. METHOD USED FOR DEPTH GENERATION

Another reason for their improvement over DS is the provision of multiple half-step stops. It should be mentioned that out of the three cross search pattern (CSP) based variants, only the New cross search pattern (NCDS) comes close to the performance of ARPS. The others, although an improvement on DS, do not match the performance of ARPS.

#### A. Extraction of motion vector

Variable block size motion estimation is adopted for this scheme. The adjacent frame is chosen as the reference frame. Which type of block is chosen is decided according to the acuteness degree of movement.
This method not only retains the object edge detail, but also reduces the number of wrongly matched blocks in the background. Variable block size motion estimation already extracts relatively accurate motion vectors, but on the object edges in the depth map the proportion of big blocks is still high, resulting in a large sawtooth effect on the object edge. In order to determine whether a pixel of a block on the object edge belongs to the foreground or the background, we analyze the pixel matching relations and give the corresponding processing methods for the different cases. This improves the depth map quality and generates better 3D video.

Fig. 4. Four Step Search procedures.

Fig. 5. Diamond search method.

Fig. 6. Adaptive Rood Pattern.

#### B. Initial depth generation

The object's depth is proportional to its displacement between two adjacent frames (the size of the motion vector). The initial depth generation algorithm is defined by [16]:

$$d_{(i,j)} = \lambda \sqrt{MV(i,j)_x^2 + MV(i,j)_y^2} \qquad (4)$$

where $MV(i,j)_x$ and $MV(i,j)_y$ are the motion vector components along the X axis and Y axis, $d_{(i,j)}$ is the depth value for pixel (i, j), $\sqrt{MV(i,j)_x^2 + MV(i,j)_y^2}$ is the magnitude of the motion vector, and $\lambda$ is the depth adjusting coefficient. The depth value of each pixel can be adjusted by $\lambda$. The depth map is a gray-scale map within the range 0-255. In order to give the 3D video a better parallax effect, we define $\lambda = 255/Max(MV)$, where Max(MV) is the magnitude of the largest motion vector in the frame.

#### C. Depth generation based on pixel matching

In the experiment the motion vectors are extracted from two sequential frames, so they reflect the object's movement. Each pixel of one block shares the same motion vector. Analyzing each pixel and its matching pixel further reveals the matching relations between pixels, which effectively solves the edge erosion problem.
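The initial depth rule of Section B (depth proportional to motion vector magnitude, scaled so that the largest motion maps to 255) can be sketched as below. This is a minimal illustration under stated assumptions: the motion field is a list of rows of `(mv_x, mv_y)` pairs, one per pixel or block, and the function name `initial_depth` is hypothetical. A frame with no motion is mapped to all zeros, a convention added here to avoid dividing by zero.

```python
import math


def initial_depth(mv_field):
    """Map a field of (mv_x, mv_y) motion vectors to 0-255 depth values.

    depth = lam * |MV| with lam = 255 / max |MV| over the frame.
    An all-zero motion field yields an all-zero depth map (assumption)."""
    magnitudes = [[math.hypot(mx, my) for (mx, my) in row] for row in mv_field]
    largest = max(max(row) for row in magnitudes)
    if largest == 0:
        return [[0 for _ in row] for row in magnitudes]
    lam = 255.0 / largest
    return [[int(round(lam * m)) for m in row] for row in magnitudes]


# 2x2 example: the largest vector (3, 4) has magnitude 5, so lam = 51.
field = [[(0, 0), (0, 1)],
         [(0, 2), (3, 4)]]
depth = initial_depth(field)
```

Here the static position gets depth 0 and the fastest-moving position gets 255, so faster-moving regions receive proportionally larger depth values within the frame.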
In the initial depth generation process, the block size is adjusted automatically by the variable block size motion vector extraction, but there are still many big blocks in the edge region. Therefore, a block on the edge of the object contains both foreground pixels and background pixels. This causes a strong sawtooth effect at the edges of the initial depth map. Additionally, because of the object's movement, some pixels cannot find a matching pixel through motion estimation between two adjacent frames. As a result, the corresponding pixel uses the wrong motion vector to generate its depth value. To address these problems, on the edges of the initial depth map we adjust the depth value of each pixel according to the different matching relations of pixels in two adjacent frames, to obtain a better depth map [17]. Of the two adjacent frames, the first is the prediction frame, named A, and the second is the reference frame, named B. The object moves from its position in frame A to its position in frame B, and motion estimation is used to obtain the motion vectors. All pixels in frame A and frame B fall into one of the following four matching relations.

1) Pixels in A and B are in one-to-one correspondence: On the edge of the initial depth map the foreground object is moving and the background is relatively static. For a pixel that belongs to the foreground object, the difference between the pixel and its corresponding pixel in the reference frame is small. If the pixel belongs to the background, the difference between the pixel and the corresponding pixel in the reference frame is large because the background content changes [18]. The luminance difference between every pixel and its corresponding pixel is therefore examined to decide whether the pixel belongs to the foreground or the background. The depth value $D_{ij}$ is:

$$D_{ij} = \begin{cases} \dfrac{d_{(i-blksize,j)} + d_{(i+blksize,j)} + d_{(i,j-blksize)} + d_{(i,j+blksize)}}{4}, & G > SD_Y \\ d_{(i,j)}, & \text{otherwise} \end{cases} \qquad (5)$$

where $d_{(i,j)}$ is the initial depth value for pixel (i,j), G is the pixel luminance difference, and $SD_Y$ is the difference threshold.
If the pixel luminance difference is greater than $SD_Y$, we regard the pixel as a background pixel, and its depth value is determined from the initial depths of four neighboring pixels $d_{(i-blksize,j)}$, $d_{(i+blksize,j)}$, $d_{(i,j-blksize)}$, and $d_{(i,j+blksize)}$. If it is less than the threshold, we judge it to be a foreground pixel, and its depth value is the initial depth value.

$$SD_{Y} = \rho \sqrt{\frac{1}{N} \sum_{i=1}^{H} \sum_{j=1}^{W} \left(Y_{(i,j)} - \bar{Y}\right)^{2}} \qquad (6)$$

where $\bar{Y}$ is the average luminance, N is the number of pixels, H and W are the frame height and width, and $\rho$ is a constant chosen to better separate foreground and background pixels; here $\rho = 0.8$.

2) Multiple pixels in A correspond to the same pixel in B

3) A pixel in A cannot find a corresponding pixel in B

4) A pixel in B cannot find a corresponding pixel in A

#### D. Depth map smoothing

To improve depth map quality several approaches can be chosen, e.g. median filter, neighborhood averaging, morphological filter, etc. Since the depth map contains a few large noise blocks, the median filter and the neighborhood averaging filter may not give good results. A morphological filter is therefore adopted for the proposed 2D to 3D video conversion scheme. The erosion and dilation operations of the morphological filter are used for depth smoothing. The erosion operation erodes the noise around a block, so it can eliminate a whole noise block. The dilation operation fills the holes in the foreground object.

#### E. 3D video synthesis

Finally, the screen parallax value is deduced from the depth value by DIBR, and the left-eye and right-eye views are generated. The result is the 3D video information for each frame; all the frames are then used to reconstruct the 3D video [17].

#### F. Implementation and Results

Seven different algorithms are implemented in MATLAB version 7.14.0.739 (R 2012) on an Intel Core i3 processor. Fig. 7 shows the results of the algorithms on "vipmen.avi". Table 1 and Table 2 show the computation results and the depth map results of the different algorithms.
For the depth map, FS gives the best result compared with the other algorithms, although its computational complexity is higher than theirs. The test set includes video sequences of 176 \* 144. Different images are selected from the reference paper. In Table 2 we compare the depth matrix results of ARPS, NTSS, DS, SES, TSS and 4SS with the Full Search (FS) depth matrix. As the FS method is considered more accurate than the other search methods, the depth maps generated by this method are also more accurate. The other columns in Table 2 indicate the number of depth values which are close to the depth map values generated by FS.

#### G. Conclusion

From Table 2, the 4SS method is found to be closest to FS for the container and mobile video sequences. The other values of 4SS can also be very close if we apply some threshold for depth value matching.

Table 1. Computation point comparison of FS, ARPS, DS, 4SS, NTSS, SES and TSS algorithms

| Video format 176 \* 144 | FS | ARPS | DS | 4SS | NTSS | SES | TSS |
|-------------------------|----------|--------|---------|---------|---------|---------|---------|
| akiyo | 204.2828 | 4.9495 | 12.2071 | 15.8081 | 15.8283 | 17.0909 | 23.2121 |
| coastguard | 204.2828 | 7.62 | 12.9495 | 18.1869 | 18.3586 | 16.4040 | 23.3788 |
| container | 204.2828 | 5.0581 | 12.2247 | 15.9015 | 16.0303 | 17.0606 | 23.2197 |
| hall | 204.2828 | 5.8131 | 12.3081 | 16.2222 | 16.7626 | 16.9141 | 23.2597 |
| mobile | 204.2828 | 5.0455 | 12.2071 | 15.8586 | 15.9369 | 17.0783 | 25.2121 |

Table 2. Depth map result comparing depth matrix of

| Video format 176 \* 144 | FS | ARPS | DS | 4SS | NTSS | SES | TSS |
|-------------------------|-----|------|-----|-----|------|-----|-----|
| akiyo | 396 | 332 | 342 | 341 | 368 | 341 | 365 |
| coastguard | 396 | 331 | 331 | 331 | 331 | 331 | 331 |
| container | 396 | 380 | 380 | 391 | 383 | 382 | 381 |
| hall | 396 | 340 | 340 | 347 | 391 | 354 | 340 |
| mobile | 396 | 380 | 380 | 391 | 383 | 382 | 381 |

#### H. Results

[Figure panels: Frame (T-1); Frame T; Motion vector (Diamond search); Unsmoothed depth map; Smoothed depth map; 3D synthesized result]

#### REFERENCES

- [1] D. B. Brown, "Motion-based foreground segmentation", tech. rep., Stanford University, March 13, 2000.
- [2] S.-J. Kang, K.-R. Cho, and Y. H. Kim, "Motion compensated frame rate up-conversion using extended bilateral motion estimation," IEEE Trans. Consumer Elec., vol. 53, no. 4, pp. 1758-1767, Nov. 2007.
- [3] "Image and Video Compression Standards: Algorithm and Architecture" by Bhaskaran.
- [4] "Video Codec Design" by Richardson.
- [5] O. A. Ojo and G. de Haan, "Robust Motion-Compensated Video Up-Conversion," IEEE Trans. Consumer Electronics, vol. 43, no. 4, pp. 1045-1056, Nov. 1997.
- [6] "Adaptively Weighted Motion Vector-Median Filter based Motion Vector Smoothing" by L. Alparone, M. Barni, 1996 IEEE transaction.
- [7] "Adaptive Median Filter Based Motion Compensation Frame Rate Up Conversion" by Suk-Ju Kang, 2009 IEEE transaction.
- [8] Borko Furht, Joshua Greenberg, Raymond Westwater, "Motion Estimation Algorithms For Video Compression". Massachusetts: Kluwer Academic Publishers, 1997. Ch. 2 & 3.
- [9] M. Ghanbari, Video Coding, An Introduction to Standard Codecs, London: The Institute of Electrical Engineers, 1999. Ch. 2, 5, 6, 7, 8.
- [10] Renxiang Li, Bing Zeng, and Ming L. Liou, "A New Three-Step Search Algorithm for Block Motion Estimation", *IEEE Trans. Circuits And Systems For Video Technology*, vol. 4, no. 4, pp. 438-442, August 1994.
- [11] Jianhua Lu, and Ming L.
Liou, "A Simple and Efficient Search Algorithm for Block-Matching Motion Estimation", *IEEE Trans. Circuits And Systems For Video Technology*, vol. 7, no. 2, pp. 429-433, April 1997.
- [12] Lai-Man Po, and Wing-Chung Ma, "A Novel Four-Step Search Algorithm for Fast Block Motion Estimation", *IEEE Trans. Circuits And Systems For Video Technology*, vol. 6, no. 3, pp. 313-317, June 1996.
- [13] Shan Zhu, and Kai-Kuang Ma, "A New Diamond Search Algorithm for Fast Block-Matching Motion Estimation", *IEEE Trans. Image Processing*, vol. 9, no. 2, pp. 287-290, February 2000.
- [14] Yao Nie, and Kai-Kuang Ma, "Adaptive Rood Pattern Search for Fast Block-Matching Motion Estimation", *IEEE Trans. Image Processing*, vol. 11, no. 12, pp. 1442-1448, December 2002.
- [15] Chun-Ho Cheung, and Lai-Man Po, "A Novel Cross-Diamond Search Algorithm for Fast Block Motion Estimation", *IEEE Trans. Circuits And Systems For Video Technology*, vol. 12, no. 12, pp. 1168-1177, December 2002.
- [16] C. W. Lam, L. M. Po and C. H. Cheung, "A New Cross-Diamond Search Algorithm for Fast Block Matching Motion Estimation", Proceeding of 2003 IEEE International Conference on Neural Networks and Signal Processing, pp. 1262-1265, Dec. 2003, Nanjing, China. Chenglei Wu, Guihua Er, Xudong Xie, Tao Li, Xun Cao, Qionghai Dai, "A Novel Method for Semi-automatic 2D to 3D Video Conversion". 3DTV Conference: The True Vision Capture, Transmission and Display of 3D Video, 2008, pp. 65-68.
- [17] Ideses, L. P. Yaroslavsky, B. Fishbain, "Real-time 2D to 3D video conversion". Journal of Real-Time Image Processing, 2007, 2(1), pp. 3-9.
- [18] C. Fehn, "A 3D-TV Approach Using Depth-Image-Based Rendering (DIBR)". Proceedings of Visualization, Imaging, and Image Processing, Benalmadena, Spain, 2003, pp. 482-487.
- [19] Chenglei Wu, Guihua Er, Xudong Xie, Tao Li, Xun Cao, Qionghai Dai, "A Novel Method for Semi-automatic 2D to 3D Video Conversion". 3DTV Conference: The True Vision Capture, Transmission and Display of 3D Video, 2008, pp. 65-68.
- [20] Feng Xu, Guihua Er, Xudong Xie, Qionghai Dai, "2D-to-3D Conversion Based on Motion and Color Mergence". 3DTV Conference: The True Vision - Capture, Transmission and Display of 3D Video, May 2008, pp. 205-208.
- [21] Ideses, L. P. Yaroslavsky, B. Fishbain, "Real-time 2D to 3D video conversion". Journal of Real-Time Image Processing, 2007, 2(1), pp. 3-9.

### **Enhancement of Switching Time and Power of CMOS Devices**

#### Anmol Mohanty & Chandrakanth Reddy

Dept. of Electronics & ECE, IIT Kharagpur, Kharagpur, WB 721302, India

E-mail: anmolmh@iitkgp.ac.in & chandrakanthc.iitkgp@gmail.com

Abstract - MOSFETs have been used widely in electronic devices as the basic building blocks of processors, controllers, switches, etc. Perhaps the simplest and most common among such a myriad of applications is the CMOS inverter, with its powerful switching characteristics. We present ways to further improve the switching characteristics and the transition region of the CMOS inverter without any significant trade-off losses. This indirectly leads to a MOS device which operates at lower power, as will be shown. The tradeoffs, which do not degrade the device significantly, are presented too. The latest techniques for achieving them are discussed as well.

#### I. INTRODUCTION

Recent advancements in CMOS\* technology have permitted use of the devices in practically every nook and corner of the world. These devices have become so ubiquitous that it is hard to find oneself in an environment bereft of them. As the basic building block of any electronics they have virtually conquered the world. Shrinking at an enormous rate, billions of them now fit into an area as small as a business card. So any small improvement in one facet of the device will have a tremendous effect on mankind due to their sheer numbers.
In this paper we examine methods, some mature and others under active development, to improve the switching delay, the power consumed in switching, and the switching speed. One of the methods we use has the added advantage of helping to reduce crosstalk noise (which is due to the mutual capacitance between the metal lines). Although the NMOS device is regarded as capable of delivering faster switching than CMOS, CMOS is considered here because of its considerably lower power requirements, due to lower currents over the range of input voltage ($V_{\rm in}$) applied to the transistor. We attempt to show that the solution to most of the above problems is the reduction of capacitance, the parasitic capacitance in particular. In this context we particularly refer to an important work done at the University of California, Berkeley [3], which elucidates the use of Vacuum spacer and Corner spacer technologies, which fill the gaps between the gate and the contacts with dielectric materials to reduce the effective parasitic capacitance. We further present how to narrow the transition region in the $V_{out}$ vs. $V_{in}$ characteristic curve of the CMOS inverter, potentially leading to faster switching with a much smaller voltage change, which enables circuits that operate at substantially lower power levels.

#### II. VACUUM SPACER TECHNOLOGY

The capacitance of any system depends primarily on three things:

- 1) Area (A)
- 2) Length (d)
- 3) Dielectric (k)

Clearly, for a given un-scalable device of fixed dimensions the former two cannot be changed, but we can achieve reduced capacitance by decreasing the latter. This can be achieved by using materials of lower dielectric constant.
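For reference, the dependence on these three quantities is captured by the standard parallel-plate relation, given here as an idealised approximation (the actual parasitic capacitance of a MOS structure also includes fringing terms); $\varepsilon_0$, the permittivity of free space, is a symbol introduced only for this illustration:

$$C = \frac{k\,\varepsilon_0\,A}{d}$$

With A and d fixed, C scales linearly with k. For example, replacing a dielectric with k of about 3.9 (a typical value for silicon oxide) by vacuum (k = 1) would reduce this idealised capacitance by roughly a factor of 3.9.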
An ingenious way would be to "trap" vacuum inside a material of suitably low dielectric constant to give a lower effective dielectric constant, and hence reduce the corresponding capacitance, as portrayed in Fig. 0.

Fig. 0: A vacuum technology based MOS device. The red area is the gate. The blue region is the substrate. The white region surrounding the gate shows the implementation of the vacuum technology. The vacuum exists inside the interlayer dielectric (surrounding the gate), which leads to lower capacitance.

Parallel methods exist which use silicon nitride (and similar materials) in "Nitride Spacer" CMOS technologies. Among their many benefits, probably the most important is that they are relatively easier to fabricate than a vacuum based technology. There is, however, a tradeoff involved: in the low power standby mode of the device, degradation of the 'on' current has to be taken into account because of the use of a low-k dielectric material.

Results reported in the reference quoted earlier show that the delay (or effectively the speed) of the linear contact inverter improves by as much as 10% while the inverter switching energy is simultaneously reduced by 25% using the vacuum spacer technology at a fixed V<sub>dd</sub> of 0.76 V. Another observation of interest was that a 43% reduction in power consumption was obtained when using a linear contact inverter with vacuum spacer technology as opposed to a circular contact inverter with oxide spacer (both running at the same speed). This is nearly double the improvement. A comparison of the vacuum spacer and the oxide spacer is shown below, based on results obtained in experiments reported in the reference mentioned before.

Fig. 1: Comparison of delays in Vacuum and Oxide spacer technologies.

#### III.
CORNER SPACER TECHNOLOGY

This is another approach which successfully reduces the capacitance, called the "Corner Technology". It consists of small high-k spacers present only at the gate-S/D (Source/Drain) edges, where they are needed to improve the 'on' and 'off' currents. The larger low-k spacer reduces the gate capacitance for improved speed and energy consumption. Suggested materials for the high-k spacer are silicon nitride or HfO<sub>2</sub>, and the low-k material may be silicon oxide or even vacuum. The figure below shows the corner technology implementation (mono-layer).

Fig. 2: Corner technology based device showing mono-layer technology (inner layer)

The need for this technology came into focus as the long channel transistor became increasingly obsolete. As the gate length was scaled down to dimensions never achieved before, the need arose for gate spacers to form the Lightly Doped Drain (LDD). The gate spacer material usually suggested for this is SiO<sub>2</sub>. An improvement that can be suggested for this technology is to use 'dual spacer technology', which reduces the cell junction leakage current and uses less silicon in the cell array. Dual spacer technology involves two layers of material, i.e. a thin SiO<sub>2</sub> spacer located beside the gate pattern (also referred to as the inner layer) and a thick Si<sub>3</sub>N<sub>4</sub> spacer outside the inner SiO<sub>2</sub> spacer. A tradeoff, however, is that it increases the gate capacitance because of the relatively higher dielectric constant of silicon nitride, which leads to a loss in circuit speed.

## COMPARISON AMONG VARIOUS TECHNOLOGIES

#### 1) For devices in low-power standby mode.
#### Low-Standby Devices

| Spacer | Improvement in corner spacer (speed) | Improvement in corner spacer (power) |
|-----------------|-------|-------|
| Silicon oxide | 19% ↑ | 7% ↑ |
| Silicon nitride | 19% ↑ | 7% ↑ |
| Hafnium oxide | 11% ↑ | 25% |
| Vacuum | 33% ↑ | 2% ↓ |

#### 2) For High Performance Devices.

#### High Performance Devices

| Spacer | Improvement in corner spacer (speed) | Improvement in corner spacer (power) |
|-----------------|-------|-------|
| Silicon oxide | 10% ↑ | 12% ↑ |
| Silicon nitride | 20% ↑ | 24% ↑ |
| Hafnium oxide | 32% ↑ | 43% |
| Vacuum | 4% ↑ | 6% ↓ |

As can be seen, both vacuum and corner technology devices are superior to the others; of the two, the corner device has a higher speed (lower delay) than the vacuum device but also consumes more power. So one may choose between the two based on one's priority.

#### IV. USING THE CHANNEL REGION OF THE MOS

The channel of the MOS is the core driver of the MOS characteristics, and here we attempt to exploit the channel properties, namely channel depth and doping, to improve the transition region of the standard CMOS inverter. In other words, we aim to push the 'slope' of the saturation region of the MOS device to as high a value as possible, enabling efficient switching for relatively smaller signal swings around a smaller switching voltage and thereby leading to substantial power savings.

Fig. 3: A CMOS inverter utilizing wells (below).

Figure 7.10: Schematic of a CMOS inverter as processed on a p-type silicon substrate. The effect of NBTI mainly impacts the p-channel MOSFET (right hand side transistor).

Note about the above figure (Fig. 3): The p-channel MOSFET relies on an n-type substrate. As p-type wafers are commonly used for processing, an additional n-type well implant is necessary.
In this well, which is a deep region of n-type doping, the p-channel MOSFET is placed. As the junction between the p-substrate and the n-well is reverse biased, no significant current flows between these regions and the two transistors are isolated.

For the CMOS inverter shown above, we aim to reduce the "thickness" of the transition region, as shown in the voltage characteristics figure below, and to reduce the input voltage level at which switching occurs.

Fig. 4: Voltage characteristics of the inverter.

Figure 7.14: Voltage transfer characteristics of the CMOS inverter without degradation. The transition from $V_{\rm out} = V_{\rm high}$ to $V_{\rm out} = V_{\rm low}$ is symmetric and very well centered around $V_{\rm high}/2$.

The voltage transfer characteristics of the unstressed inverter can be seen in the figure. The transition from the 'on' to the 'off' state is very well aligned around $V_{\rm dd}/2$. Negative Bias Temperature (NBT) stress has its highest impact on the p-channel MOSFET during low input, $V_{\rm in} = V_{\rm low}$. In this condition the transistor has a gate-to-substrate voltage of approximately $-V_{\rm dd}$. When the circuit is additionally subject to thermal stress, the threshold voltage of the p-channel transistor is degraded. As the n-channel device has a much lower susceptibility to this type of stress, the circuit loses its symmetry and the switching point of the output potential moves to a lower input voltage. An interface trap density corresponding to an already severely damaged interface reduces the switching point by more than 1 V.

Another method to achieve the same effect is to increase the doping concentration of the channel, which leads to faster inversion.

#### V. CONCLUSION

Decreasing the parasitic capacitance improves switching speed and power consumption in CMOS technologies. Depending on the type of device needed, we may choose the appropriate spacer technology. For a high-performance device, i.e.
a device in which high speed is desired, we choose the corner spacer structure. On the other hand, for a low-standby-power device, i.e. a device where speed takes lower priority than power consumption, we can choose the vacuum spacer structure. If, however, both are important, a corner spacer structure with vacuum and silicon oxide is the best option. We have also shown graphically, in the later part of the paper, that improving the saturation region of the voltage characteristics of the inverter gives a device capable of running at much lower power.

#### REFERENCES

- [1] M. Young, The Technical Writer's Handbook. Mill Valley, CA: University Science, 1989.
- [2] M. Togo, et al., "A Gate-side Air-gap Structure (GAS) to Reduce the Parasitic Capacitance in MOSFETs", Symposium on VLSI Technology Dig., p. 38 (1996).
- [3] Jemin Park and Chenming Hu, "Air Spacer MOSFET Technology for 20 nm Node and Beyond", ICSICT, p. 53 (2008).
- [4] Lan Wei, Jie Deng, Li-Wen Chang, Keunwoo Kim, Ching-Te Chuang, and H.-S. P. Wong, "Selective Device Structure Scaling and Parasitic Engineering: A Way to Extend the Technology Roadmap", IEEE Transactions on Electron Devices, Vol. 56, Issue 2 (2009).
- [5] Jemin Park and Chenming Hu, "The Effects of Vacuum Spacer Transistors Between High Performance and Low-Stand-by Power Devices beyond 16nm", ICSICT, p. 1823 (Nov. 2010).

### An Overview of 2D to 3D Conversion Exploiting the Depth Information from Pictorial Cues

#### Saurabh P. Supe, Kiran P. More, Prof. V. N. More

Department of Electronics and Telecommunication Engineering, Govt. College Of Engineering, Pune, 411005

E-mail: supesp10.extc@coep.ac.in, morekp10.extc@coep.ac.in, vnm.extc@coep.ac.in

Abstract - Be it home entertainment, cinema or mobile phones, 3D applications are becoming more popular in our daily lives day by day.
As a result, newer technologies and techniques are required that accelerate production and improve the quality of 3D multimedia while at the same time fulfilling the rising demands of consumers. On screen, a 3D video or image is an illusion of depth perceived by the human eye. Binocular and monocular cues are the two types of depth cues that human beings exploit to perceive the world in three dimensions. In this paper we give an overview of 2D to 3D conversion based on the extraction of scene depth information, i.e. on converting the monocular depth cues contained in video sequences into quantitative depth values of a captured scene. The need for 2D to 3D conversion and certain related aspects are discussed in the introduction in section one. The second section comprises a detailed discussion of some recent techniques employed for depth extraction from a single image using pictorial cues. Stereoscopic or multi-view generation by depth image based rendering using depth maps is discussed in the third section. The fourth section conveys the summary and outlook.

#### I. INTRODUCTION

A 3D (three-dimensional) video or stereoscopic video is a motion picture that enhances the illusion of depth perception. Depth perception is the ability to see the world in three dimensions and to perceive distance. The images projected on each retina of a human eye are two dimensional. From these flat images, we construct a vivid three-dimensional world. To perceive depth, the human brain depends on two main sources of information: monocular cues and binocular disparity. Monocular cues are cues to depth that are effective when viewed with only one eye; they include interposition, atmospheric perspective, texture gradient, linear perspective, size cues, height cues, and motion parallax [1]. Our eyes are spaced apart. The left and right retinas receive slightly different images.
This difference in the left and right images is called binocular disparity. The brain integrates these two images into a single image, allowing us to perceive depth and distance, and thus adds a third dimension to the image. Videos and images having this third dimension have triggered a rapidly rising consumer demand in the market.

To meet the initial rising demand for 3D video content, it is unrealistic to rely only on the production of new 3D videos. This necessitates the development of an efficient 2D-to-3D conversion system, which would cut down the cost of 3D content creation and allow consumers to enjoy their conventional DVD or Blu-Ray content in stereoscopic 3D. The most fundamental 2D-to-3D conversion technique follows the principle of binocular disparity: the original image is used as the left-eye view, and a new image is generated as the right-eye view by horizontally shifting local regions of the original image using a cut-and-paste process. With this method, stereoscopic depth can be created, and any artifact in the new image tends to be masked by the higher picture quality of the original image presented to the left eye [26]. However, more laborious techniques are needed to deal with images that have multiple small objects, large areas with low texture, and gentle gradations of depth.

A better solution for 2D-to-3D video conversion requires depth map estimation. A depth map is an 8-bit grey scale image in which grey level 0 indicates the furthest distance from the camera and grey level 255 indicates the nearest distance. If depth information can be derived from the original 2D video sequence, then 3D content can be generated in the format of stereo videos via a process known as depth image based rendering [3-4]. The extraction of scene depth information aims to convert monocular depth cues contained in video sequences into quantitative depth values of a captured scene.
These values can be obtained by using the pictorial cues and motion cues in a picture. In this paper we concentrate on pictorial cues, as pictorial depth cues are the elements in an image that allow us to perceive depth in a 2D representation of the scene.

The generation of depth information from pictorial cues embedded in an image can be subdivided into two approaches: real and artificial depth information. Here, "real" signifies the relative depths between objects in the scene. The second approach creates artificial depth information by exploiting pictorial cues that are commonly found in all scenes. There are three categories of pictorial cues commonly used to extract depth information: depth from focus/defocus (blur), depth from geometric cues, and depth from color and intensity cues.

Accommodation is the mechanism the human eye uses to focus on a given plane in depth. Real aperture cameras do likewise by focusing on a given plane. In practice this blurs the rest of the scene by an amount that depends on the distance to the focusing plane of the optics. This mechanism can be exploited to generate depth information from captured images that contain a focused plane and objects outside the focused plane. This concept is known as depth-from-focus/defocus, and it is one of the first mechanisms to have been employed to recover depth from single images [27][28].

Depth from geometric cues is an interesting approach to obtaining depth from a 2D image. The geometry-related pictorial depth cues are linear perspective, known size, relative size, height in picture, interposition, and texture gradient.

Variations in the amount of light arriving at the eye can also provide information about the depth of objects. This type of variation is reflected in captured images as variations of intensity or changes in color.
Some recent and effective techniques that exploit pictorial cues for depth extraction from a single image are discussed below.

### II. DEPTH EXTRACTION USING PICTORIAL CUES

Pictorial cues provide vital information from which a depth map of a single image can be computed. These cues are broadly classified into three categories: depth from focus/defocus (blur), depth from geometric cues, and depth from color and intensity cues.

#### A. Depth extraction using defocus / blur level feature

When an image is in focus, knowledge of the camera parameters can be used to estimate the depth of the object point. When the image is defocused, the structure can be recovered through an estimation of the defocus blur. In conventional depth-from-focus or depth-from-defocus methods, the camera takes multiple images from a fixed viewpoint with different focal lengths. Some of these focal lengths yield the best-focused image or a defocus value, which can then be used to calculate the object distance. In the focus method, images of the object at different distances are used: the appearance of the object at the in-focus distance is compared with its appearance at out-of-focus distances, where it is blurred, so that the relative depth of objects can be estimated. Subsequently, using the estimate of the blur and knowledge of the lens parameters, one can recover the depth information in the scene.

Here we discuss a method for obtaining depth from the defocus blur of objects located at an unknown distance from the camera. The relationship between depth and defocus blur depends on the camera focus range and on the blur observed in the image. With fixed camera parameters and the blur spot measured for an object at unknown distance, the depth of an object at any position can be calculated using a depth-blurring function. Thus, this method is capable of estimating depth from a single defocused image taken by the camera.
For any point on the object, one corresponding point is recorded on the image plane. If the focus is properly adjusted for that object point, the point is imaged sharply. If the point is not in focus, its image is formed in front of or behind the sensor and the point appears as a circle: the greater the distance of the formed image point from the image plane, the larger the diameter of the blur circle, or circle of confusion.

For a camera lens model with focal length f, the relationship between the position of a point in the scene and the corresponding focused position in the image is given by the well-known lens formula:

$$\frac{1}{f} = \frac{1}{p} + \frac{1}{q} \tag{1}$$

where p is the distance of the object A from the lens on one side, and q is the distance of the image plane from the lens on the other side (Fig. 1). If we consider an object point B at a distance z from the lens, then Eq. (1) can be rewritten as:

$$\frac{1}{f} = \frac{1}{z} + \frac{1}{z'} \tag{2}$$

where z' is the distance of the virtual focus position from the lens. The corresponding image point of B is modeled as a *blur circle*. From Eq. (2) and the relation

$$\frac{d}{D} = \frac{q - z'}{z'} \tag{3}$$

the diameter d of the blur circle is given by:

$$d = Dq \left( \frac{1}{f} - \frac{1}{z} - \frac{1}{q} \right) \tag{4}$$

where D is the diameter of the lens and q is the distance from the lens to the image plane. Since the distance q is generally unknown, substituting q from Eq. (1) into Eq. (4) yields:

$$d = \frac{Dpf}{p - f} \left( \frac{1}{p} - \frac{1}{z} \right) \tag{5}$$

Fig. Imaging model for defocus blur in camera image plane.

That is, the size of the blur circle for any scene point located at a distance z from the lens can be calculated by Eq. (5). It is clear that the blur circle diameter d depends only on the depth z if fixed camera settings D, f and p are given.
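As a minimal numerical sketch of Eq. (5), the snippet below evaluates the blur circle diameter for a few object distances and inverts the relation to recover depth. The function names and the camera values (f = 50 mm, D = 25 mm, p = 500 mm) are illustrative assumptions, not values taken from the method discussed here.

```python
def blur_circle_diameter(D, f, p, z):
    """Signed blur circle diameter d according to Eq. (5).

    D: lens diameter, f: focal length, p: distance of the focused plane,
    z: distance of the scene point. All lengths share one unit (e.g. mm),
    and p must differ from f.
    """
    return (D * p * f / (p - f)) * (1.0 / p - 1.0 / z)


def depth_from_blur(d, D, f, p):
    """Invert Eq. (5) for z, given a signed blur diameter d."""
    k = D * p * f / (p - f)  # constant factor in front of Eq. (5)
    return 1.0 / (1.0 / p - d / k)


# Hypothetical camera settings (assumed for illustration only).
D, f, p = 25.0, 50.0, 500.0
for z in (500.0, 1000.0, 1500.0, 2000.0):
    d = blur_circle_diameter(D, f, p, z)
    print(f"z = {z:6.0f} mm -> d = {d:.4f} mm")
```

With p > f, the value returned by Eq. (5) is zero at z = p, positive for points beyond the focused plane and negative for points in front of it. A measured blur size is unsigned, so in this sketch the side of the focused plane on which the object lies has to be known before `depth_from_blur` can be applied.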
Considering these fixed camera parameters, Eq. (5) can easily be rewritten in terms of the constant

$$c = \frac{Df}{p - f} \tag{6}$$

as

$$d = c - \frac{cp}{z} \tag{7}$$

From these equations, the blur circle size is seen to vary linearly with the inverse of the object distance, 1/z. Moreover, the closer the object distance z is to the focused distance p, the smaller the diameter of the blur circle. The information required for Eq. (6) can be obtained directly from the camera used; in particular, the lens diameter D follows from the f-number:

$$f\# = f/D \tag{8}$$

With fixed internal camera parameters, the amount of blurring of an object is directly related to the distance between the object and the camera, i.e. the object depth, as can be seen in the sample images below.

Fig. Object image samples with fixed camera parameters (focal length is considered 500mm) at different distances from the camera: a) distance is 500 mm, b) distance is 1000 mm, c) distance is 1500 mm, d) distance is 2000 mm.

#### B. Depth Map Extraction Using Geometric and Texture Cues

Here we discuss a framework that estimates a depth map based on both geometric and texture cues extracted from a single image. The geometric cue, computed from the extracted lines with respect to perspective geometry, generates an initial depth map; the texture cue, obtained from image intensity based segmentation, is used to refine the initial depth map into a final depth map for 2D to 3D image conversion.

As shown in Fig., a single image is used as the input source of two different process chains: one extracts the geometric cue, i.e. the vanishing point, based on line features, and the other obtains the texture cue from intensity based segmentation, i.e. superpixels [29].

The first stage of acquiring the geometric cue is to obtain lines in the input image by applying the Hough transform [6]. As proposed in [7], the vanishing point is estimated by Eq. (9) in polar space.
The vanishing point plays a critical role in estimating the perspective geometry that gives rise to human depth perception [8].

$$\min_{x_0,y_0} \sum_{i=1}^{N} w_i (\rho_i - x_0 \cos\theta_i - y_0 \sin\theta_i)^2 \tag{9}$$

From the perspective geometrical view, we assume that the vanishing point is the most distant point in the input image. After obtaining the estimated vanishing point, we generate an initial depth map based on a Gaussian distribution with the estimated vanishing point as its mean.

Fig. System for depth map generation using geometric and texture cues.

Among several segmentation techniques, we apply graph-based segmentation [9] to obtain the texture cue. Each segment is likely to represent one single object in the image. We can generally assume that abrupt depth changes do not occur within a segment but do occur between different segments.

Finally, the framework combines the geometric cue with the texture cue by assigning depth values, based on the initial depth map, so that they are as uniform as possible within each segment. Experimentation shows that the vanishing point can lie outside the image; in [36], for example, the computed vanishing point is located outside the image, at (-396, 426). Fig. shows the extracted lines for estimating the vanishing point from Eq. (9). The graph-based segmentation is depicted in Fig. 2 (c) and the refined depth map is shown in Fig. 2 (d).

#### C. Depth map using edge information

An edge in an image has a high probability of being an edge in the depth map. If pixels at similar intensity levels are grouped together, a relative depth value can be assigned to each grouped region. **Fig.** shows a conversion system using edge information.

The algorithm first chooses an effective grouping method such that all the pixels of a block-based image lying in a certain group have similar colors and spatial locality. Initially, a block-based image is divided into a number of macroblocks of size 16-by-16.
A single macroblock consists of 16 4-by-4 nodes. As seen from **Fig.**, each node is a 4-by-4 pixel block, and each node is four-connected. The value of each link is assigned as the absolute difference of the means of the neighboring blocks:

$$Diff(a,b) = |Mean(a) - Mean(b)| \tag{10}$$

where a and b denote two neighboring blocks and Mean(a) represents the mean color of a. This value measures the similarity strength of neighboring blocks: a smaller value implies a higher similarity between the two blocks. The blocks are then segmented into multiple groups by minimum spanning tree (MST) segmentation. The links of stronger edges are removed to generate multiple grouped regions. Thus the image is segmented into multiple groups, each having a distinct intensity or color mean. The MST algorithm [12] identifies the coherence among the blocks using both the color difference and the connectivity, without generating many small groups.

Fig. 2D to 3D Conversion system using depth information.

Fig. Block based region grouping.

Fig. Depth map gradients left, left-down, bottom-up, right-down and right.

After the block groups are generated, the corresponding depth is assigned by the hypothesized depth gradient (**Fig. 5**) based on the linear perspective information. Whenever a scene change is detected, the linear perspective of the scene is analyzed by a line detection algorithm using the Hough transform [9]. The depth value of a given block group R is assigned by:

$$Depth(R) = 128 + \frac{255 \left\{ \sum_{pixel(x,y) \in R} \left( W_{rl} \frac{x - width/2}{width} + W_{ud} \frac{y - height/2}{height} \right) \right\}}{pixel\_num(R)} \tag{11}$$

where $|W_{rl}| + |W_{ud}| = 1$. The above equation assigns the depth value at the center of gravity of the block group, which is why all pixels of a block group share the same depth.

The depth map generated by block-based region grouping contains blocky artifacts.
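The depth assignment for a block group can be sketched in a few lines. This is only an illustration of the formula for Depth(R) above: the function name, the pixel-list representation of a group and the example image size are assumptions, and no clipping to the 8-bit range is applied.

```python
def group_depth(pixels, width, height, w_rl, w_ud):
    """Depth value of one block group R from the formula for Depth(R).

    pixels: list of (x, y) coordinates belonging to the group,
    width, height: image size in pixels,
    w_rl, w_ud: right-left and up-down gradient weights, |w_rl| + |w_ud| = 1.
    """
    if abs(abs(w_rl) + abs(w_ud) - 1.0) > 1e-9:
        raise ValueError("weights must satisfy |w_rl| + |w_ud| = 1")
    total = sum(
        w_rl * (x - width / 2) / width + w_ud * (y - height / 2) / height
        for x, y in pixels
    )
    return 128 + 255 * total / len(pixels)


# Hypothetical 100 x 100 image with a pure bottom-up gradient (w_ud = 1).
centre_group = [(50, 50)]
lower_group = [(40, 75), (60, 75)]
print(group_depth(centre_group, 100, 100, 0.0, 1.0))  # group at the image centre
print(group_depth(lower_group, 100, 100, 0.0, 1.0))   # group in the lower half
```

Because the sum is divided by the number of pixels, the result depends only on the centroid of the group, which matches the center-of-gravity interpretation given in the text.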
Here, the blocky artifacts are removed by using the cross bilateral filter [7][8], as expressed in the following equations:

$$Depth_f(x_i) = \frac{1}{N(x_i)} \sum_{x_j \in \Omega(x_i)} e^{-0.5 \left(\frac{|x_j - x_i|^2}{\sigma_s^2} + \frac{|u(x_j) - u(x_i)|^2}{\sigma_r^2}\right)} Depth(x_j) \tag{12}$$

$$N(x_i) = \sum_{x_j \in \Omega(x_i)} e^{-0.5 \left(\frac{|x_j - x_i|^2}{\sigma_s^2} + \frac{|u(x_j) - u(x_i)|^2}{\sigma_r^2}\right)} \tag{13}$$

where $u(x_i)$ denotes the intensity value of the pixel $x_i$, $\Omega(x_i)$ represents the neighboring pixels of $x_i$, $N(x_i)$ is the normalization factor of the filter coefficients, and $Depth_f$ is the filtered depth map. The window size depends on the block size configured in the block-based region grouping stage. The cross bilateral filter smooths the depth map properly while preserving the object boundaries, and a refined depth map is obtained as shown in **Fig.**

### III. STEREOSCOPIC GENERATION FROM DEPTH MAPS

We have studied different techniques for extracting depth information for every pixel of an image and thus obtaining a depth map, in which the depth information is coded as a luminance intensity level, usually with lighter values for closer distances and darker values for farther distances. Given the original image and its computed depth map, we can generate a 3D or stereoscopic view using the depth image based rendering (DIBR) technique. As shown in **Figure 1**, left-eye and right-eye images at virtual camera positions *cl* and *cr* can be generated for a specific camera baseline *t* if the focal length *f* and the depth *Z* from the depth map are known [3].
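The pixel shift used by DIBR, stated as Eq. (14), can be sketched as follows. The function name and the numerical values are hypothetical, and the sketch assumes that f is expressed in pixels and that t and Z share the same length unit, so that the shift comes out in pixels.

```python
def stereo_positions(x_c, t, f, Z):
    """Left-eye and right-eye horizontal positions of a pixel, as in Eq. (14).

    x_c: horizontal position in the original (centre) view,
    t: camera baseline, f: focal length, Z: depth of the pixel (Z > 0).
    """
    shift = (t / 2.0) * (f / Z)
    return x_c + shift, x_c - shift


# Hypothetical values: nearer pixels (smaller Z) are shifted further apart.
for Z in (300.0, 600.0, 1200.0):
    x_l, x_r = stereo_positions(100.0, 6.0, 50.0, Z)
    print(f"Z = {Z:6.0f}: x_l = {x_l:.3f}, x_r = {x_r:.3f}, disparity = {x_l - x_r:.3f}")
```

The disparity between the two views equals t·f/Z, so it shrinks as the depth grows. A depth map stored as grey levels would first have to be converted to a depth Z; that mapping is not specified here and is left out of the sketch.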
The geometrical relationship shown in **Figure 1** can be expressed mathematically as in Eq. (14), from which the extent of pixel shifting can be computed:

$$x_l = x_c + \frac{t}{2} \frac{f}{Z}, \quad x_r = x_c - \frac{t}{2} \frac{f}{Z} \tag{14}$$

DIBR is particularly useful for multi-view stereoscopic systems, which typically require between eight and sixteen views of a visual scene. There has also been great interest in the use of DIBR for the development of a practical 3D-TV system, because the method allows for efficient transmission and storage [4][5].

#### IV. SUMMARY AND OUTLOOK

The fundamental principle of converting from 2D to 3D is to horizontally shift the pixels of an original image to generate a new (left or right) version of it, using the original image and its depth map. This paper summarizes the importance of pictorial cues and discusses some recent techniques employed for depth map generation using these cues. Edges are important features that indicate the shape of the objects in an image or video. Thus the study and analysis of object edges plays a vital role in extracting a depth map from an image using its monocular cues.

#### REFERENCES

- [1] Liang Zhang, Carlos Vázquez and Sebastian Knorr, "3D-TV Content Creation: Automatic 2D-to-3D Video Conversion", IEEE Transactions on Broadcasting, vol. 57, no. 2, June 2011.
- [2] Karsten Müller, Philipp Merkle, and Thomas Wiegand, "3-D Video Representation Using Depth Maps", Proceedings of the IEEE, vol. 99, issue 4, pp. 643-656, Apr. 2011.
- [3] Aljoscha Smolic, Peter Kauff, Sebastian Knorr, Alexander Hornung, Matthias Kunter, Marcus Müller, and Manuel Lang, "Three Dimensional Video post production and processing", Proceedings of the IEEE, vol. 99, issue 4, pp. 607-625, Apr. 2011.
- [4] W. J. Tam, A. Soung Yee, J. Ferreira, S. Tariq, F. Speranza, "Stereoscopic image rendering based on depth maps created from blur and edge information," Proc. of Stereoscopic Displays and Applications XII, Vol.
5664, pp. 104-115, 2005.
- [5] S. H. Lai, C. W. Fu, S. Chang, "A generalized depth estimation algorithm with a single image," PAMI, Vol. 14(4), pp. 405-411, 1992.
- [6] M. T. Pourazad, P. Nasiopoulos, and R. K. Ward, "Generating the Depth Map from the Motion Information of H.264-Encoded 2D Video Sequence," EURASIP Journal on Image and Video Processing, vol. 2010, Article ID 108584, 13 pages, 2010.
- [7] A. Koschan and M. Abidi, "Detection and classification of edges in color images", IEEE Signal Processing Magazine, vol. 22, issue 1, pp. 64-73, Jan. 2005.
- [8] J. F. Canny, "A computational approach to edge detection", IEEE Trans. Pattern Analysis and Machine Intelligence, 8 (6), pp. 679-698, 1986.
- [9] D. Kim, D. Min, and K. Sohn, "A Stereoscopic Video Generation Method Using Stereoscopic Display Characterization and Motion Analysis," in IEEE Trans. on Broadcasting, Vol. 54, Issue 2, pp. 188-197, 2008.
- [10] Y.-L. Chang, et al., "Depth map generation for 2D-to-3D conversion by short-term motion assisted color segmentation," in Proc. ICME, 2007.
- [11] Chao-Chung Cheng, Chung-Te Li and Liang-Gee Chen, "A Novel 2D to 3D Conversion System Using Edge Information", IEEE Transactions on Consumer Electronics, vol. 56, issue 3, pp. 1739-1745, Oct. 2010.
- [12] G. Economou, V. Pothos and A. Ifantis, "Geodesic distance and MST based image segmentation", in Proc. European Signal Processing Conf., 2004.
- [13] ITU-R Recommendation BT.500-10, (2000) "Methodology for the subjective assessment of the quality of television pictures."
- [14] C.-C. Cheng, C.-T. Li, P.-S. Huang, T.-K. Lin, Y.-M. Tsai, and L.-G. Chen, "A block-based 2D-to-3D conversion system with bilateral filter," in Proc. IEEE Int. Conf. Consumer Electronics, 2009.
- [15] C.-C. Cheng, C.-T. Li, and L.-G. Chen, "A Novel 2D-to-3D conversion system using edge information," in Proc. IEEE Int. Conf. Consumer Electronics, 2010.
- [16] S. Paris and F.
Durand, "A fast approximation of the bilateral filter using a signal processing approach," in MIT Technical Report (MIT-CSAIL-TR-2006-073), 2006.
- [17] C. Fehn, "Depth-image-based rendering (DIBR), compression and transmission for a new approach on 3D-TV," in SPIE Conf. Stereoscopic Displays Virtual Reality Syst. XI, CA, Jan. 2004, vol. 5291, pp. 93–104.
- [18] K. Luo, D.-X. Li, Y.-M. Feng, and M. Zhang, "Depth-aided inpainting for disocclusion restoration of multi-view images using depth-image-based rendering," J. Zhejiang Univ. – Sci. A, vol. 10, no. 12, pp. 1738–1749, Dec. 2009.
- [19] G. Zhang, J. Jia, T. T. Wong, and H. Bao, "Consistent depth maps recovery from a video sequence," IEEE Trans. Pattern Anal. Mach. Intell., vol. 31, no. 6, pp. 974–988, 2009.
- [20] J. Kim, A. Baik, Y. J. Jung, and D. Park, "2D-to-3D conversion by using visual attention analysis," in Proc. SPIE 7524, Feb. 2010, 752412.
- [21] S. Knorr, E. İmre, A. A. Alatan, and T. Sikora, "A geometric segmentation approach for the 3D reconstruction of dynamic scenes in 2D video sequences," in EUSIPCO, Florence, Italy, Sep. 2006.
- [22] X. Huang, L. Wang, J. Huang, D. Li, and M. Zhang, "A depth extraction method based on motion and geometry for 2D to 3D conversion," in 3<sup>rd</sup> Int. Symp. Intell. Inf. Technol. Appl., 2009, pp. 294–298.
- [23] C. Vázquez, W. J. Tam, and F. Speranza, "Stereoscopic imaging: Filling disoccluded areas in depth image-based rendering," in SPIE Conf. 3–D TV, Video, Display V, 2006, vol. 6392, 63920D.
- [24] L. J. Angot, W. J. Huang, and K. C. Liu, "A 2D to 3D video and image conversion technique based on a bilateral filter," in Proc. IS&T/SPIE Electron. Imaging, 2010, vol. 7526, pp. 75260D1–75260D10.
- [25] C. Ballester, M. Bertalmio, V. Caselles, G. Sapiro, and J. Verdera, "Filling-in by joint interpolation of vector fields and gray levels," IEEE Transactions on Image Processing, vol. 10, pp. 1200-1211, 2001.
- [26] L. B. Stelmach, W. J. Tam, D. Meegan, & A.
Vincent, "Stereo image quality: Effects of mixed spatio-temporal resolution", IEEE Transactions on Circuits and Systems for Video Technology, Vol. 10(2), pp. 188-193, 2000.
- [27] J. Ens and P. Lawrence, "An investigation of methods of determining depth from focus," IEEE Trans. Pattern Anal. Mach. Intell., vol. 15, no. 2, pp. 523–531, 1993.
- [28] J. Ko, M. Kim, and C. Kim, "2D-To-3D stereoscopic conversion: depth-map estimation in a 2D single-view image," in Proc. SPIE, 2007, vol. 6696.
- [29] X. Ren and J. Malik, "Learning a classification model for segmentation", 9th Int. Conf. Computer Vision, Vol. 1, 2003.
- [30] Tayebeh Rajabzadeh, Abedin Vahedian, Hamidreza Pourreza, "Static Object Depth Estimation Using Defocus Blur Levels Features", 978-1-4244-3709-2/10/2010 IEEE.
- [31] P. V. Hough, "Methods and means to recognize complex patterns," U.S. Patent 3,069,654, 1962.
- [32] V. Cantoni, L. Lombardi, M. Potra, and N. Sicard, "Vanishing point detection: Representation analysis and new approaches," 11th Int. Conf. on Image Analysis and Processing, 2001.
- [33] R. Hartley and A. Zisserman, Multiple View Geometry in Computer Vision. Cambridge University Press, 2000.
- [34] C.-C. Cheng, C.-T. Li, P.-S. Huang, T.-K. Lin, Y.-M. Tsai, and L.-G. Chen, "A block-based 2D-to-3D conversion system with bilateral filter," in Proc. IEEE Int. Conf. Consumer Electronics, 2009.
- [35] P. F. Felzenszwalb and D. P. Huttenlocher, "Efficient Graph-based Image Segmentation," Int. J. of Computer Vision, vol. 59, Sep. 2004.
- [36] Kyuseo Han and Kihyun Hong, "Geometric and Texture Cue Based Depth-map Estimation for 2D to 3D Image Conversion", IEEE International Conference on Consumer Electronics, 978-1-4244-8712-7/11, IEEE 2011.
- [37] L. Zhang & W. J. Tam, "Stereoscopic image generation based on depth images for 3D TV," IEEE Transactions on Broadcasting, Vol. 51, pp. 191-199, 2005.
- [38] P. Harman, "Home based 3D entertainment - An overview," IEEE Conference on Image Processing, Vol. 1, pp.
1-4, 2000.
- [39] C. Fehn, "Depth-image-based rendering (DIBR), compression and transmission for a new approach on 3D-TV," Stereoscopic Displays and Virtual Reality Systems XI, Vol. 5291, pp. 93-104, 2004.

# Design and simulation of circularly polarized compact Microstrip patch antenna for ISM-band LAN application

# Priyanka & Navin Srivastava

BVU, College Of Engineering, Dhanakwadi, Pune, India-411043

E-mail: Priyanka 2 2@yahoo.co.in, nksrivastava@bvucoep.edu.in

Abstract - A probe-fed, slotted hexagonal patch antenna is proposed. The bandwidth is enhanced by suitably cutting slots into the hexagonal patch. The proposed antenna is suitable for various telecom, LAN and Wi-Fi applications in the ISM band. It is demonstrated that the proposed antenna exhibits resonance in the ISM band and a peak gain of 6 dBi. The antenna structure is described and simulated results are presented.

Keywords: Microstrip patch antenna, Bandwidth enhancement, Gain, Dielectric substrate, Simulation

#### I. INTRODUCTION

In the rapidly expanding market for wireless communication and applications, the microstrip antenna has become widely popular because it is low profile, conformable to the hosting surface, light weight, and easily integrated with electronic circuits. Microstrip antennas are widely used in military systems, mobile communication, the global positioning system (GPS), remote sensing, etc.

Taking advantage of the added processing power of today's computers, simulators such as HFSS (High Frequency Structure Simulator) have emerged to perform planar and 3D analysis of high frequency structures. HFSS has long been an essential modeling tool for RF/microwave design. The proposed antenna is designed and simulated in the HFSS simulator software.

A microstrip patch antenna in general consists of a radiating conducting patch printed on a grounded dielectric substrate. The patch is a very thin metal disk.
To overcome its limitation of narrow bandwidth by generating more than one resonant frequencies, many techniques have been suggested in the past e.g. different shaped slots[2-4],stack, multilayer[6],two folded parts to the main radiated patch and use of air substrate have been proposed and investigated. In the design presented in this paper slotting of the radiating patch has been used because as compared to the other techniques slotting offers the promise of saving space while giving good performance if done appropriately. The advantages of microstrip antenna have made them a perfect candidate for use in the wireless local area network (WLAN) applications. Though bound by certain disadvantages microstrip patch antenna can be tailored so they can be used in the new high speed broadband WLAN system. This paper concentrates on manufacture of broadband micro strip patch antennas for 2.4GHz ISM –band. It is now both possible and affordable to surf the web from your laptop without any wire connectivity and while enjoying cricket match on your television. A WLAN is a flexible data communication network used as an extension to or an alternative for a wired LAN in a building. As a result the demand has been increased for broadband WLAN antenna that meets all the desired requirements. The broadband antenna are required to be compact, low profile directive for high transmission efficiency and designed to be discreet, due to these well met requirements couple with the ease of manufacture and repeatability makes the micro strip patch antennas very well suited for broadband wireless applications. #### II. ANTENNA GEOMETRY AND DESIGN PROCEDURE A microstrip patch antenna is a radiating patch on one side of a dielectric substrate which has a ground plane underside (fig1). The EM wave fringe off the top patch of the substrate, reflecting off the ground plane and radiates out into the air. Radiation occurs mostly due to the fringing field between the patch and ground. 
Fig. 1: Microstrip patch antenna

The radiation efficiency of the patch antenna depends largely on the permittivity of the dielectric ($\epsilon$). Ideally, a thick dielectric with low $\epsilon$ and low insertion loss is preferred for broadband operation and increased efficiency. As shown in Fig. 2, the antenna has a hexagonal patch structure. The dielectric chosen is an FR4-epoxy substrate with a relative permittivity of 4.4 and a thickness of 1.53 mm. The dimensions of the patch are approximated using the basic design approach for a microstrip patch antenna, as listed below.

WIDTH OF PATCH

$$W = \frac{c}{2f_0\sqrt{\frac{\varepsilon_r + 1}{2}}}$$

EFFECTIVE DIELECTRIC CONSTANT

$$\varepsilon_{\mathrm{reff}} = \frac{\varepsilon_r + 1}{2} + \frac{\varepsilon_r - 1}{2} \left[1 + 12\frac{h}{W}\right]^{-\frac{1}{2}}$$

EFFECTIVE LENGTH

$$L_{\mathrm{eff}} = \frac{c}{2f_0\sqrt{\varepsilon_{\mathrm{reff}}}}$$

LENGTH EXTENSION

$$\Delta L = 0.412\,h\,\frac{\left(\varepsilon_{\mathrm{reff}} + 0.3\right)\left(\frac{W}{h} + 0.264\right)}{\left(\varepsilon_{\mathrm{reff}} - 0.258\right)\left(\frac{W}{h} + 0.8\right)}$$

ACTUAL LENGTH OF PATCH

$$L = L_{\mathrm{eff}} - 2\Delta L$$

Where: $c = 3\times10^{8}$ m/s (speed of light in free space), $f_0$ = design (resonant) frequency, $h$ = height of substrate, $\varepsilon_r$ = dielectric constant of the substrate.

In the proposed antenna geometry, two slots of length l and width d are cut into the top of the hexagonal patch. The slotted antenna is excited by a coaxial probe feed situated at 45 degrees in the x-y plane.

Fig. 2: Main design with hexagonal patch

#### III SIMULATION SETUP

The proposed antenna has been modeled in the HFSS 11 simulator. My primary use of the Ansoft HFSS software is to design and simulate electrically small antennas. The patch antenna is built from three bricks: the first for the radiating plate, the second for the coaxial probe feed, and the third for the substrate. The ground plane is specified by a perfect electrically conducting boundary condition.
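The design equations of Section II can be checked numerically. The following is a minimal Python sketch, assuming a plain rectangular patch (the hexagonal patch is only approximated by these formulas, as stated above). The function name `patch_dimensions` and the 2.4 GHz example frequency are illustrative assumptions; the substrate values are the FR4 figures quoted in Section II.

```python
import math

C = 3.0e8  # speed of light in free space, m/s


def patch_dimensions(f0_hz, eps_r, h_m):
    """Evaluate the Section II design equations for a rectangular patch.

    Returns (W, eps_eff, L_eff, delta_L, L) in SI units.
    The name and signature are hypothetical, for illustration only.
    """
    # Width of patch
    w = C / (2.0 * f0_hz * math.sqrt((eps_r + 1.0) / 2.0))
    # Effective dielectric constant
    eps_eff = (eps_r + 1.0) / 2.0 + (eps_r - 1.0) / 2.0 * (1.0 + 12.0 * h_m / w) ** -0.5
    # Effective length
    l_eff = C / (2.0 * f0_hz * math.sqrt(eps_eff))
    # Length extension caused by the fringing field
    ratio = w / h_m
    delta_l = 0.412 * h_m * ((eps_eff + 0.3) * (ratio + 0.264)) / (
        (eps_eff - 0.258) * (ratio + 0.8)
    )
    # Actual (physical) length of patch
    length = l_eff - 2.0 * delta_l
    return w, eps_eff, l_eff, delta_l, length


if __name__ == "__main__":
    # Assumed example: 2.4 GHz design frequency on FR4 (eps_r = 4.4, h = 1.53 mm).
    w, eps_eff, l_eff, delta_l, length = patch_dimensions(2.4e9, 4.4, 1.53e-3)
    print(f"W = {w * 1e3:.2f} mm, eps_eff = {eps_eff:.3f}, L = {length * 1e3:.2f} mm")
```

Changing `f0_hz` to another design frequency only rescales the result; the hexagonal geometry and slot dimensions still have to be tuned in the simulator.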
The coordinates are assigned as follows. The starting point for the ground plane is (-25, -25, 0) and for the substrate is (-25, -25, 0.1). The length L and the width of the substrate are both chosen as 50 mm. The patch length is set to $\sim\lambda_0/2 \sim 12.45$ mm, and the patch width is half that of the ground plane and substrate, i.e. 25 mm. The patch's starting point is at (15, 0, 1.63), and the patch extends 12.45 mm in the positive x-direction. The feed should be located between the starting point and the ending point of the patch. Although the feed is nominally located at the midpoint, it has to be shifted slightly to the upper or lower side to achieve impedance matching.

# **Drawing the Model**

Drawing the model is now straightforward: all we have to do is fill in the coordinates. The 3D capabilities of HFSS make it well suited for this purpose, since all the fine geometrical and electrical details of the antenna can be included in the solver. From my prior experience, this leads to simulated performance that is reproduced closely in experimental testing, thus avoiding costly re-makes of fabricated prototypes. Furthermore, the ability to visualize the different field quantities within the solved designs, in conjunction with the HFSS fields calculator, greatly assists in the initial design of antennas. The Finite Element Method models arbitrarily shaped 3D structures better than other methods. The HFSS tetrahedral mesh accurately represents even curved surfaces and highly detailed models. With adaptive meshing, the results may be obtained to a user-specified accuracy with no engineering effort or manual intervention in the mesh process.
This is particularly important because the engineer does not have to be familiar with the process of creating a mesh and does not need to put extra effort into mesh creation. This process has repeatedly proven to be significantly more accurate than non-adaptive solution processes and saves a tremendous amount of engineering time. A number of frequency-dependent material models are also available.

First, draw the ground plane using the coordinates given for the ground plane. Second, draw the substrate, which has the same dimensions as the ground plane. Third, create the patch. Since the patch should be in the center, its coordinates are easy to calculate from the ground and substrate dimensions. Fourth, create the feed. The feed position is chosen somewhere between the ends of the patch. Nominally it would be at the middle; however, it is shifted toward one side of the patch for impedance matching.

Assign Boundary: Now that the model has been created, we need to assign boundary conditions. In HFSS, radiation boundaries are used to simulate open problems that allow waves to radiate infinitely far into space. HFSS absorbs the wave at the radiation boundary. The boundary must be placed at a certain distance from the antenna. Normally, this distance is chosen between $\lambda/8$ and $\lambda/12$, where $\lambda$ is calculated from $\lambda = c/f$, $c$ is $3 \times 10^{8}$ m/s, and $f$ is the frequency in Hz.

Assign Excitation: With the entire model in place, the only missing part is the excitation. The excitation is a sheet at the end point of the feed. The antenna is excited through a wave port, so we need to create the wave-port sheet.

#### IV.
SIMULATION RESULT AND DISCUSSION

The proposed antenna has been modeled at the design frequency of 2.8 GHz. The axial ratio in the broadside direction is below 3 dB throughout a bandwidth of around 8 MHz. The axial ratio has been calculated using the formula given below.

$$AR(\mathrm{dB}) = 20\log_{10}\left|\frac{E_R + E_L}{E_R - E_L}\right|$$

The proposed antenna possesses an average gain of 6 dB, and the return loss is around -17 dB.

Return loss: a good antenna typically has a return loss of about 10 dB ($S_{11} \approx -10$ dB), meaning that 90% of the signal power is accepted by the antenna and 10% is reflected back. The proposed antenna gives excellent return loss in the S-band; the curve has a deep null at the centre frequency of 2.85 GHz.

Fig. 3: Return loss of proposed patch antenna

Fig. 4: Axial ratio of proposed antenna

The axial ratio of the antenna, about 1 dB, shows good circular polarization. Fig. 4 shows the simulated axial ratio versus frequency.

Fig. 5: Peak gain of proposed antenna

The peak gain of the antenna is measured at the resonant frequency points by the comparison method. Fig. 5 shows the measured antenna gain versus frequency; in the 2.8 GHz band it is approximately 6 dBi.

Fig. 6: Radiation pattern of proposed antenna

The simulated and measured radiation patterns of the proposed antenna operating at 2.8 GHz are shown in Fig. 6. The antenna has relatively stable radiation patterns over its operating band, and a near-omnidirectional pattern is obtained.

#### V. CONCLUSION

With the help of the HFSS 11 simulator, a wideband, high-gain circularly polarized microstrip patch (CP-MSP) antenna is designed. Slots are incorporated on the hexagonal patch. Because of its simple structure, it can be fabricated and analyzed further. The antenna meets the desired characteristics: $S_{11}$ below -10 dB, axial ratio below 3 dB, and a peak gain of 6 dBi. The simulated results show that the antenna exhibits good electrical performance and can thus be considered a suitable candidate for various applications in the ISM band.
Hence the proposed antenna is suitable for wireless local area network (WLAN), multichannel multipoint distribution service (MMDS), and WiMAX communication applications.

#### ACKNOWLEDGEMENT

I wish to acknowledge Dr. Rajkumar (Scientist 'E') of DIAT (Defense Institute of Advanced Technology) for sharing his knowledge of antenna design and permitting me to use the lab facilities.

#### REFERENCES

[1] User manual guide HFSSv11.
[2] Sanad, "Double C-patch antenna having different aperture shapes", IEEE Proceedings on Antennas and Propagation, pp. 2116-2119, June 1995.
[3] H.F. Abutarbaush, H.S. Al-Raweshiday, R. Nilavalan, "Triple band double U-slot patch antenna for WiMAX mobile application", APCC08.
[4] C.L. Mac, R. Chair, K.F. Lee, K.M. Luk and A.A. Kishk, "Half U-slot patch antenna with shorting wall", Electronics Letters, vol. 39, pp. 1779-1780, 2009.
[5] H.F. Abutarbaush, H.S. Al-Raweshiday, R. Nilavalan, "Multiband antenna for different wireless applications", IEEE International Workshop on Antenna, March 2009.
[6] B. Sanzizquierdo, J.C. Bachlor, R.J. Langley, M.I. Sobhy, "Single and double layer planar multiband PIFAs", 2006.
[7] Y.L. Kuo and K.L. Wong, "Printed double-T monopole antenna for 2.4/5.2 GHz dual band WLAN operations", IEEE Transactions on Antennas and Propagation, vol. 51, no. 9, pp. 2187-2192, Sept. 2003.
[8] Suma, M.N., Raj, R.K., Joseph, M., Bybi, P.C., Mohanan, P., "A compact dual band planar branched monopole antenna for DCS/2.4 GHz WLAN applications", IEEE Microwave and Wireless Components Letters, vol. 16, issue 5, pp. 275-277, 2006.

# Comparison of Energy Detection in Cognitive Radio over different fading channels

# Simar Buttar

ECE, Lovely Professional University, Phagwara, Punjab, India E-mail: Simar buttar@yahoo.com

Abstract- With the advance of wireless communications, the problem of bandwidth scarcity has become more prominent. Cognitive radio technology has emerged as a way to solve this problem by allowing unlicensed users to use the licensed bands opportunistically.
To sense the existence of licensed users, many spectrum sensing techniques have been devised. In this paper, energy detection and the cyclic prefix are used for spectrum sensing. ROC curves are compared for various wireless fading channels using squaring and cubing operations; the improvement is as high as 0.6 times for the AWGN channel and 0.4 times for the Rayleigh channel when going from the squaring to the cubing operation in an energy detector. Closed-form expressions for the probability of detection over AWGN and Rayleigh channels are described. The Nakagami fading channel shows the worst results.

Keywords: Spectrum Sensing, Cognitive Radio, Probability of detection, Cooperative Detection.

#### I. INTRODUCTION

Today, with the unprecedented growth of wireless applications, the problem of spectrum scarcity is becoming more and more apparent. Most of the spectrum has been allocated to specific users, while the spectrum bands that have not been assigned are overcrowded because of overuse. However, most of the allocated spectrum is idle at some times and locations. The Federal Communication Commission (FCC) research report [1] reveals that seventy percent of the allocated spectrum is underutilized. A technique is therefore needed to deal with spectrum underutilization, which led to the birth of cognitive radio. Cognitive radio [2][3] can sense the external radio environment and learn from past experience. It can access unused spectrum bands dynamically without affecting the primary users, and in this way improves spectrum efficiency. Sensing the external radio environment quickly and accurately plays a key role in cognitive radio. Spectrum sensing includes the detection of primary users and of secondary users in other cognitive networks in the same region, but most papers on spectrum sensing consider only the detection of primary users.
In this paper, we consider the cyclic prefix, a special feature embedded in OFDM (Orthogonal Frequency Division Multiplexing) signals. It is used to detect the presence of the primary user's signal and is considered better than energy detection and matched filter detection because it performs well even in fading channels. In addition, cooperative detection is used among the secondary users to improve the performance of spectrum sensing. The energy detector based approach, also known as radiometry or periodogram, is one of the popular methods for spectrum sensing because it is non-coherent and has low implementation complexity. It is also more generic, as receivers do not require any prior knowledge of the primary user's signal [4]. In this method, the received signal's energy is measured and compared against a predefined threshold to determine the presence or absence of the primary user's signal. Moreover, the energy detector is widely used in ultra wideband (UWB) communications to borrow an idle channel from a licensed user. The detection probability $(P_d)$, false alarm probability $(P_f)$, and missed detection probability $(P_m)$ are the key metrics used to analyze the performance of an energy detector. The performance of an energy detector is illustrated by the receiver operating characteristics (ROC) curve, which is a plot of $P_d$ versus $P_f$ or $P_m$ versus $P_f$ [5].

This paper is organized as follows: Section II describes the OFDM (Orthogonal Frequency Division Multiplexing) system model. Section III gives the expressions for the probability of detection over AWGN (Additive White Gaussian Noise), Rayleigh, and Nakagami channels. Simulation results for cyclic prefix and energy detection based spectrum sensing over AWGN and Rayleigh channels, and the improvement using cooperative detection, are presented in Section IV, followed by conclusions in Section V.

#### II. OFDM SYSTEM MODEL

Fig. 1.
Simplified Block Diagram of OFDM Transmitter

Consider a block of data symbols mapped onto the subcarriers, represented by:

$$\{s(0), s(1), s(2), \dots, s(T_d - 1)\}$$

The IFFT (Inverse Fast Fourier Transform) operation converts these frequency-domain signals into time-domain signals, which are represented by:

$$\{x(0), x(1), x(2), \dots, x(T_d - 1)\}$$

where the IFFT block size is assumed to be $T_d$. The last $T_c$ symbols of each block are added to the beginning of the block as a cyclic prefix, and the transmitted signal becomes:

$$\{x(-T_c), \dots, x(-1), x(0), x(1), \dots, x(T_d - T_c), \dots, x(T_d - 1)\}$$

where the block of symbols $\{x(-T_c), \dots, x(-1)\}$ is an exact copy of $\{x(T_d - T_c), \dots, x(T_d - 1)\}$, i.e. $x(t) = x(T_d + t)$ for $t \in [-T_c, -1]$.

The relation between the signals before and after the IFFT block can be expressed as [10]:

$$x(t) = \frac{1}{\sqrt{T_d}} \sum_{n=0}^{T_d-1} s(n)\, e^{\frac{j 2\pi n (t - T_c)}{T_d}}, \quad t = 0, 1, \dots, T_d - 1 \tag{1}$$

A transmitted OFDM frame may contain several such blocks. Let $y(t)$ denote the symbols of the transmitted OFDM frame. Detection is based on two hypotheses [5]:

$$H_0: r(t) = n(t) \tag{2}$$

and

$$H_1: r(t) = y(t) + n(t) \tag{3}$$

where $r(t)$ is the received signal and $n(t)$ is the additive white Gaussian noise [6]. $H_0$ is the hypothesis that the signal is absent and only noise is present. $H_1$ is the hypothesis that both signal and noise are present. Let $\chi$ be a measure of correlation between two samples a distance $T_d$ apart [10].
$$\chi = \sum_{t=1}^{W} \frac{r(t)\,r^*(t+T_d)}{E\left[|r(t)|^2\right]} \tag{4}$$

For a CP (cyclic prefix) OFDM signal, the statistic $\chi$ under the above two hypotheses can be expressed as [10]:

$$H_0: \chi = \sum_{t=1}^{W} \frac{n(t)\,n^*(t+T_d)}{E\left[|n(t)|^2\right]} \tag{5}$$

and

$$H_1: \chi = \sum_{t=1}^{W} \frac{\left(y(t)+n(t)\right)\left(y^*(t+T_d)+n^*(t+T_d)\right)}{E\left[|y(t)+n(t)|^2\right]} \tag{6}$$

# III. PROBABILITY OF DETECTION AND FALSE ALARM IN ENERGY DETECTION

# A) In AWGN Channel

The probability of detection $P_d$ and the probability of false alarm $P_f$ can be evaluated respectively by [11]:

$$P_d = P(Y' > \Lambda \mid H_1)$$

$$P_f = P(Y' > \Lambda \mid H_0)$$

where $\Lambda$ is the decision threshold. $P_f$ can also be written in terms of the probability density function as

$$P_f = \int_{\Lambda}^{\infty} f_{Y'}(y)\, dy$$

$$P_f = \frac{1}{2^d\,\Gamma(d)} \int_{\Lambda}^{\infty} y^{d-1} e^{-\frac{y}{2}}\, dy$$

Dividing and multiplying the R.H.S. of the above equation by $2^{d-1}$, we get

$$P_f = \frac{1}{2\,\Gamma(d)} \int_{\Lambda}^{\infty} \left(\frac{y}{2}\right)^{d-1} e^{-\frac{y}{2}}\, dy$$

Substituting $\frac{y}{2} = t$, $\frac{dy}{2} = dt$ and changing the limits of integration to $[\Lambda/2, \infty)$, we get

$$P_f = \frac{1}{\Gamma(d)} \int_{\Lambda/2}^{\infty} t^{d-1} e^{-t}\, dt$$

$$P_f = \frac{\Gamma(d, \Lambda/2)}{\Gamma(d)}$$

where $\Gamma(\cdot,\cdot)$ is the upper incomplete gamma function [13].
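The closed form $P_f = \Gamma(d, \Lambda/2)/\Gamma(d)$ can be evaluated without special-function libraries when $d$ is a positive integer (the simulations in Section IV assume a time-bandwidth product of 5), because the regularized upper incomplete gamma function then reduces to a finite sum, $e^{-x}\sum_{k=0}^{d-1} x^k/k!$. The sketch below is a minimal illustration under that integer-$d$ assumption; the function names and the bisection search for the threshold are hypothetical additions, not part of the original derivation.

```python
import math


def false_alarm_probability(threshold, d):
    """P_f = Gamma(d, threshold/2) / Gamma(d), valid for integer d >= 1.

    Uses the finite-sum identity for the regularized upper incomplete
    gamma function at integer order.
    """
    if d < 1 or int(d) != d:
        raise ValueError("this sketch assumes a positive integer d")
    x = threshold / 2.0
    term = 1.0   # holds x**k / k!, starting at k = 0
    total = 1.0
    for k in range(1, int(d)):
        term *= x / k
        total += term
    return math.exp(-x) * total


def threshold_for_false_alarm(target_pf, d, hi=1.0e3, tol=1.0e-10):
    """Find the threshold giving a target P_f by bisection.

    P_f decreases monotonically with the threshold, so bisection on
    [0, hi] converges; hi is an assumed upper search bound.
    """
    lo = 0.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if false_alarm_probability(mid, d) > target_pf:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    d = 5  # time-bandwidth product assumed in Section IV
    thr = threshold_for_false_alarm(0.1, d)
    print(f"threshold for P_f = 0.1: {thr:.3f}")
```

Fixing $P_f$ and solving for the threshold in this way is how one operating point of a ROC curve is usually set before $P_d$ is evaluated at the same threshold.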
The probability of detection can be written using the cumulative distribution function:

$$P_d = 1 - F_{Y'}(\Lambda)$$

The cumulative distribution function (CDF) of $Y'$ can be obtained (for an even number of degrees of freedom, which is our case) as

$$F_{Y'}(y) = 1 - Q_d(\sqrt{\lambda}, \sqrt{y})$$

$$P_d = Q_d(\sqrt{\lambda}, \sqrt{\Lambda})$$

$$P_d = Q_d(\sqrt{2\gamma}, \sqrt{\Lambda})$$

where $Q_d$ is the generalized Marcum Q-function and the non-centrality parameter is $\lambda = 2\gamma$, with $\gamma$ the signal-to-noise ratio.

# B) In Rayleigh Channel

The probability density function of the SNR for a Rayleigh channel is

$$f(\gamma) = \frac{1}{\bar{\gamma}} \exp\left(\frac{-\gamma}{\bar{\gamma}}\right), \qquad \gamma \ge 0$$

The probability of detection for the Rayleigh channel is obtained by averaging the AWGN probability of detection over this probability density function:

$$P_{d,R} = \int_0^\infty P_d\, f(\gamma)\, d\gamma$$

where $P_{d,R}$ is the probability of detection for the Rayleigh channel.

$$P_{d,R} = \frac{1}{\bar{\gamma}} \int_{0}^{\infty} Q_{d}(\sqrt{2\gamma}, \sqrt{\Lambda}) \exp\left(\frac{-\gamma}{\bar{\gamma}}\right) d\gamma$$

Now, substituting $\sqrt{\gamma} = x$, $\gamma = x^2$, $d\gamma = 2x\,dx$:

$$P_{d,R} = \frac{2}{\bar{\gamma}} \int_0^\infty x\, Q_d\left(\sqrt{2}\,x, \sqrt{\Lambda}\right) \exp\left(\frac{-x^2}{\bar{\gamma}}\right) dx$$

The probability of detection for the Rayleigh channel can then be expressed as

$$P_{d,R} = e^{-\Lambda/2} \sum_{n=0}^{d-2} \frac{1}{n!} \left(\frac{\Lambda}{2}\right)^n + \left(\frac{1+\bar{\gamma}}{\bar{\gamma}}\right)^{d-1} \left[\exp\left(-\frac{\Lambda}{2(1+\bar{\gamma})}\right) - \exp\left(-\frac{\Lambda}{2}\right) \sum_{n=0}^{d-2} \frac{1}{n!} \left(\frac{\Lambda\bar{\gamma}}{2(1+\bar{\gamma})}\right)^n\right]$$

# C) Probability of detection in Nakagami-m fading channel

In the expressions below, $\lambda$ denotes the decision threshold and $u$ the time-bandwidth product (written $\Lambda$ and $d$ above). For $m \ge 1/2$ and $\eta = 1 + \frac{m}{\bar{\gamma}}$:

$$\overline{P}_{d,1} = 1 - e^{-\frac{\lambda}{2}} \left(\frac{m}{\bar{\gamma}\eta}\right)^m \sum_{n=u}^{\infty} \frac{1}{n!}\left(\frac{\lambda}{2}\right)^n {}_1F_1\left(m; n+1; \frac{\lambda}{2\eta}\right)$$

The average detection probability over Nakagami-m fading with $i$ EGC branches, $\overline{P}_{d,i}$, for $m \ge 1/2$ is

$$\begin{split} \overline{P}_{d,2} &= 1 - \sqrt{\pi} e^{-\frac{\lambda}{2}} \sum_{n=u}^{\infty} \sum_{k=0}^{\infty} \left(\frac{\lambda}{2}\right)^{n} \left(\frac{2m}{\overline{\gamma} + 2m}\right)^{2m+k} \\ &\times \frac{4 \psi_{2}(m,n,k)}{2^{4m+k} \ k!} \ _{1}F_{1} \left(2m + k; n + 1; \frac{\lambda \overline{\gamma}}{2 \left(\overline{\gamma} + 2m\right)}\right) \\ \overline{P}_{d,3} &= 1 - \sqrt{\pi} e^{-\frac{\lambda}{2}} \sum_{n=u}^{\infty} \sum_{p=0}^{\infty} \sum_{k=0}^{\infty} \left(\frac{\lambda}{2}\right)^{n} \left(\frac{3m}{\overline{\gamma} + 3m}\right)^{3m+p+k} \\ &\times \frac{8 \psi_{3}(m,n,p,k)}{2^{4m+p+k} \ k!} \ _{1}F_{1} \left(3m + p + k; n + 1; \frac{\lambda \overline{\gamma}}{2 \left(\overline{\gamma} + 3m\right)}\right) \end{split}$$

#### **IV.SIMULATION RESULTS**

The performance of the energy detector is analysed using ROC (Receiver Operating Characteristics) curves for fading channels. The Monte-Carlo method is used for simulation. The following figures show that the performance of energy detection improves as the SNR (Signal to Noise Ratio) increases. FIGURE 2 and FIGURE 3 illustrate the ROC curves using the squaring operation for the AWGN and Rayleigh channels respectively. FIGURE 4 and FIGURE 5 depict the improvement in the performance of the energy detector using the cubing operation over the AWGN and Rayleigh channels respectively. We assume a time-bandwidth product of 5.

FIGURE 2: Complementary ROC Curves for AWGN using Squaring operation

FIGURE 3: Complementary ROC for Rayleigh using Squaring operation.

FIGURE 4: ROC curves using cubing operation in Energy Detection over AWGN channel

FIGURE 5: ROC curves for cubing operation of Energy Detection over Rayleigh channel

FIGURE. 5. Comparison of plots for Probability of detection versus signal to noise ratio (SNR) over AWGN and Rayleigh Channel.
FIGURE 6: Energy detection over Nakagami fading channel

FIGURE 7: ROC curves of Energy Detection over Rician channel

# V.CONCLUSION:

In the present work, energy detection based spectrum sensing is analysed over different wireless fading channels. Closed-form expressions for the probability of detection over AWGN, Nakagami, and Rayleigh channels are described. Using ROC (Receiver Operating Characteristics) curves, it has been shown that the Nakagami channel gives the worst results. ROC curves are compared for the various wireless channels using squaring and cubing operations; the improvement is as high as 0.6 times for the AWGN channel and 0.4 times for the Rayleigh channel when going from the squaring to the cubing operation in an energy detector.

#### ACKNOWLEDGEMENT

Words are often too few to express one's deep regards. Work like this is never the outcome of the efforts of a single person. I take this opportunity to express my profound gratitude and respect to all those who helped me during this dissertation. First of all, I would like to thank the supreme power, the almighty God, and my parents, who have always guided me along the right path in life. Without their grace this would never have become reality. I would like to express my deep gratitude to my guide Ms. Komal Arora, Assistant Professor, Lovely Professional University, Phagwara, who provided all the facilities and resources required for this work. I would also like to thank all the faculty of the department and the close friends who helped me in this work directly or indirectly.

#### REFERENCES

- [1] J. Ma, G. Y. Li, B.H. Juang. "Signal Processing in Cognitive Radio." Proceedings of the IEEE, vol. 97, pp. 805-823, May 2009.
- [2] H. A. Mahmoud, T. Yucek and H. Arslan. "OFDM For Cognitive Radio-Merits and Challenges." IEEE wireless communications, vol. 16, pp. 6-15, April 2009.
- [3] I. F. Akyildiz, W.Y.Lee, M.C. Vuran, S. Mohanty.
"Next Generation/Dynamic Spectrum Access/Cognitive Radio Wireless Networks: A Survey." Comp. Net. J., vol. 50, pp. 2127–59, Sept. 2006. - [4] H. Urkowitz. "Energy detection of unknown deterministic signals." Proc. IEEE, vol. 55, pp. 523–531, April 1967. - [5]S. Atapattu, C. Tellambura, and H. Jiang. "Energy detection of primary signals over η-μ fading channels." in Proc. Fourth International Conference on Industrial and Information Systems, ICIIS, 2009, pp. 118-122. - [6] L. Yu, L.B Milstein, J.G Proakis, B.D. Rao, S.P. Bingulac, "Performance Degradation Due to MAI in OFDMA Based Cognitive Radio," IEEE International Conference on Communications (ICC), pp. 1-5, May 2010, doi:10.1109/ICC.2010.5501830. - [7] Z. L. Chin, F, "OFDM Signal Sensing for Cognitive Radios," Proc. IEEE Symp. Personal, Indoor and Mobile Radio Communications (PIMRC '08)), pp. 1-5, Sept. 2008, doi:10.1109/PIMRC. 2008. 4699404. - [8] J. Mitola and G. Q. Maguire, "Cognitive Radio: Making Software Radios More Personal," IEEE Personal Communications, vol. 6, no. 4, pp. 13-18, Aug 1999, doi:10.1109/98.788210. - [9] Goh, L. P. Lei, Z. Chin, Francois, "Feature Detector for DVB-T Signal in Multipath Fading Channel," Proc. Second International Conference on Cognitive Radio Oriented Wireless Networks and communications CrownCom200,pp.234-240,2007,doi:10.1109/CROWNCOM.2007.4549802. - [10] Herath S.P. Asian Inst. of Technology, Rajatheva, N; Tellambura, C, may 2009. On the energy detection of unknown deterministic signal over Nakagami channelswith selection combining. # Implementing SHA-224/256 Algorithm for Secure Commitment Scheme Applications using FPGA #### V. Venkata Sai Karthik & T. Venkata Sridhar Dept. of ECE, Audisankara College of Engineering & Technology, Nellore Dt, AP, India E-mail: vsaikarthik5712@gmail.com, venkatasridhar.ece@audisankara.com Abstract- This paper uses the similarity between SHA-224 and SHA-256 algorithms to design the SHA-224/256 IP core oriented Digital Signature. 
The IP core uses a parallel structure and pipeline technology to simplify the hardware design and improve the speed by 26%. The IP core is implemented on Altera's FPGA EP2C20F484C6 chip, and simulation shows that it runs correctly at a clock frequency of 100 MHz. The IP core can be widely used in data integrity and consistency verification, pseudo random number generation, and other areas of cryptography.

Keywords- Digital Signature; SHA-224/256; IP core; FPGA

#### I. INTRODUCTION

SHS (Secure Hash Standard) is a hash algorithm standard (FIPS PUB 180-1) released by the United States National Institute of Standards and Technology (NIST) in 1995. Because the algorithm is collision-resistant and non-reversible, it is widely used in the information security field at present, most notably in SSL, IPSec and PKCS. But as the algorithm has been studied in depth, its security has also been questioned and threatened [1][2]. This prompted NIST to release the latest SHS specification (FIPS PUB 180-3) in October 2008. Compared with the previous version (FIPS PUB 180-2 CHANGE NOTICE, August 2002), the biggest difference is that the SHA-224 algorithm has been formally included in the SHS standard. Because the SHS algorithm is complex, involves a large amount of calculation, and each iteration relies on the result of the previous one, it is often implemented in hardware to increase the processing speed [3].

This paper uses the similarity between the SHA-224 and SHA-256 algorithms, together with a hardware description language, to design and implement a time-division multiplexed SHA-224/256 IP core. The IP core can not only generate digital signatures to protect information integrity and security, but also generate the double key of the 3DES algorithm, providing more reliable, safe, and convenient keys. It therefore has broad application prospects. A typical application of SHA in a digital signature algorithm is shown in Fig. 1.

#### II.
COMPUTER BUS MEMORY SYSTEM DESIGN

SHA-224 and SHA-256 are two of the algorithms in the SHS standard (FIPS PUB 180-3). They handle input messages shorter than $2^{64}$ bits and compress them into 224-bit and 256-bit outputs respectively. The SHA-224 and SHA-256 algorithms differ in only two ways: first, the initial hash values are different; second, the result of SHA-224 is truncated.

The SHA-256 algorithm completes the calculation in two steps. The first step preprocesses the input message: it is padded and divided into 512-bit blocks. The second step calculates the hash value, i.e. every block is processed in turn to produce the final result. After the message is divided into blocks, every block is processed as follows; the details are described in reference [4].

Figure 1. Application Diagram of SHA-224/256 In Digital Signatures

- (1) Assign the sixty-four 32-bit constants $K_0$, $K_1$, ..., $K_{63}$ their initial values.
- (2) Assign the eight 32-bit variables $H_0$, $H_1$, $H_2$, $H_3$, $H_4$, $H_5$, $H_6$, $H_7$ the specified initial hash values. Steps (3) to (7) are then carried out for every message block.
- (3) Divide the 512-bit block into sixteen 32-bit words $W_0$, $W_1$, ..., $W_{15}$.
- (4) For i = 16 to 63: $s_0 = ROTR^{7}(W_{i-15}) \oplus ROTR^{18}(W_{i-15}) \oplus SHR^{3}(W_{i-15})$, $s_1 = ROTR^{17}(W_{i-2}) \oplus ROTR^{19}(W_{i-2}) \oplus SHR^{10}(W_{i-2})$, $W_i = W_{i-16} + s_0 + W_{i-7} + s_1$
- (5) Initialize the working variables: $a=H_0$, $b=H_1$, $c=H_2$, $d=H_3$, $e=H_4$, $f=H_5$, $g=H_6$, $h=H_7$.
- (6) For i = 0 to 63: $S_0 = ROTR^{2}(a) \oplus ROTR^{13}(a) \oplus ROTR^{22}(a)$, $maj = (a \land b) \oplus (a \land c) \oplus (b \land c)$, $t_2 = S_0 + maj$, $S_1 = ROTR^{6}(e) \oplus ROTR^{11}(e) \oplus ROTR^{25}(e)$, $ch = (e \land f) \oplus (\neg e \land g)$, $t_1 = h + S_1 + ch + K_i + W_i$, then $h = g$, $g = f$, $f = e$, $e = d + t_1$, $d = c$, $c = b$, $b = a$, $a = t_1 + t_2$
- (7) Add the values a, b, c, d, e, f, g, h respectively to the variables $H_0$, $H_1$, $H_2$, $H_3$, $H_4$, $H_5$, $H_6$ and $H_7$.
- (8) Output the 256-bit compressed code $H_0 \| H_1 \| H_2 \| H_3 \| H_4 \| H_5 \| H_6 \| H_7$.

The signs $\wedge$, $\oplus$, $\neg$, $+$ represent bitwise AND, XOR, NOT and 32-bit (modulo $2^{32}$) addition respectively. $ROTR^{m}(W_n)$ means that $W_n$ is rotated right by m bits, and $SHR^{p}(W_q)$ means that $W_q$ is shifted right by p bits. The sign $\|$ represents concatenation.

As can be seen from the description of the algorithm, the core of the whole algorithm is the second step, calculating the hash values. The first step can be handled by upper-level software. Several issues therefore need to be solved for the calculation of hash values.

- Determine the data bus width. Because the message length handled by the algorithm is variable, the external data bus width and the corresponding control mode need to be determined first. To improve the portability of the IP core, this paper uses a 32-bit data bus.
- Determine the hardware architecture of the IP core. From the third step and sixth step of the algorithm, the producer-consumer relationship between them can be handled entirely by a parallel architecture.
- Multiplexing of the IP core. A group of registers is used to achieve time-division multiplexing of the SHA-224 and SHA-256 algorithms.
- Performance and area optimization. Pipelining and a parallel computing architecture are used to design a simple and fast IP core.
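Steps (1) to (8) above can be sketched in software as a reference model. The following Python sketch is an illustration only, not the paper's Verilog IP core; all function names are hypothetical. Instead of typing the constant tables, it derives $K_i$ and the initial hash values from their FIPS 180 definitions (fractional parts of cube and square roots of the first primes) using exact integer arithmetic, and it checks itself against Python's `hashlib`. SHA-224 reuses the same block function with different initial values and a truncated output, mirroring the time-division multiplexing idea.

```python
import hashlib
import math
import struct

MASK = 0xFFFFFFFF  # keep every intermediate result to 32 bits


def _first_primes(count):
    primes, n = [], 2
    while len(primes) < count:
        if all(n % p for p in primes):
            primes.append(n)
        n += 1
    return primes


def _icbrt(n):
    """Integer cube root by binary search: largest r with r**3 <= n."""
    lo, hi = 0, 1 << ((n.bit_length() + 2) // 3 + 1)
    while lo < hi:
        mid = (lo + hi + 1) // 2
        if mid ** 3 <= n:
            lo = mid
        else:
            hi = mid - 1
    return lo


_PRIMES = _first_primes(64)
# Step (1): first 32 fractional bits of the cube roots of the first 64 primes.
K = [_icbrt(p << 96) & MASK for p in _PRIMES]
# Step (2): SHA-256 uses square roots of primes 1..8 (first 32 fractional bits),
# SHA-224 uses primes 9..16 (second 32 fractional bits).
H256 = [math.isqrt(p << 64) & MASK for p in _PRIMES[:8]]
H224 = [math.isqrt(p << 128) & MASK for p in _PRIMES[8:16]]


def _rotr(x, n):
    return ((x >> n) | (x << (32 - n))) & MASK


def _pad(message):
    """Preprocessing: append 0x80, zero bytes, and the 64-bit bit length."""
    padded = message + b"\x80"
    padded += b"\x00" * ((56 - len(padded) % 64) % 64)
    return padded + struct.pack(">Q", len(message) * 8)


def _compress(state, block):
    w = list(struct.unpack(">16I", block))            # step (3)
    for i in range(16, 64):                           # step (4)
        s0 = _rotr(w[i - 15], 7) ^ _rotr(w[i - 15], 18) ^ (w[i - 15] >> 3)
        s1 = _rotr(w[i - 2], 17) ^ _rotr(w[i - 2], 19) ^ (w[i - 2] >> 10)
        w.append((w[i - 16] + s0 + w[i - 7] + s1) & MASK)
    a, b, c, d, e, f, g, h = state                    # step (5)
    for i in range(64):                               # step (6)
        big_s0 = _rotr(a, 2) ^ _rotr(a, 13) ^ _rotr(a, 22)
        maj = (a & b) ^ (a & c) ^ (b & c)
        t2 = (big_s0 + maj) & MASK
        big_s1 = _rotr(e, 6) ^ _rotr(e, 11) ^ _rotr(e, 25)
        ch = (e & f) ^ (~e & MASK & g)
        t1 = (h + big_s1 + ch + K[i] + w[i]) & MASK
        h, g, f, e, d, c, b, a = g, f, e, (d + t1) & MASK, c, b, a, (t1 + t2) & MASK
    # step (7): add the working variables back into the hash state
    return [(s + v) & MASK for s, v in zip(state, (a, b, c, d, e, f, g, h))]


def sha2_digest(message, bits=256):
    """Hex digest of SHA-224 or SHA-256 for a bytes message."""
    if bits not in (224, 256):
        raise ValueError("bits must be 224 or 256")
    state = list(H256 if bits == 256 else H224)
    padded = _pad(message)
    for offset in range(0, len(padded), 64):
        state = _compress(state, padded[offset:offset + 64])
    words = state if bits == 256 else state[:7]       # step (8); SHA-224 truncates
    return b"".join(struct.pack(">I", word) for word in words).hex()


if __name__ == "__main__":
    sample = b"abc"
    print(sha2_digest(sample, 256) == hashlib.sha256(sample).hexdigest())
    print(sha2_digest(sample, 224) == hashlib.sha224(sample).hexdigest())
```

A model like this is convenient for generating expected digests when checking simulation waveforms of a hardware implementation.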
#### III.SYSTEM DESIGN AND IMPLEMENTATION

Every sub-module of the IP core is designed according to the data flow of the SHA-256 algorithm. First, the interfaces of the IP core are determined. For portability, the IP core uses a 32-bit data bus and an 11-bit control bus. The control bus includes the clock signal, reset signal, control enable signal, function selection signal, control signal and state signal. Next, according to the producer-consumer relationship in the data flow, the IP core is divided into four parts: the Data pool, the ALU (Arithmetic Logic Unit), the Register files and the Counter (shown in Fig. 2). The Data pool saves the constants and $W_t$ used in the algorithm, including the initial hash values, the key values, and the values of the input words and the expanded words. The ALU performs the arithmetic and logic operations. The Register files serve as dedicated registers that hold the values of a, b, c, d, e, f, g, h. The Counter is incremented by 1 on every rising clock edge to control the iteration.

When the corresponding data and control signals are applied, the IP core iterates over one block (512 bits). The Counter is cleared after every 64 clocks to stay synchronized with the words. The data flow is as follows: the Data pool supplies $W_t$ and $K_t$ under the control of the Counter and sends them to the ALU. The ALU performs the corresponding arithmetic and logic operations after receiving the data and saves the results to the Register files until the end of the iteration. At the beginning and end of each block and word, the Register files provide the corresponding results to the ALU or the output bus under the control of the external control signals and the Counter.

# A. Data pool

The Data pool consists of a look-up table unit and a shift register unit. The look-up table unit looks up the key value for the current iteration according to the counter value.
The shift register unit completes the expansion from 16 words to 64 words. There are sixteen 32-bit registers, denoted $W_0$, $W_1$, ..., $W_{14}$, $W_t$. While each 512-bit block is processed, these registers are loaded and shifted according to the counter value. While the counter is less than 16, assignment is performed: the i-th 32-bit input word is assigned to $W_i$ and $W_t$. When the counter is greater than or equal to 16, the shift operation is performed: the registers are pipelined so that $W_{i+1}$ is transferred to $W_i$ ($0 \leq i \leq 13$) and $W_{14}$ takes the value of $W_t$ after every clock. This simplifies the circuit that calculates $W_t$, which satisfies the following expressions.

$N\_W_{1} = ROTR^{7}(W_{1}) \oplus ROTR^{18}(W_{1}) \oplus SHR^{3}(W_{1})$

$N\_W_{14} = ROTR^{17}(W_{14}) \oplus ROTR^{19}(W_{14}) \oplus SHR^{10}(W_{14})$

$W_{t} = N\_W_{14} + W_{9} + N\_W_{1} + W_{0}$

When processing each word, the logical operations in every iteration can be implemented as simple combinational circuits, while the arithmetic needs only 32-bit adders. From the description of the algorithm, calculating the 'a' value is the longest path (it has five additions). A CSA (Carry Save Adder) with a parallel structure is therefore used to reduce the carry delay [5][6] caused by the number of additions and so improve the speed of the whole IP core. Because every summand is itself an intermediate result of a logical operation, it is fed to the second-level CSA. The final result is produced by the CPA (Carry-Propagate Adder). The addition structure for the 'a' value is shown in Fig. 3.

Figure 3. The Addition Structure of 'a' Value

TABLE 1. SYNTHESIS RESULTS

| Item | Default Comprehensive Result | Parallel CSA Structure |
|----------------------|---------------------------------|------------------------|
| Total logic elements | 2,160 | 2,646 |
| Logic registers | 1,124 | 1,124 |
| Total pins | 75 | 75 |
| Total memory bits | 4,073 | 4,073 |
| Actual fmax | 81.06 MHz | 103.08 MHz |

#### IV.
SYNTHESIS AND SIMULATION

In this design, the IP core is described in Verilog HDL and implemented on an Altera Cyclone EP2C35F672C6 FPGA. It is synthesized and routed with Quartus II 8.0 and finally simulated with ModelSim<sup>[7]</sup> to verify that the IP core is correct.

#### A. Synthesis results

Table 1 compares the results obtained with the CSA adder against those obtained without it (the default comprehensive option). With the CSA adder, fmax rises from 81.06 MHz to 103.08 MHz (about 27%), while the number of logic elements rises from 2,160 to 2,646 (about 22.5%). Taking into account the internal structure of the FPGA, using the HardCopy technology [8] to turn the IP core into an ASIC would further reduce the power consumption and increase the performance and speed by almost 50% [9].

#### B. Timing simulation

With a simulation clock of 100 MHz, the simulation waveforms are shown in Fig. 4 (SHA-224) and Fig. 5 (SHA-256), in which the input test string is: 12345 678901234567890123456789012345678901234567890123456789012345678901

Figure 4. The Simulation Result Of SHA-224

Figure 5. The Simulation Result Of SHA-256

The result of the SHA-224 algorithm is: e1cb99de\_19ad01ca\_c1cad48b\_f5230169\_fd18aaab\_1fb2b1ec\_a48cd7d5, and the result of the SHA-256 algorithm is: 0be66ce7\_2c2467e7\_93202906\_00067230\_66617916\_22e0ca9a\_df4a8955\_b2ed189c. This IP core achieves the desired purpose in both function and timing (the results are consistent with those of Freeware Hash & CRC [10]), while the delays and glitches in the simulation waveform accurately reflect the delay characteristics of the circuit.

#### V. CONCLUSION

This paper exploits the similarity between the SHA-224 and SHA-256 algorithms to design a time-division-multiplexed IP core. The 32-bit data bus gives the design a friendly data interface; the whole design has a simple hardware structure and a fast running speed, and can be widely used in digital signatures and 3DES key generation systems.
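As an illustration of the shift-register expansion described in Section III, the following Python sketch models the sixteen registers as a sliding 16-word window and compares it with the direct recurrence of the Secure Hash Standard [4]. It is a software model under stated assumptions, not the Verilog of this design: the function names are hypothetical, and the last element of the window plays the role of the W<sub>t</sub> register.

```python
MASK32 = 0xFFFFFFFF  # all arithmetic is modulo 2**32


def rotr(x, n):
    """Rotate a 32-bit word right by n bits."""
    return ((x >> n) | (x << (32 - n))) & MASK32


def n_w1(x):
    """N_W1 in the text: ROTR7 xor ROTR18 xor SHR3."""
    return rotr(x, 7) ^ rotr(x, 18) ^ (x >> 3)


def n_w14(x):
    """N_W14 in the text: ROTR17 xor ROTR19 xor SHR10."""
    return rotr(x, 17) ^ rotr(x, 19) ^ (x >> 10)


def expand_shift_register(block_words):
    """Expand 16 words to 64 using only a 16-word window (hypothetical helper)."""
    assert len(block_words) == 16
    regs = list(block_words)   # regs[0..14] ~ W0..W14, regs[15] ~ Wt
    out = list(block_words)    # counter < 16: words are passed through
    for _ in range(16, 64):    # counter >= 16: compute, then shift
        new_word = (n_w14(regs[14]) + regs[9] + n_w1(regs[1]) + regs[0]) & MASK32
        regs = regs[1:] + [new_word]
        out.append(new_word)
    return out


def expand_reference(block_words):
    """Direct recurrence W[t] = s1(W[t-2]) + W[t-7] + s0(W[t-15]) + W[t-16]."""
    w = list(block_words)
    for t in range(16, 64):
        w.append((n_w14(w[t - 2]) + w[t - 7] + n_w1(w[t - 15]) + w[t - 16]) & MASK32)
    return w
```

The sliding window needs storage for only sixteen words at any time, which is the point of the register arrangement in the text; the full 64-word list is returned here only so that the two versions can be compared.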
#### REFERENCES

- [1] Wang Xiaoyun, Yu Hongbo and Yiqun Lisa Yin, Efficient Collision Search Attacks on SHA-0[C], CRYPTO 2005.
- [2] Wang Xiaoyun, Yiqun Lisa Yin and Yu Hongbo, Finding Collisions in the Full SHA-1[C], CRYPTO 2005.
- [3] Huang Chun, Bai Guoqiang, Chen Hongyi. Fast Implementation of the hardware structure of SHA-1 algorithm[J]. Journal of Tsinghua University, 2005(45)1, pp. 123-125.
- [4] FIPS PUB 180-3, Secure Hash Standard[S], National Institute of Standards and Technology (NIST), 2008.
- [5] Jian Honglun. Proficient VerilogHDL: The example explanation of IC design core technology[M]. Electronics Industry Press, 2005.10.
- [6] Yang Xiaohui, Dai Zibin. FPGA-based implementation of SHA-256 algorithm[J], Microcomputer Information, 2006(22)4-2, pp. 146-148.
- [7] Jiang Hao, Li Zheying. FPGA design flow based on a variety of EDA tools[J], Microcomputer Information, 2007(23)11-2, pp. 201-203.
- [8] HardCopy II Device Handbook, Volume 2[OL], http://www.altera.com.cn/literature/hb/hardcopy-ii/hc\_h5v2.pdf
- [9] IC Technology Seminar. FPGA modular design and Altera HardCopy II structured ASIC[J], World Electronic Components, 2007,6, pp. 38-42.
- [10] febooti.com, Freeware Hash & CRC [OL], http://www.febooti.com/products/filetweak/members/hash-and-crc/

# **MBIST Online Test For RFID Memories**

#### G. Maruthi Devi & G. Kiran Kumar

Dept of ECE, ASCET, Nellore, AP, India

E-mail: maruthi.devika@gmail.com, kiran.ece@audisankara.com

Abstract- Radio Frequency Identification (RFID) devices depend on the correct operation of their memory to guarantee accurate identification and delivery of the transponder's information. In this paper, a novel approach for online testing of RFIDs based on March-BIST techniques for EEPROMs is presented. Online test is achieved by modifying the transponder's operation and access protocol to exploit the waiting time that transponders waste before being accessed.
The solution was described in VHDL, simulated and synthesized to obtain area and timing results. Results show that the area overhead of the solution is less than 0.1 %, while the timing performance allows blocks of up to 32 words to be tested in a single waiting slot.

#### I. INTRODUCTION

Radio Frequency Identification (RFID) devices are the main actors in the Internet of Things paradigm [1], where they are used to face the challenge of labeling physical objects so that they can participate in the digital world. Such RFID devices rely on their memory to accomplish their function, and they range from the simple read-only transponder to the high-end transponder with intelligent cryptological modules.

Read-only transponders represent the low-end, low-cost segment of the range of RFID data carriers. As soon as such a transponder enters the interrogation zone of a reader, a scheme to access its identification number is deployed. The tag's unique identification number is hardwired into the transponder during chip manufacture; therefore, the user cannot alter this serial number, nor any data on the chip.

Writable transponders can be written by the interrogator, and their memory may hold several kilobits. Write and read access to the transponder is often performed in blocks of, usually, 16 bits, as in the EPC Class 1 Generation 2 protocol (C1G2) [2]. Recent developments aim at increasing the RFID data rate to 10 Mbps, which entails the possibility of increasing memory capacity to 1 Mbyte or more [3].

Considering the trend to increase memory capacity in RFIDs, a new RFID architecture and access scheme is proposed that allows concurrent online tests of the transponder memory. A built-in self-test (BIST) controller with appropriate march tests is exploited to check for memory errors.

The rest of this paper is organized as follows. In Section II, the general operation of the transponder and the typical organization of its memory are presented.
Section III describes the regular access scheme of the transponder and the modifications proposed to allow the online test of the memory. Sections IV and V describe the march algorithms used and introduce the BIST architecture. Section VI provides the simulation and synthesis results, while Section VII draws conclusions and outlines future work.

#### II. TRANSPONDER OPERATION

Following a top-down approach, the transponder protocols are defined in three different layers: application, communication and physical.

In the application layer, the transponder receives commands from the interrogator that are valid only when the tag has been singled out. These commands generally consist of writing, reading or locking the tag's internal memory. At this layer, an interrogator may be able to terminate the tag's operation indefinitely by issuing a password-protected command.

The communication layer allows an interrogator to manage tag populations by means of an anti-collision protocol. A great number of tags may be controlled by supervising the tags' data collisions. A regular scheme to avoid collisions has two parts: the interrogator first selects a broad number of tags and subsequently forces them to randomly choose access slots. This access mechanism is employed within the EPC C1G2 protocol and is based on the Dynamic Framed Slotted ALOHA algorithm (DFSA) [4]. To support access from several interrogators, transponders provide session flags that may be asserted or deasserted by interrogators. Session flags allow interrogators to organize groups of tags and force them to enter a particular inventory round.

Transponder memory is organized in agreement with different standards but, commonly, it is divided into banks according to the function of each memory portion, as follows:

- Reserved memory, which includes passwords for accessing special tag functions.
- Product Identification memory, which is a code used to identify the object containing the tag.
- Tag Identifier memory, which is the unique identification number of the tag.
- User memory, which is an application-specific bank.

#### III. TEST-ORIENTED ACCESS SCHEME

The normal operation of an interrogator, when accessing a set of transponders, relies on successive selections of smaller groups of tags and random assignment of access slots. This selection procedure is time-consuming and does not involve reading or writing the memory of the transponders that are in the interrogator queue.

A selection command issued by the interrogator makes a tag or group of tags set or unset their internal flags according to a comparison mask. In this way, an interrogator is able to split a larger group of tags into smaller sets in order to access them easily. Typically, an interrogator starts a new inventory pointing towards a previously selected set of tags. Each transponder matching the interrogator's flag selection must generate an internal random Queue Position Number (QPN), which represents its assigned slot in the DFSA algorithm. The maximum QPN available to the transponders is determined by the interrogator each time an inventory starts.

In order to establish a direct interrogator-transponder link, the interrogator sends a command which is answered only by transponders whose QPN is equal to zero. Meanwhile, the other transponders involved in the inventory decrement their own QPN by one until their turn to answer the interrogator comes. The success of the anti-collision scheme relies on the interrogator selecting an appropriate maximum value for the QPN, one which avoids the same time slot being picked by more than one transponder. Every transponder is accessed individually while the others remain in an Arbitrate state waiting for their access slot.
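The slot countdown described above can be sketched in a few lines of Python. This is a simplified model and not the EPC C1G2 state machine: the names `draw_qpns` and `run_round` are hypothetical, every QPN is assumed to be drawn uniformly from [0, 2<sup>Q</sup>-1], and tags that collide are simply left unread in this round.

```python
import random


def draw_qpns(num_tags, q, rng=random):
    """Each tag draws a random Queue Position Number in [0, 2**q - 1]."""
    return [rng.randrange(2 ** q) for _ in range(num_tags)]


def run_round(qpns, q):
    """Walk through the 2**q slots of one inventory round.

    In every slot the tags whose QPN is zero answer; all tags then
    decrement their QPN by one (a tag that already answered goes
    negative and therefore never answers again).
    Returns one label per slot: 'idle', 'single' or 'collision'.
    """
    remaining = list(qpns)
    outcomes = []
    for _ in range(2 ** q):
        answering = [tag for tag, value in enumerate(remaining) if value == 0]
        if not answering:
            outcomes.append("idle")
        elif len(answering) == 1:
            outcomes.append("single")
        else:
            outcomes.append("collision")
        remaining = [value - 1 for value in remaining]
    return outcomes
```

A larger Q lowers the chance of collisions but lengthens the round, so tags spend more slots waiting; in this sketch every 'idle' or 'collision' slot is time in which no tag is successfully accessed.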
In the Arbitrate state, transponders are fully powered by the interrogator signal but no particular operation is being executed. The proposed concurrent online access scheme exploits this waiting state to test the memory and is based on the anti-collision mechanism of the EPC C1G2 standard.

#### A. Selection Stage

Every transponder works in one of four sessions and has a separate inventoried flag for each. These flags determine whether or not the transponder may respond to the interrogator within an inventory round. A Selected flag (SL) also exists, whose purpose is to ensure greater accuracy when managing large transponder populations. The proposed scheme introduces a Test flag which can be asserted by the interrogator to force transponders into a testing state while being accessed.

An interrogator issues a Select command to select a particular transponder population by asserting or deasserting their flags. This command aims at a particular flag and forces its value, e.g., an SL flag is asserted. Within the proposed scheme, the interrogator chooses the population of tags to be tested by asserting their Test flag with the Select command.

#### B. Testing Stage

Fig. 1 shows the proposed finite state machine (FSM) of the transponder access scheme. Once a transponder is within the range of an interrogator, it reaches the Ready state. The Ready state is a holding state for energized transponders that are not participating in an inventory round. A transponder in the Ready state accepts Select commands from the interrogator that force it to set or unset session flags. The transition from the Ready to the Arbitrate state is done when the interrogator broadcasts a Query command with a session flag as a parameter. Transponders matching the session flag move to Arbitrate; the others stay in Ready and do not participate in the inventory round. Every transponder, t<sub>i</sub>, going to Arbitrate randomly chooses a QPN<sub>i</sub>.
The access scheme allows the interrogator to adaptively choose an adequate QPN interval that takes into account the number of transponders available in the inventory round or the time needed to finish the memory test. Consequently, by issuing commands to the transponders, the interrogator forces them to pass back and forth between Arbitrate and Ready until the QPN interval is appropriate for the current inventory round. The valid values of QPN<sub>i</sub> are defined as $QPN_i \in [0, 2^{Q}-1]$, with Q being chosen by the interrogator for each inventory round.

Regular operation of the interrogator-transponder interaction consists of command-based transitions from the Arbitrate state to the Reply state by transponders whose QPN is equal to zero. The interrogator has full access to the transponder and its memory within the Reply state.

The proposed testing approach includes a new state for testing, MemTest, which sends a signal to a BIST controller to start the test of a given memory block and keeps track of its result. To prevent unwanted behavior, a transponder t<sub>i</sub> in the MemTest state reacts only to the QueryRep command, which forces the decrement of QPN<sub>i</sub>, i.e., a change to the next time slot. An extra 32-bit register is implemented in the transponder to be used as a memory block counter during the test process. The information regarding the memory block to test is sent through data lines towards the BIST.

A transponder in the Ready state which receives a Query command with matching flags, and with the Test flag asserted, should go to the MemTest state and compute its QPN. In this case, the QPN should be selected so as to allow the whole memory to be tested; thus, the QPN value randomly chosen within the regular interval is increased by a fixed offset equal to the number of memory blocks to test. Concurrently, the memory block counter is loaded with the number of the first memory block.
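The QPN offset and the MemTest state can be illustrated with a small state-machine sketch in Python. It is a simplified model under stated assumptions, not the VHDL of this work: the class and method names are hypothetical, exactly one memory block is assumed to be tested per slot (which is what an offset equal to the number of blocks suggests), the BIST itself and the error path to Reply are left out, and a finished test leads to Arbitrate as in the proposed access scheme.

```python
import random
from dataclasses import dataclass


@dataclass
class Transponder:
    test_flag: bool          # Test flag asserted earlier by a Select command
    num_blocks: int          # number of memory blocks to test (assumed >= 1)
    first_block: int = 0     # hypothetical index of the first memory block
    state: str = "Ready"
    qpn: int = 0
    block_counter: int = 0   # models the extra memory block counter register
    blocks_tested: int = 0

    def on_query(self, q, rng=random):
        """Query with matching flags: draw a QPN in [0, 2**q - 1]."""
        base = rng.randrange(2 ** q)
        if self.test_flag:
            # Offset the QPN so that the whole memory can be tested first.
            self.qpn = base + self.num_blocks
            self.block_counter = self.first_block
            self.blocks_tested = 0
            self.state = "MemTest"
        else:
            self.qpn = base
            self.state = "Reply" if self.qpn == 0 else "Arbitrate"

    def on_query_rep(self):
        """QueryRep: move to the next slot (QPN is decremented by one)."""
        if self.state == "MemTest":
            # Assumption: the block tested in this slot is now finished.
            self.block_counter += 1
            self.blocks_tested += 1
            self.qpn -= 1
            if self.blocks_tested == self.num_blocks:
                self.state = "Reply" if self.qpn == 0 else "Arbitrate"
        elif self.state == "Arbitrate":
            self.qpn -= 1
            if self.qpn == 0:
                self.state = "Reply"
```

Because the offset equals the number of blocks, the tag leaves MemTest with exactly the QPN it would have drawn in regular operation, so the anti-collision behaviour of the round is unchanged apart from the added delay.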
When the test is finished, the transponder moves to the Arbitrate state to continue with the regular operation related to accessing its information. In order to inform the interrogator that an error has been detected, the transponder should move to the Reply state and send a temporary random identifier accompanied by an error code. The error code describes the nature of the error as well as the place where it has been detected. If no error is detected, or while in regular operation, the transponder should backscatter only the temporary identifier.

#### IV. MARCH TEST ALGORITHM

Many algorithms have been developed for testing semiconductor memories, of which the most popular and advantageous are the march tests [5]. A march test consists of a sequence of march elements, each composed of read/write operations that have to be performed on every cell of the memory. March tests are able to detect several fault models such as Stuck-at Faults (SAF), Address Faults (AF) and some Coupling Faults (CF). The operations that can be executed on the cells are: write zero (w0), write one (w1), read zero (r0) and read one (r1). The read operation checks whether the value inside the cell is the expected one. The order in which cells are considered can be ascending or descending.

A typical march test used to test RAMs is MATS++, which can also be adapted to test EEPROMs. The MATS++ algorithm is described as follows:

$$\Updownarrow (w0); \uparrow (r0, w1); \downarrow (r1, w0, r0).$$

Word-oriented memories, such as the ones found in an RFID, need a slightly different approach. By extending the 0 or 1 to 16 bits, march algorithms can easily be applied to the word-oriented memories of RFIDs, with a reduction in the coverage of CF.

#### A. Symmetric Transparent Test

Regular march tests erase the contents of the memory. To prevent losing data, a transparent approach is introduced.
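To see why a regular march test is destructive, the basic word-oriented MATS++ sequence can be sketched in Python as follows. This is an illustrative model only: the `WordMemory` class and its stuck-at-0 fault injection are hypothetical, and read values are compared directly with the expected pattern rather than compacted into a signature.

```python
WORD_MASK = 0xFFFF            # 16-bit words, as in the C1G2 block size
ZERO, ONE = 0x0000, 0xFFFF    # the 0 and 1 patterns extended to 16 bits


class WordMemory:
    """List-backed word memory with optional stuck-at-0 bits (hypothetical)."""

    def __init__(self, words, stuck_at_zero=None):
        self.words = list(words)
        # address -> mask of bits that always read as 0
        self.stuck = dict(stuck_at_zero or {})

    def write(self, addr, value):
        self.words[addr] = value & WORD_MASK & ~self.stuck.get(addr, 0)

    def read(self, addr):
        return self.words[addr] & WORD_MASK & ~self.stuck.get(addr, 0)


def mats_pp(mem, size):
    """Run MATS++ and return the sorted list of addresses that failed a read."""
    failed = set()
    for addr in range(size):                 # any order: (w0)
        mem.write(addr, ZERO)
    for addr in range(size):                 # ascending: (r0, w1)
        if mem.read(addr) != ZERO:
            failed.add(addr)
        mem.write(addr, ONE)
    for addr in reversed(range(size)):       # descending: (r1, w0, r0)
        if mem.read(addr) != ONE:
            failed.add(addr)
        mem.write(addr, ZERO)
        if mem.read(addr) != ZERO:
            failed.add(addr)
    return sorted(failed)
```

After a run the memory holds only zeros, whatever it contained before, which is the data loss that motivates the transparent variant.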
The transparent method avoids traditional comparison and, instead, uses a signature analysis mechanism based on a feedback shift register [6]. Well-known march tests can easily be extended to transparent versions by replacing the values 0 and 1, in the read and write operations, by a and a<sup>c</sup>, respectively, where a refers to the original content and a<sup>c</sup> to its complement. Besides this modification, the initialization part of the original march test should be removed.

A symmetric transparent test poses a constraint on the symmetry of the march test, i.e., it should have the same number of read operations for the original and the complemented content, since the signature mechanism computes the signature when fed with the original content and the reciprocal signature when fed with the complemented content. In this way, when the memory is fault free, the signature mechanism is back in its initial state at the end of the test.

Figure 2. Architecture of the memory module with the BIST controller.

#### V. MEMORY BIST IMPLEMENTATION

Figure 2 shows the architecture of the BIST module, composed of six entities: offset generator, memory input multiplexer, output multiplexer, BIST controller, signature analyzer and test pattern generator.

The function of the input multiplexer is to choose which signals are input to the memory according to the BIST mode. The output multiplexer provides constant values, and the ready/busy (RB) signal is set to zero throughout the test period. The offset generator is a module that modifies the incoming address depending on the memory bank selected during regular operation. The BIST controller captures the init signal from the transponder's FSM and starts the test procedure.

The test pattern generator is responsible for generating the test vectors to be applied to the memory. It contains the sequence and directions of the march test in a configuration array.
Its implementation consists of an FSM which takes information from the configuration array and performs its instructions, while the complement of the data read from the memory is used as input when needed.

The signature analyzer is a Multiple Input Shift Register (MISR) with a flow signal that sets its direction of propagation. This implementation avoids the use of two different shift registers for the signature and the reciprocal signature computation. To reduce the probability of error masking, an irreducible polynomial was selected for the MISR; it has the following form:

$$h(x) = 1 + x^7 + x^9 + x^{12} + x^{16}$$

Additional methods to avoid error masking involve hardware solutions, e.g., additional parity checks, Hamming codes or larger MISRs, which are undesirable for the constrained RFID system due to their overhead.

#### VI. SIMULATION AND EVALUATION

The proposed scheme was synthesized and simulated in order to evaluate its performance regarding timing and area overhead. The BIST scheme was described in VHDL and synthesized using a 0.65 μm technology. The BIST used the transponder's internal clock signal, which is obtained from the interrogator carrier frequency and was chosen equal to 1 MHz.

Figure 3. Test time for transparent and basic MATS++.

Figure 4. Test time for transparent and basic March C-.

#### Table I BIST AREA OVERHEAD

| Technology | Memory Area | BIST Area | Overhead |
|------------|---------------------|-------------|----------|
| 0.65 μm | 9.7 mm<sup>2</sup> | 0.0094 mm<sup>2</sup> | 0.1% |

The area overhead was evaluated with respect to the memory, since it is the largest component of the transponder. A memory capacity of 1 kB was assumed, which is, on average, larger than the capacity of most current passive transponders. The area overhead was computed as

$$AO = \frac{\text{BIST Area}}{\text{Memory Area}} \times 100\%.$$

To obtain realistic values for the memory area, the data was extrapolated from [7].
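As a quick check of this formula with the values reported in Table I (assuming the BIST area is expressed in mm<sup>2</sup>, like the memory area):

$$AO = \frac{0.0094\ \text{mm}^2}{9.7\ \text{mm}^2} \times 100\% \approx 0.097\% \approx 0.1\%$$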
The results related to the memory overhead are shown, for this particular case, in Table I.

Passive transponders are equipped with a capacitor charged by the electromagnetic field generated by the interrogator. Continuous read and write operations during the test cause high current consumption, so the charge in the capacitor can fall rapidly. As an example, the circuit presented in [8] contains a 250 pF capacitor which supplies energy during short gaps in the received signal for about 100 μs. In such a time it is possible to perform some read operations, but writing could be interrupted. Thus, the test of a single memory block should be as short as possible to decrease the risk of that situation. As a safe threshold, the time of the longest operation specified by the EPC C1G2 standard, i.e., 20 ms, was assumed as the limit for the testing operation of a memory block in the RFID.

To evaluate the timing performance of the circuit, two march tests were executed: the MATS++ algorithm, described before, and the March C- algorithm. The March C- algorithm has a higher complexity than MATS++ and is described in the following in its transparent version:

$\uparrow$ (ra<sup>c</sup>); $\uparrow$ (ra,wa<sup>c</sup>); $\uparrow$ (ra<sup>c</sup>,wa); $\downarrow$ (ra,wa<sup>c</sup>); $\downarrow$ (ra<sup>c</sup>,wa); $\downarrow$ (ra).

Figures 3 and 4 present the timing results of the simulation for the MATS++ and March C- algorithms, respectively. The simulations were performed for varying testing block sizes. The timing of the basic approach is also presented for comparison with the transparent approach, and the 20 ms threshold is highlighted for convenience.

As can be seen in the simulation results, the absence of the initialization stage in the transparent approach provides an interesting reduction of test time. On average, the time is reduced by 32 % for MATS++ and by 20.5 % for the March C- algorithm.
These simulations show the maximum block size which can be tested within a single slot according to the algorithm used. For the MATS++ algorithm, the maximum testing block size is 32 words, while for March C- the maximum is 16 words.

#### VII. CONCLUSIONS AND FUTURE WORK

A novel access scheme supporting online test for RFIDs was presented. The scheme takes advantage of the idle state of transponders, while they wait to be accessed by the interrogator, to test their internal memory. The transponder finite state machine describing the access scheme was presented and the architecture of the transparent BIST circuit was described.

Synthesis and simulation results show the feasibility of the proposed scheme. Area results show the negligible overhead of the BIST compared with the memory size, i.e., about 0.1 %. Timing results give the maximum size of the blocks that can be tested within one slot of the access scheme for two different march algorithms.

Future work will include other testing approaches which provide a direct testing command to the interrogator, and a larger list of supported march algorithms.

#### **ACKNOWLEDGMENTS**

The authors acknowledge support from the "Nano-materials and -technologies for intelligent monitoring of safety, quality and traceability in confectionery products" (NAMATECH) project of Regione Piemonte, Italy.

#### REFERENCES

- [1] E. Welbourne, L. Battle, G. Cole, K. Gould, K. Rector, S. Raymer, M. Balazinska, and G. Borriello, "Building the internet of things using rfid: The rfid ecosystem experience," Internet Computing, IEEE, vol. 13, no. 3, pp. 48–55, May-June 2009.
- [2] EPCGlobal, EPC radio-frequency identity protocols Class-1 Generation-2 UHF RFID air interface Version 1.2.0, Oct. 2008.
- [3] J. McDonnell, J. Waters, H. Balinsky, R. Castle, F. Dickin, W. W. Loh, and K. Shepherd, "Memory spot: A labeling technology," Pervasive Computing, IEEE, vol. 9, no. 2, pp. 11–17, April-June 2010.
- [4] T. Cheng and L. Jin, "Analysis and simulation of rfid anti-collision algorithms," in Advanced Communication Technology, The 9th International Conference on, vol. 1, 2007, pp. 697–701.
- [5] M. L. Bushnell and V. D. Agrawal, Essentials of Electronic Testing for Digital, Memory and Mixed-Signal VLSI Circuits. Springer, 2000.
- [6] S. Hellebrand, H.-J. Wunderlich, and V. Yarmolik, "Symmetric transparent bist for rams," Design, Automation and Test in Europe Conference and Exhibition, vol. 0, p. 702, 1999.
- [7] S. Banerjee and D. R. Chowdhury, "Built-in self-test for flash memory embedded in soc," in Third IEEE International Workshop on Electronic Design, Test and Applications, DELTA 2006, January 2006.
- [8] U. Karthaus and M. Fischer, "Fully integrated passive uhf rfid transponder ic with 16.7-μW minimum rf input power," Solid-State Circuits, IEEE Journal of, vol. 38, no. 10, pp. 1602–1608, 2003.

# Design and Implementation of On-line Interactive Data Acquisition and Control System for Real Time Embedded Applications using Beagle Board

#### Sanjeev V Patil<sup>#</sup> and Sunil Mathad<sup>\*</sup>

<sup>#</sup>Digital Electronics, Dept. of ECE, SDM College of Engineering and Technology, Dharwad, Karnataka, India

E-mail: <sup>1</sup>Sanjeev.vpatil@gmail.com

Abstract- The design of an on-line Interactive Data Acquisition and Control System (IDACS) is a challenging part of any measurement and automation application. This system uses the standard Internet Protocol Suite (TCP/IP), which serves billions of users worldwide. The system runs the Real Time Linux operating system (RTLinux) ported to the Beagle Board, which makes it more real-time capable and lets it handle various processes through multitasking and reliable scheduling mechanisms. This paper presents the design and development of an on-line Interactive Data Acquisition and Control System (IDACS) using a Beagle Board based embedded web server.
The web server application is ported to the Beagle Board using embedded C and JSP (Java). Web pages are written in Hypertext Markup Language (HTML).

**Keywords**- Beagle Board, Apache Tomcat Web Server, Ultrasonic Sensor, RTLinux Operating System, Interactive Data Acquisition System (IDACS), Camera etc.

#### I. INTRODUCTION

With the development of embedded technology, the use of embedded data acquisition and remote monitoring in production data monitoring applications has become a new trend. On-line interactive data acquisition and control systems play a major role in the rapid development of the field of measurement and control. Such systems have been designed with a large amount of electrical, electronic and high-voltage equipment, which makes them complicated and unreliable. This paper proposes a new system that contains a built-in Data Acquisition and Control System (DACS) with a built-in ADC, along with on-line interaction. This makes the system more reliable and less complicated. It is an application in demand in many consumer industries.

This system replaces the various complex cables used for acquisition and uses the Beagle Board for data acquisition and digital diagnosis. A single worker can control the machine and collect data from on-going work at a single workstation, which makes controlling the machine easier.

The simplest design of a data acquisition system is detailed in [1], which is based on the Linux operating system [2]. The design of a flexible, reliable data acquisition architecture was approached in [3], where the software resources are stored in local memory to reduce resource usage and increase the system's efficiency. This system serves clients dynamically through server responses, and it maintains a separate database with the DAC controller, where shared memory and Internet protocols (TCP/IP) are used for data handling and processing for remote users.
This system can also be extended with a global positioning system (GPS) and an environmental monitoring system. It reduces system complexity and is effective for all kinds of real-time applications.

Every real-time embedded system should be run by a real-time operating system. Porting an RTOS even to a small 8-bit microcontroller is developed in [4]. In this paper, the Real Time Linux operating system is ported to the Beagle Board. Generally, all Beagle Board versions can run higher-end RTOSes. RTLinux is very effective for many embedded applications, as discussed in [5] and [6]. Here the embedded web server application is developed and ported to the Beagle Board with this setup. This single Beagle Board acts as data acquisition unit, control unit, embedded web server and self-diagnosis unit. All processes are allocated the essential resources and are associated with reliable scheduling algorithms and the Internet protocols followed by the Beagle Board. This miniaturized setup reduces the complexity and size of the system and gives good performance as well.

Fig. 1 System overview

Fig. 1 shows the overview of the Interactive Data Acquisition and Control System. Every client can access the industry directly without any interaction with additional servers and modules. The system contains a single Beagle Board which runs the Real Time Linux RTOS. The Beagle Board is the heart of this work. It handles two modes at the same time: DAC and web server. In DAC mode, the processor can measure signals coming from various external sources and applications, and it can control the industrial machinery through the control instructions sent by the client via the embedded web server. In addition, the client can watch live video to see what is happening at the industry side and then take appropriate action. This system uses RTLinux, so it can handle many interrupts efficiently, because RTLinux has a pre-emptive kernel with the required privilege levels.
Similarly, in web server mode the processor handles client requests and responds to the particular client by sending web pages; the client can interact with the industry by giving instructions in the web page in its own web browser. This setup is suitable for communication with other nodes via Ethernet and higher-end ports. Ethernet programming and execution is easy and adaptable to various applications. The embedded web pages are written in HTML.

#### II. SYSTEM DESIGN

Hardware, software requirements, and porting are the important steps in the whole system design.

#### A. Hardware Design of the System

#### 1) IDACS Design

Interactive data acquisition and control system design is the major part of the hardware. The Beagle Board is the central core of this system. The general hardware structure of the IDACS is shown in Fig. 2. The on-line intelligent data acquisition and control system based on the embedded Beagle Board platform has high universality; each acquisition and control device is equipped with acquisition/control channels that are isolated from each other. The measured data are stored in external memory (Micro-SD card), which acts as a database in web server mode. The Beagle Board directly supports Ethernet service and RS232 communication. Hence the data can be stored and controlled by other PCs or networks via RS232 and Ethernet. The Beagle Board has internal I2C, I2S, and SPI modules, so it is able to communicate with other peripherals at high speed.

Fig. 2 General Structure of the IDACS (BB Core)

An I2S (Integrated Interchip Sound) interface is provided on board; it is an electrical serial bus interface standard used for connecting digital audio devices together. I2C is a wired communication protocol for communicating with other processors or peripherals through a two-wire link; it is used to connect low-speed peripherals to the embedded system.
This system uses LEDs to display the status of the machine, and an LCD can also be interfaced, which makes debugging and modification of parameters easy. The embedded Ethernet interface makes remote data exchange between applications very easy. Ethernet is one of the most popular packet-switched LAN technologies, with available bandwidths such as 10 Mbps, 100 Mbps, and 1 Gbps; the bus has a maximum length of 2500 m (500 m segments with 4 repeaters).

#### 2) Camera and Ultrasonic Sensor

The system uses a camera that keeps sending live updates to the client. The client has direct access to on-line records and, in the meantime, can watch the live video from a remote place and then go ahead and control the machine. An ultrasonic sensor can also be connected: if anything disturbs the sensor (any obstacle), it sends an updated value to the remote client, who can take appropriate action based on the retrieved sensor value. If a temperature sensor is interfaced as well, the room temperature can also be monitored.

#### 3) Beagle Board-xM

This system uses a Beagle Board running the Real Time Linux operating system. The BeagleBoard-xM delivers extra ARM<sup>®</sup> Cortex<sup>TM</sup>-A8 MHz, now at 1 GHz, and extra memory with 512 MB of low-power DDR RAM. Its open hardware design improves upon laptop-like performance and expandability, and direct connectivity is supported by the on-board four-port hub with 10/100 Ethernet, while maintaining a tiny 3.25" × 3.25" footprint. The DM3730 processor is the heart of the Beagle Board-xM.
#### 4) RS232 Communication

The RS232 port (usually a DB-9 connector) is very common and available on most devices and computers. To allow compatibility among data communication equipment made by various manufacturers, an interface standard called RS232 was set by the Electronics Industries Association (EIA) in 1960 [7]. Today RS232 is the most widely used serial I/O interfacing standard. The serial port interface of this system is single ended (it connects only two devices with each other), and the data rate is less than 20 kbps.

#### B. Software Design of the System

#### 1) Real Time Linux

RTLinux is a hard real-time RTOS microkernel that runs the entire Linux operating system as a fully preemptive process. It was developed by Victor Yodaiken. RTLinux provides the capability of running special real-time tasks and interrupt handlers on the same machine as standard Linux [8]. RTCore is a POSIX 1003.13 PE51 type real-time kernel, something that looks like a multithreaded POSIX process with its own scheduler [9]. RTCore can run a secondary operating system internally as a thread. This is a peculiar model (a UNIX process with a UNIX operating system as a thread), but it provides a useful avenue to modularity. RTLinux is RTCore with Linux as the secondary kernel. Real-time applications run as real-time threads and signal handlers, either within the address space of RTCore or within the address spaces of processes belonging to the secondary kernel. Real-time threads are scheduled by the RTCore scheduler without reference to the process scheduler in the secondary operating system. The secondary operating system is the idle thread for the real-time system. The virtual machine virtualizes the interrupt controller so that the secondary kernel can preserve internal synchronization without interfering with real-time processing.

Fig. 3 RTLinux Run Time Model

Unlike Linux, RTLinux provides hard real-time capability.
It has a hybrid kernel architecture in which a small real-time kernel coexists with the Linux kernel, which runs as the lowest-priority task. This combination allows RTLinux to provide highly optimized, time-shared services in parallel with real-time, predictable, low-latency execution. Besides this unique feature, RTLinux is freely available to the public. As more development tools are geared towards RTLinux, it is expected to become a dominant player in the embedded market. RTLinux is a typical dual-kernel system: one is the Linux kernel, which provides the various features of a general-purpose OS, and the other is the RTLinux kernel, which supports hard real-time capability. Fig. 3 illustrates the RTLinux architecture.

#### 2) System Design Flowchart

Fig. 4 Client side Program Flow

The client enters the server by entering the appropriate URL (with the known IP address of the board). Once the server responds, it asks for a username and password. After giving a valid username and password, the client can access the industry by reading sensor values or by video access, and can take appropriate action on the machine.

Fig. 5 Server side Program flow

Apache is configured by exporting the appropriate path to Apache Tomcat, and then the server is started (the Beagle Board now acts as the web server). Once the server has started, it waits for client requests.

#### 3) Apache Tomcat (Web Server)

Apache Tomcat (or simply Tomcat, formerly also called Jakarta Tomcat) is an open source web server and servlet container developed by the Apache Software Foundation (ASF) [10]. Tomcat implements the Java Servlet and JavaServer Pages (JSP) specifications from Oracle Corporation, and provides a "pure Java" HTTP web server environment for Java code to run in. It provides a Java virtual machine and the associated elements that give a complete Java runtime environment, and it also provides web server software to make the environment accessible on the Web.

Fig. 6 Structure of Apache Tomcat.
The above figure shows the general structure of Tomcat, which mainly consists of the Servlet/JSP container, the HTTP connector, and the JSP engine. Tomcat runs as a Windows service or a Linux or Unix daemon, awaiting connections (by default) on port 8080 [11]. A single instance of Tomcat can provide several services, though this is unusual. Each Tomcat service has at least one (and possibly more) connectors, and an engine such as Catalina.

#### 4) TCP/IP (Internet Protocols)

Lightweight IP (LwIP) is a widely used open source TCP/IP stack designed for embedded systems. LwIP aims at improvements in processing speed and memory usage. Most TCP/IP implementations keep a strict division between the application layer and the lower protocol layers [12]. As in many other TCP/IP implementations, the layered protocol design was used as a guide for the LwIP design and implementation [9]. The stack is implemented so as to improve performance in terms of both processing speed and memory usage. Hence RTL-LwIP is more suitable for embedded systems.

#### III. RESULTS AND DISCUSSIONS

Fig. 7 Booting Linux on the Beagle Board-xM

Fig. 8 Apache Tomcat running at the Server side

Figure 7 shows Linux booting on the Beagle Board. Once the board has booted, Apache is configured as shown in Figure 8. The remote client can enter the server by giving an authorised username and password; after entering the Apache server, the user can access the industry or control the machine.

Fig. 9 Web Page requested by Client

The above figure shows a simple web page designed in HTML. It is requested from the server by the client. The request is carried over the internet, and the server responds to the client request with the web page.
Now the client can know the status of the industrial machinery and can control the machines via its own browser from a remote location; along with this, the client has video access in the same browser. This is shown in Figs. 10, 11 and 12.

Fig. 10 Host System and the Target Board Hardware

Fig. 11 Video accessible by the remote client

Fig. 12 Web page sent by Target board to the remote Client

Hence, the results show that the client can access the whole industry from any remote place via its own local browser. In the industry, the single Beagle Board acts as data acquisition and control system and as web server, so the system is compact and less complex. This system replaces the traditional system for remote access and control with an embedded web server running on the Real Time Linux operating system.

#### IV. MERIT OF THE SYSTEM

#### A) Existing System

The single-chip data acquisition system (DAS) method used in instrumentation and process control applications is limited in processing capacity and memory, and also suffers from poor real-time behaviour and reliability. A general web server requires more resources and a large amount of memory. The existing system can only measure remote signals and cannot be used to control the process; it has limited capacity and lacks real-time operation. It uses an ARM9 embedded processor, which has very limited memory and low processing speed. To overcome these problems we adopted the proposed system.

#### B) Proposed System

The limited processing capacity, limited memory, and poor real-time behaviour and reliability of the DAS are overcome by substituting the embedded Beagle Board for the single-chip method to realize the interactive data acquisition and control system (IDACS). The IDACS can measure remote signals and control remote devices through reliable protocols and a communication network (TCP/IP).
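The client interaction described above (log in with a username and password, then read a measured value or send a control instruction) can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's Tomcat implementation: the credential table, the sensor and machine names, and the `handle_request` function are all hypothetical, and the "plant" is an in-memory stand-in for the hardware attached to the Beagle Board.

```python
from dataclasses import dataclass, field

# Hypothetical credentials, for illustration only.
USERS = {"operator": "secret"}


@dataclass
class Plant:
    """In-memory stand-in for the sensors and machines attached to the board."""
    sensors: dict = field(default_factory=lambda: {"ultrasonic_cm": 42.0})
    machines: dict = field(default_factory=lambda: {"conveyor": "off"})


def handle_request(plant, user, password, action, target, value=None):
    """Return (status_code, body) for one client request."""
    # Step 1: the server asks for a valid username and password.
    if user not in USERS or USERS[user] != password:
        return 401, "invalid username or password"
    # Step 2a: data acquisition, the client reads a measured value.
    if action == "read":
        if target not in plant.sensors:
            return 404, f"unknown sensor {target}"
        return 200, str(plant.sensors[target])
    # Step 2b: control, the client switches a machine on or off.
    if action == "control":
        if target not in plant.machines or value not in ("on", "off"):
            return 400, "bad control request"
        plant.machines[target] = value
        return 200, f"{target} set to {value}"
    return 400, "unknown action"
```

In a real deployment the same checks would sit behind the web server's request handlers; the sketch only shows the order of decisions.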
Advancement in technology is very well reflected and supported by changes in measurement and control instrumentation. This system uses a camera and an ultrasonic sensor, which help in monitoring the machine; meanwhile the client can also watch the live video on-line and, while watching, still control and monitor the machines. The system also uses the RTLinux multitasking operating system (fully pre-emptive) to measure and control the whole process. The embedded web server mode requires fewer resources and offers high reliability, security, controllability, and portability.

#### V. CONCLUSIONS

Advancement in technology is very well reflected and supported by changes in measurement and control instrumentation. With the rapid development of industrial process control and the wide range of applications of networked, intelligent, digital distributed control systems, higher demands must be placed on the data accuracy and reliability of the control system. This embedded Beagle Board system can meet the strict requirements of a data acquisition and control system, such as reliability, functionality, size, cost, power consumption, and remote access. The system operates in DACS mode to acquire signals and control devices remotely. The embedded web server (Apache Tomcat) mode is used to share the data with clients on-line. Both modes are carried out efficiently by the real-time multitasking operating system (RTLinux). This system can be widely applied to petroleum, metallurgy, chemical, electric power, transportation, electronic and electrical industries, automobiles, home security, and so on. In future implementations a temperature sensor can also be interfaced, and the same system can also be developed with GPRS.

#### REFERENCES

- [1] K. Jacker and J. McKinney, "TkDAS - A data acquisition system using RTLinux, COMEDI, and Tcl/Tk," in Proc. Third Real Time Linux Workshop, 2001. [Online].
Available: The Real Time Linux.
- [2] Jerry Epplin, "Linux as an Embedded Operating System", Embedded Systems Programming.
- [3] Clyde C. W. Robson, Samuel Silverstein, and Christian Bohm, "An Operation-Server Based Data Acquisition System Architecture," *IEEE Trans. Nuclear Science*, Vol. 55, No. 1, February 2008.
- [4] Tran Nguyen, Bao Anh, Su-Lim Tan, "Real-Time Operating Systems for small microcontrollers", IEEE Comp society, pp. 31-45, September 2009.
- [5] Yodaiken, Victor (1996). "Cheap Operating systems Research". Published in the Proceedings of the First Conference on Freely Redistributable Systems, Cambridge MA, 1996.
- [6] Barabanov, Michael (1996). "A Linux Based Real-Time Operating System".
- [7] Electronics Industries Association, "EIA Standard RS-232-C Interface Between Data Terminal Equipment and Data Communication Equipment Employing Serial Data Interchange", August 1969, reprinted in Telebyte Technology *Data Communication Library*, Greenlawn NY, 1985, no ISBN.
- [8] Yodaiken, Victor (1999). RTLinux, real-time Linux, FMSLab, Socorro, New Mexico, USA.
- [9] IEEE Std 1003.1b-1993, IEEE Standard for Information Technology. Portable Operating System Interface (POSIX) part 1: System application programming interface, amendment 1: Real-time extensions. Technical report, IEEE, New York, 1994.
- [10] "The Apache Software Foundation Announces Apache TomEE Certified as Java EE 6 Web Profile Compatible". Market Watch. 4 Oct. 2011.
- [11] Kaushal Sanghai, "Building Complex VDK/LwIP Application Using Blackfin Processors", Analog Devices Inc., September 2008.
- [12] W. Richard Stevens, TCP/IP Illustrated, Volume 1: The Protocols. Addison-Wesley. ISBN 0-20-163346-9. This book gives many useful low-level details about TCP/IP, UDP and ICMP.
- [13] BeagleBoard - http://www.beagleboard.org
- [14] RTLinux - http://www.rtlinux.org
- [15] Apache Tomcat - http://tomcat.apache.org
- [16] www.opensourcelinux.org

# **Intelligent Car Parking System**

#### S.
Avinash, Sneha Mittra, Sudipta Nayan Gogoi & C. Suresh

R V College of Engineering, R V Vidyaniketan Post Office RVCE, Mysore Road, Bengaluru, Karnataka-560059

Abstract - Due to the proliferation in the number of vehicles on the road, traffic problems are bound to exist. This is because the current transportation infrastructure and car parking facilities are unable to cope with the influx of vehicles on the road. In India, the situation is made worse by the fact that the roads are significantly narrower than in the West. Therefore, problems such as traffic congestion and insufficient parking space inevitably crop up. In this paper we describe an Intelligent Car Parking System, which identifies the available spaces for parking using sensors, parks the car in an identified empty space, and gets the car back from its parked space without the help of any human personnel. A Human Machine Interface (HMI) is used to enter a unique identification number at the entry of a car, which helps in finding the space where the car is parked at exit. An Indracontrol L10 PLC controls the actions of the parking system. The PLC is used to sequence the placing and fetching of the car via DC motors. We have implemented a prototype of the system. The system evaluation demonstrates the effectiveness of our design and implementation of the car parking system.

Keywords - Proximity sensor, Infrared sensor, DC motor, Programmable Logic Controller.

#### I. INTRODUCTION

The Intelligent Car Parking System is designed to provide a user-friendly, intelligent, and automated car parking system that greatly reduces manpower, the land area needed for parking, the fuel consumption of the vehicle, and carbon emissions. A few existing studies have focused on applications of car parking systems using sensor technologies. A few systems adopt cameras to collect information in the car parking field.
However, a video sensor has two disadvantages: it is energetically expensive, and it can generate a very large amount of data, which can be very difficult to transmit in a wireless network. These greatly limit the application of video sensors. The system in [1] adopts wireless sensors in a car park field; each parking lot is equipped with one sensor node, which detects and monitors the occupation of the parking lot. The status of the parking field detected by the sensor nodes is reported periodically to a database via the deployed wireless sensor network and its gateway. The database can be accessed by the upper-layer management system to perform various management functions, such as finding vacant parking lots, auto-toll, security management, and statistical reporting. The system in [2] adopts a vision-based system that is able to detect and indicate the available parking spaces in a car park. The methods used to detect available car park spaces were based on coordinates that indicate the regions of interest and on a car classifier. That work indicated that a vision-based car park management system would be able to detect and indicate the available car park spaces.

The above ideas are aimed only at finding vacant car parking spaces. In contrast, the prototype Intelligent Car Parking System proposed here identifies the available spaces for parking using sensors, parks the car in an identified empty space, and gets the car back from its parked space without the help of any human personnel. In this system it is required to sense the empty space and the respective floor where the car can be parked. This can be done with the help of sensors; the various sensors available are the temperature sensor, infrared sensor, UV sensor, touch sensor, proximity sensor, bio sensor, image sensor, and acceleration sensor.
For finding the floor on which the empty space is present, infrared sensors are used, as they emit and/or detect infrared radiation to sense a particular phase in the environment. They are easy to interface and readily available in the market. For sensing the empty space on a particular floor, proximity sensors are used, as they detect the presence of nearby objects without any point of contact.

The system also requires a motor for the movement of the mechanical arm that places and fetches the car. The motors available are DC motors and stepper motors. A DC motor was chosen because DC motors have several advantages over stepper motors: in terms of speed, weight, size, and cost, DC motors are preferred. The direction of rotation of the motor can be controlled, and the rotation can be encoded, i.e. the number of turns made by the motor can be tracked. Stepper motors, in contrast, jerk when they rotate. For this application DC motors are therefore considered the better choice. Two DC motors are used, one for the vertical movement of the mechanical arm and the other for its clockwise and counter-clockwise rotation. The two-way rotation of the DC motor is achieved by using the L293D dual motor driver circuit.

The system also requires a controller to control the functions of the whole system. The various types of controllers are the microcontroller, PIC microcontroller, microprocessor, and Programmable Logic Controller. The Programmable Logic Controller (PLC) is chosen because technicians find it easier to deal with ladder logic than with C, assembly, or other programming languages. Ladder logic is a very simple way of expressing actions such as turning motors on and off based on a set of inputs. In a PLC the logic is obvious and easily modifiable. PLCs are also more suitable for industrial applications, as they can withstand dust and shocks.
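The two-way rotation through the L293D mentioned above comes down to a small truth table on the driver's enable and input pins. The sketch below illustrates that table for one motor channel; the function name and the returned labels are assumptions made for this example, and which physical direction counts as "clockwise" depends on how the motor is wired.

```python
def l293d_motor_state(enable, in1, in2):
    """Map the logic levels on one L293D channel pair to the motor behaviour.

    enable, in1 and in2 are booleans (True = logic high).
    """
    if not enable:
        return "coast"              # outputs disabled, motor free-runs to a stop
    if in1 and not in2:
        return "clockwise"          # current flows one way through the motor
    if in2 and not in1:
        return "counter-clockwise"  # polarity across the motor is reversed
    return "brake"                  # both inputs equal, both terminals at the same level


# Hypothetical use for the arm-rotation motor: reversing simply swaps the inputs.
ROTATE_TO_SPACE = (True, True, False)
ROTATE_BACK = (True, False, True)
```

A PLC output pair driving the two input pins would realise the same table in ladder logic, with one rung per direction.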
Based on this literature survey, the idea adopted for the project entitled "Intelligent Car Parking System" was to use infrared sensors instead of wireless sensors, as they are low in cost and readily available, together with proximity sensors, DC motors instead of stepper motors, and a Programmable Logic Controller.

#### II. REQUIREMENT ANALYSIS

In this section, we describe the requirements for designing an intelligent car parking system. The conventional requirements of a car parking system can be easily satisfied; in the following we list some important ones.

From the business aspect, the common goal of all car parks is to attract more drivers to use their facilities. Thus, their basic facilities are required to fulfill the following conventional requirements:

- (a) The location of the car park should be easy to find from the street.
- (b) The entrance of the car park should be easy to discover.
- (c) The number of parking lots should be abundant.
- (d) A parking lot should have a space large enough to park a car in.
- (e) It should be easy to exit and re-enter on foot.

However, an intelligent car parking system should provide more convenience and automation to both the business and the customers. It should also satisfy the following requirements:

- (a) It should have a compact structure, a low cost of ownership, and a simple, user-friendly, safe retrieval process that prevents damage to vehicles.
- (b) It should provide environmental protection by reducing vehicular emissions, and general energy savings.
- (c) It should provide safety of the vehicle by avoiding the damage or dents that occur while parking through narrow driveways.
- (d) It should easily accommodate all types of cars and SUVs.

In accordance with the above requirements, an intelligent car parking system should minimize human operation and supervision, so as to reduce the manpower and the losses from human mistakes.
Also, the car park system is required to provide higher accuracy, robustness, and flexibility in operation, more convenience to customers, and a lower cost of operating and maintaining the overall system.

#### III. DESIGN AND IMPLEMENTATION

In this section we describe the design of the Intelligent Car Parking System, covering the sensors and relay circuits and then the motor control unit.

#### 1. Sensors and Relay Circuits

The electronic components used in the structure enable complete control and management of the car parking system. Various sensors are used to find each empty space along with the floor on which the empty space is present.

#### 1.1 Proximity Sensor

A proximity sensor is a sensor able to detect the presence of nearby objects without any physical contact. It often emits an electromagnetic field or a beam of electromagnetic radiation (infrared, for instance) and looks for changes in the field or return signal. The object being sensed is often referred to as the proximity sensor's target [3]. The system consists of two floors for parking cars; hence eight proximity sensors are used, four on each floor. The voltage range of each proximity sensor is 5 to 12 V, and the sensing range varies from 10 mm to 50 mm. The proximity sensors are used for detecting the presence or absence of a car in each parking space of each floor. Each sensor continuously emits infrared rays and checks for an empty space. Its output acts as an input to the Programmable Logic Controller, which in turn helps in the placing and fetching of cars. Figure 1.1 shows the proximity sensor.

Fig. 1.1: Proximity sensor

#### 1.2 Infrared Transmitter

An infrared LED is used for the purpose of detection. The system consists of twenty-one infrared transmitters [4]. Thin sheet metal strips are placed behind each car park space, and one behind the place where the driver is asked to leave the car, i.e. on the ground floor.
Behind each car park space and in front of each floor, two infrared transmitters are placed at a vertical distance of 4 cm. These are shorted so that they start emitting at the same time when a free space is available. A single infrared transmitter is placed on the ground floor; it emits continuously until the receiver receives its signal. This is done so that the mechanical arm can detect the presence of a car that has arrived for parking. Two infrared transmitters are used for each floor. While fetching a car from its parked space, the first infrared transmitter (the one below the floor) is turned on, whereas while placing a car, the second infrared transmitter (the one above the floor) is turned on. When placing a car, an empty space is searched for by the proximity sensors, and then the IR transmitter of that space and the floor transmitter are turned on; together they guide the arm movement and help in placing the car. Figure 1.2 shows the infrared transmitter.

Fig. 1.2: Infrared Transmitters

#### 1.3 Relay Panel Board

A relay is a kind of switch that is controlled by an electric current. A relay panel board is used instead of adding a mains switching relay. It is a commercially manufactured circuit board fitted with a relay, an LED indicator, a back-EMF preventing diode, and easy-to-use screw-in terminal connections [5]. In the system, three 12 V, 250 V/10 A AC eight-relay boards are used to switch the low-voltage IR sensor outputs to the 24 V DC used by the Programmable Logic Controller. Figure 1.3 shows the relay panel board.

Fig. 1.3: Relay Panel Board

#### 1.4 Voltage Divider Circuit

A voltage divider (also known as a potential divider) is a linear circuit that produces an output voltage (Vout) that is a fraction of its input voltage (Vin). Voltage division refers to the partitioning of a voltage among the components of the divider [6].

Fig.
1.4: Voltage Divider Circuit

In this system, a voltage divider circuit is required for converting the 24 V output voltage of the PLC to a voltage of nearly 5 V, which is used by the sensor. The output of the voltage divider circuit was found by simulating it in the PSPICE simulator, as shown in Figure 1.4.

#### 2. Motor Control Unit

The electrical components primarily used are standard DC motors, which accomplish the motion of the mechanical arm. The DC motors used in the system are 12 V DC motors. They are controlled using the Indracontrol L10 PLC.

#### 2.1 Main Controller Board

The Indracontrol L10 Programmable Logic Controller (PLC) serves as the "brain" of the Intelligent Car Parking System, controlling and managing all the functions of the car parking system. The Indracontrol L10 used in the system operates on a built-in 24 V DC supply. A Human Machine Interface is used in this system for entering a unique identification number at the entry and exit of a car. The communication port used in the system is Ethernet. A SanDisk firmware flash card is used to store the program [7].

#### 2.2 DC Motor

A DC motor is an electric motor that runs on direct current (DC) electricity [9]. Two standard DC motors are used in this system. They move the mechanical arm in the clockwise, counter-clockwise, and vertical directions for the placing and fetching of cars. Using the L293D dual H-bridge driver, a motor is made to rotate in both the clockwise and counter-clockwise directions.

#### IV. PROGRAMMING AND TESTING

Programming of the system enables its working. Using ladder diagram logic, the instructions that collectively define the movement of its parts are specified and compiled [8].
This logic is subsequently downloaded onto the Programmable Logic Controller, which performs the required motor functions using the DC motors and the placing and fetching of cars using the mechanical arm. The essential synchronization of the parts is also achieved by the Programmable Logic Controller. Programming of the system is required for the following functions:

- To find an empty space from the inputs of the proximity sensors.
- To turn on the transmitters of the corresponding floor and of the ground floor.
- To make the DC motor rotate in both clockwise and anticlockwise directions.

#### **HMI Working**

#### Flowchart 1.1

Flowchart 1.1 explains the operation of the Human Machine Interface, which asks the user to press F1 for entry and F2 for exit. If entry is selected, the user is asked to enter the given code; in case of exit, the user is asked to enter the unique code.

#### Placing of a car

#### Flowchart 1.2

Flowchart 1.2 explains the operation while placing a car. If the empty space is on the first floor, the first floor's upper transmitter is switched on and the mechanical arm carrying the car is raised until it reaches the first floor (detected by the arm receiver). Then the mechanical arm is rotated in the anticlockwise direction using the second motor, while the first motor is switched off. When the arm reaches the respective space, the arm is lowered by rotating the first motor in the opposite direction and the car is finally placed; the mechanical arm is then brought back to its initial position by controlling both motors sequentially.

Flowchart 1.3 explains the operation while fetching a car. If the car to be fetched is on the first floor, the first floor's lower transmitter is switched on and the mechanical arm is raised until it reaches the first floor (detected by the arm receiver). Then the mechanical arm is rotated in the anticlockwise direction using the second motor, while the first motor is switched off.
When the arm reaches the respective space, the arm is raised by rotating the first motor in the opposite direction, and finally the car is brought back to its initial position by controlling both motors sequentially.

#### V. RESULTS

To operate the Intelligent Car Parking System, the inputs were given from the Human Machine Interface, and the following results were obtained. The following figures show the sequence of operation while placing a car.

Fig. 1.6: Mechanical arm lifting a car from ground floor

Fig. 1.7: Mechanical arm receiver sensing the rays from the first floor transmitter

Fig. 1.8: Mechanical arm rotating in anticlockwise direction for placing car

Fig. 1.9: Mechanical arm receiver sensing the rays of the empty space transmitter and placing it

Fig. 2.0: Mechanical arm moving downwards towards the ground floor

Fig. 2.1: Mechanical arm reaches the ground floor

After selecting entry in the HMI and then entering the unique code, it was seen that the mechanical arm lifts the car from the ground floor, as shown in Figure 1.6. Then the mechanical arm starts moving upwards with the help of the screw jack mechanism and the first DC motor, and as soon as the mechanical arm receiver receives the rays from the first floor's transmitter, the arm rotates in the anticlockwise direction. Figure 1.7 shows the mechanical arm sensing the first floor, and Figure 1.8 shows the rotation of the mechanical arm for placing the car in an empty space. Then the mechanical arm receiver senses the rays from the space transmitter and the arm places the car, as shown in Figure 1.9. After placing the car, the mechanical arm rotates in the clockwise direction and then moves down, as shown in Figure 2.0. Finally, the arm gets back to the ground floor and the motor stops, as shown in Figure 2.1. The fetching of a car takes place similarly, in the opposite manner.

#### VI.
CONCLUSION AND FUTURE SCOPE

The mechanical model of the Intelligent Car Parking System achieves the proper placing and fetching of cars. The model is rigid and is capable of housing and shielding the DC motors, proximity sensors, IR transmitters, and an IR receiver. Three separate supplies were used in this system: 24 V, 12 V, and 5 V. The built-in 24 V supply powers the L10 Programmable Logic Controller, which acts as the brain of the system and manages and controls all its functions. The DC motors, which are used for the clockwise and counter-clockwise rotation and the vertical up and down movement of the mechanical arm, are driven by a 12 V supply from a 12 V adapter. The proximity sensors, which are used to find an empty space for placing a car, are powered by a 5 V supply from a 5 V adapter.

Detection of parking slot availability is achieved by using proximity sensors. Infrared transmitters and an infrared receiver are used for detecting the floor as well as the space that is free for parking a car. The proximity sensors also detect the space from which a parked car has to be fetched and brought back to the ground floor. A relay panel board is used for switching the low-voltage IR sensor outputs to the 24 V DC used by the Programmable Logic Controller and vice versa, and a voltage divider circuit is used for converting the 24 V output voltage of the PLC to a voltage of nearly 5 V, which is used by the sensor.

The ladder logic diagram developed using the IndraWorks software has been implemented. This logic is downloaded to the firmware flash card. The Programmable Logic Controller was able to control and manage all the functions of the system, such as finding the empty space from the inputs of the proximity sensors, turning the transmitters of the corresponding floor and of the ground floor on and off, and making the DC motor rotate in both clockwise and anticlockwise directions.
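The first two PLC functions listed above (finding an empty space from the proximity sensor inputs and selecting the transmitters to switch on) can be sketched in a few lines. This is a minimal illustration, not the ladder logic used in the prototype: the floor-major ordering of the eight sensor inputs, the first-free selection rule, and the transmitter names are assumptions made for this example.

```python
FLOORS = 2            # two parking floors, as in the prototype
SPACES_PER_FLOOR = 4  # four proximity sensors per floor


def find_empty_space(occupied):
    """Return (floor, space), both 1-based, of the first free space, or None.

    occupied is a list of eight booleans in floor-major order
    (an assumed ordering): True means the proximity sensor sees a car.
    """
    if len(occupied) != FLOORS * SPACES_PER_FLOOR:
        raise ValueError("expected one reading per parking space")
    for index, is_occupied in enumerate(occupied):
        if not is_occupied:
            return index // SPACES_PER_FLOOR + 1, index % SPACES_PER_FLOOR + 1
    return None  # car park is full


def transmitters_to_enable(floor, space, placing=True):
    """Hypothetical names of the IR transmitters that guide the arm.

    Placing uses the transmitter above the floor, fetching the one below it.
    """
    floor_tx = f"floor{floor}_upper" if placing else f"floor{floor}_lower"
    return [floor_tx, f"space{floor}_{space}"]
```

For example, with only the second space on the second floor free, `find_empty_space` returns `(2, 2)` and the placing operation would enable `floor2_upper` and `space2_2`.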
The features that can be added to the system are:

- Instead of the screw jack mechanism, hydraulics can be used to reduce the lifting time.
- Instead of the mechanical arm, a fork lift can be used.
- A security system can be added.

#### REFERENCES

- [1] Vanessa W.S. Tang, Yuan Zheng, Jiannong Cao, "An Intelligent Car Park Management System based on Wireless Sensor Networks", 1st International Symposium on Pervasive Computing and Applications, 2006.
- [2] Hamada R. H. Al-Absi, Patrick Sebastian, Justin Dinesh Daniel Devaraj, Yap Yooi Yoon, "Vision-Based Automated Parking System", 10th International Conference on Information Science, Signal Processing and their Applications (ISSPA 2010).
- [3] E.L. Dereniak and G. D. Boreman, "Infrared Detectors and Systems", Wiley Series in Pure and Applied Optics, Wiley-Interscience, 1st Edition, April 1999, ISBN-10: 0471122092.
- [4] John David Vincent, Tom Vincent, "Fundamentals of Infrared Detector Operation and Testing", Wiley-Interscience, 1st Edition, 1990, ISBN 0471502723.
- [5] Stanley H. Horowitz, Arun G. Phadke, "Power System Relaying", John Wiley & Sons Inc., 3rd Edition, 2008, ISBN-10: 0470057122.
- [6] Robert L. Boylestad, Louis Nashelsky, "Electronic Devices and Circuit Theory", Pearson, 10th Edition, 2009, ISBN 978-81-317-2700-3.
- [7] Rexroth Bosch Manual, 2011.
- [8] Gary Dunning, "Introduction to Programmable Logic Controllers", Cengage Learning, 3rd Edition, December 2005, ISBN-10: 1401884261.
- [9] B. L. Theraja, A. K. Theraja, "A Textbook of Electrical Technology", Sultan Chand & Sons, 1st Multicolour Edition, Volume II, 2005, ISBN 81-219-2437-5.

# Virtual Analog and Digital Effects Models for Vocal and Musical Sound Synthesis in Real Time

# Dony Armstrong D'Souza, Sunitha Lasrado & Ganesh V.N

Electronics and Communication Department, N.M.A.M.I.T, Nitte.
E-mail: dony.armstrong@gmail.com, ganesh@mail.mite.ac.in

Abstract - In this paper we present a novel musical sound effects processing system based on virtual analog modelling and digital signal processing techniques. The coding is done in MATLAB and the effects are sequenced according to the musician's choice. Various filtering concepts from digital signal processing are used. The results obtained are compared with a commercially available system.

Keywords - Fuzz, Flanger, Chorus, Delay, State variable filter.

#### I. INTRODUCTION

The music industry uses various sound effects in audio tracks and movies to augment the experience of movie watching, audio listening, public address systems and live music concerts. Real-time musical effects processing and synthesis play a part in nearly all musical sounds encountered in the contemporary environment. Virtually all recorded or electrically amplified music of the last few decades uses effects processing, such as artificial reverberation or dynamic compression, and synthetic instrument sounds play an increasingly large part in the total musical spectrum. Furthermore, the vast majority of these effects are presently implemented using digital signal processing (DSP), mainly due to the flexibility and low cost of modern digital devices. For live music, real-time operation of these effects and synthesis algorithms is obviously of paramount importance. However, recorded music also typically requires real-time operation of these devices and algorithms, because performers usually wish to hear the final, processed sound of their instrument while playing. Recent reviews of virtual analog modeling and digital sound synthesis can be found in [3] and [4], respectively. A tutorial on virtual analog oscillator algorithms, which are not tackled in this paper, has been written by Välimäki and Huovilainen [6].
Real-time simulation of an interesting analog effects device, the voltage-controlled filter, is a related topic.

In this proposed project, importance is given to a versatile, hassle-free, portable and easy-to-use musical synthesizer, which is the prime requirement of modern-day musicians. The proposed system is to be developed using a DSP processor or by utilizing the computer's line-in or mic-in ports. The high-speed processing capabilities of computers are nowadays used for this kind of application, wherein the live audio stream is converted to digital data, processed by the computer and given out through the computer's line-out port. Commercially available music processors do not have the flexibility to rearrange the chain of effects within the loop in a required order. The proposed system overcomes this limitation by running the effects independently, with good memory or register banks.

# II. SYSTEM OVERVIEW

Fig. 1: Block diagram of the proposed system

The proposed system consists of the following blocks. The musical instrument is connected through a transducer, which may be a pickup or a piezoelectric element, or through the electrical output of the instrument, such as an aux or mic output. The ADC and buffer unit digitizes the signal; the buffer stores the data so that the sequence of samples flows smoothly from the ADC to the next stage. The DSP processor processes the signal, which then passes through a buffer and DAC, an impedance matching and amplifier section, and finally the speakers. The proposed system can also use a computer with very high processing speed to run the program in MATLAB and do the processing in real time. The power supply unit supplies power to the required sections. The computer approach consists of running MATLAB or Octave with streaming capability obtained by using a tool such as playrec.

#### III. METHODOLOGY

The proposed system incorporates various musical sound effects in real time into a digitally organized system in which the shortcomings of the analog version can be eradicated. The basic construction of the project is proposed with the following design objectives in mind.

1. To be a versatile and reliable system.
2. To be low cost, very compact and organized, offering a many-in-one kind of gadget.
3. To be able to process data in real time, with latency negligible to the human ear.
4. To have low power consumption.
5. To be small in size and portable.

The methodology involves the following steps:

# A. Data Acquisition

The data to be acquired is the signal from the musical instrument, which may be converted to digital form by using a flash ADC with high resolution. The input can also be fed directly to the computer's mic or line-in port.

#### B. Data storage

Data storage is required in order to feed data properly to the DSP processor and from the processor to the DAC.

#### C. Selection of Desired effects

The required effects have to be selected and their order must be specified in terms of the sequence of effects, as in Fig 2. This is done with switch and choice statements in the program, which select the sequence of functions to be executed.

#### D. Processing

Effects such as fuzz, wah-wah, delay, chorus, flanger and modulator can be modelled by the algorithms in [1]. The fuzz function is given in [2] by the equation

$$f(x) = \begin{cases} 2x, & 0 \le x \le 1/3 \\ \dfrac{3-(2-3x)^2}{3}, & 1/3 \le x \le 2/3 \\ 1, & 2/3 \le x \le 1 \end{cases} \qquad (1)$$

The MATLAB program written for this equation gives the values of the function for the array of audio samples to be processed.
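The piecewise fuzz characteristic of equation (1) can be sketched in a few lines. The paper's own implementation is in MATLAB and is not shown, so the Python below is only an illustrative sketch. The names `fuzz_sample` and `fuzz` are hypothetical. Two things are assumed rather than taken from the text: samples are normalized to the range -1 to 1, and negative samples are handled by odd symmetry, since equation (1) is defined only for 0 ≤ x ≤ 1.

```python
def fuzz_sample(x):
    """Soft-clipping characteristic of equation (1) for one sample.

    Assumptions (not stated in the paper): x is normalized to [-1, 1],
    and negative samples are mirrored through the origin (odd symmetry).
    """
    a = min(abs(x), 1.0)          # magnitude, limited to the defined range
    if a <= 1.0 / 3.0:
        y = 2.0 * a               # linear region
    elif a <= 2.0 / 3.0:
        y = (3.0 - (2.0 - 3.0 * a) ** 2) / 3.0   # quadratic knee
    else:
        y = 1.0                   # hard limit
    return y if x >= 0 else -y


def fuzz(signal):
    """Apply the characteristic to every sample of a sequence."""
    return [fuzz_sample(s) for s in signal]
```

The three branches meet continuously at x = 1/3 and x = 2/3, which is what gives the effect a soft rather than abrupt clipping knee. For example, `fuzz([0.0, 0.2])` evaluates only the linear branch and doubles each sample.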
The wah-wah effect is obtained by shifting the pass band of a band-pass filter over the spectrum, using a state variable filter described by the equations

$$\begin{aligned} y_l(n) &= F_1\, y_b(n) + y_l(n-1) \\ y_b(n) &= F_1\, y_h(n) + y_b(n-1) \\ y_h(n) &= x(n) - y_l(n-1) - Q_1\, y_b(n-1) \end{aligned} \qquad (2)$$

Here $F_1$ and $Q_1$ are tuning coefficients related to the cut-off frequency $f_c$ and the damping $d$, $y_l(n)$ is the lowpass signal, $y_h(n)$ is the highpass signal, $y_b(n)$ is the bandpass signal and $x(n)$ is the input signal, where

$$F_1 = 2\sin(\pi f_c/f_s) \quad \text{and} \quad Q_1 = 2d \qquad (3)$$

#### IV. RESULTS AND ANALYSIS

The results for the generated effects are obtained from an 18-second clip played on a guitar, captured with the Windows Sound Recorder and processed. The same clip is played through the commercial synthesizer.

Fig. 2: The input audio clip

The input clip shown in Fig. 2 is 18 seconds long, played on an electric guitar, captured through the Windows Sound Recorder utility and converted to a wave file.

Fig. 3: Simulated result for the fuzz effect using the proposed system

The waveform shown in Fig. 3 is the simulated result from the proposed system with the input clip of Fig. 2.

Fig. 4: The fuzz effect output of the Digitech RP

The waveform of Fig. 4 is obtained from the Digitech guitar sound processor for the same note played as the input.

Fig. 5: The wah-wah effect from the proposed system

The waveform of Fig. 5 is the wah-wah effect generated using a state variable filter with the sweep frequency varying from 500 Hz to 5 kHz.

Fig. 6: The wah-wah effect from the Digitech unit

The waveform of Fig. 6 is the wah-wah effect generated by the Digitech music processing unit; the input signal is the same as in Fig. 2 and is captured with the Windows Sound Recorder utility.

Fig. 7: The flanger effect output of the proposed project

The waveform of Fig. 7 is obtained from the flanger effect by executing the file in MATLAB for the same input sound clip as in the examples above.

Fig. 8: The expander effect of proposed project.
The waveform of Fig. 8 shows the effect of the expander, which expands the waveform over a range of values in the MATLAB program of the proposed project.

Fig. 9: The compressor effect of the proposed project

The waveform of Fig. 9 shows the effect of the compressor, which compresses the values above a threshold in the MATLAB program of the proposed project.

Fig. 10: The overdrive distortion effect obtained by the proposed project

Fig. 10 shows the waveform obtained by symmetrical clipping with moderate gain.

#### V. CONCLUSION AND DISCUSSION

As seen from the waveforms in Fig. 3 and Fig. 4, the results obtained are satisfactory. By comparing and listening to the audio wave files obtained, it is seen that the clarity is better in the proposed system. For the other effect, wah-wah, Fig. 5 and Fig. 6 show that there is clarity in the proposed system's waveform and in the audio generated. The further effects flanger, delay, chorus and reverberation are to be tested. Further, the virtual analog models using analog equations have to be tested and implemented. The proposed system is powered by a regular power supply, chosen with low power consumption in mind.

#### REFERENCES

- [1] U. Zölzer, Ed., DAFX: Digital Audio Effects, John Wiley & Sons, New York, NY, USA, 2002.
- [2] J. Pakarinen, V. Välimäki, F. Fontana, V. Lazzarini, and J. S. Abel, "Recent Advances in Real-Time Musical Effects, Synthesis, and Virtual Analog Models," vol. 2011, Article ID 940784. Sound and Music Technology Research Group, National University of Ireland Maynooth, Ireland.
- [3] J. O. Smith, "Physical Audio Signal Processing," 2010, https://ccrma.stanford.edu/~jos/pasp/.
- [4] V. Välimäki, F. Fontana, J. O. Smith, and U. Zölzer, "Introduction to the special issue on virtual analog audio effects and musical instruments," IEEE Transactions on Audio, Speech and Language Processing, vol. 18, no. 4, pp. 713–714, 2010.
- [5] V. Välimäki, J. Pakarinen, C. Erkut, and M. Karjalainen, "Discrete-time modelling of musical instruments," Reports on Progress in Physics, vol. 69, no. 1, pp. 1–78, 2006.
- [6] V. Välimäki and A. Huovilainen, "Antialiasing oscillators in subtractive synthesis," IEEE Signal Processing Magazine, vol. 24, no. 2, pp. 116–125, 2007.
- [7] J. Pakarinen, H. Penttinen, V. Välimäki et al., "Review of sound synthesis and effects processing for interactive mobile applications," Report 8, Department of Signal Processing and Acoustics, Helsinki University of Technology, 2009.
- [9] C. Poepel and R. B. Dannenberg, "Audio signal driven sound synthesis," in Proceedings of the International Computer Music Conference, pp. 391–394, Barcelona, Spain, September 2005.
- [10] V. Lazzarini, J. Timoney, and T. Lysaght, "The generation of natural-synthetic spectra by means of adaptive frequency modulation," Computer Music Journal, vol. 32, no. 2, pp. 9–22, 2008.
- [11] J. Pekonen, "Coefficient modulated first-order allpass filter as distortion effect," in Proceedings of the International Conference on Digital Audio Effects, pp. 83–87, Espoo, Finland, September 2008.
- [12] J. Kleimola, J. Pekonen, H. Penttinen, V. Välimäki, and J. S. Abel, "Sound synthesis using an allpass filter chain with audio-rate coefficient modulation," in Proceedings of the International Conference on Digital Audio Effects, Como, Italy, September 2009.
- [13] C. M. Cooper and J. S. Abel, "Digital simulation of brassiness and amplitude-dependent propagation speed in wind instruments," in Proceedings of the International Conference on Digital Audio Effects, Graz, Austria, September 2010.
- [14] Ren Gang, Gregory Bocko, Justin Lundberg and Stephen Rossener, "A Real-Time Signal Processing Framework of Musical Expressive Feature Extraction using MATLAB," 12<sup>th</sup> International Society for Music Information Retrieval Conference (ISMIR 2011).
# Performance Analysis of Solar Cell Powered Z-Source Inverter System

#### Nisha K.C.R, T.N.Basavaraj, Neet Kumar Mahto, Nikhil Kumar & Pranav Anand

New Horizon College of Engineering, Bangalore, India

E-mail: nishashaji2007@gmail.com

Abstract - This paper presents a high-performance, low-cost inverter for photovoltaic systems based on the Z-source concept. The traditional voltage-source inverter (VSI) and current-source inverter (CSI) have evolved into the new Z-source inverter, which contains a unique X-shaped impedance network. This impedance-source inverter provides single-stage power conversion, whereas the traditional inverter requires two-stage power conversion for renewable energy applications. A new low-cost solar cell powered Z-source inverter system is simulated and the results are compared with those of the traditional voltage-source inverter system. Performance analysis confirms that the Z-source inverter system is more appropriate for photovoltaic applications.

Keywords - Voltage-source inverter, photovoltaic, Z-source inverter.

#### I. INTRODUCTION

As people are increasingly concerned about fossil fuel exhaustion and the environmental problems caused by conventional power generation, renewable energy resources, and photovoltaic cells in particular, have become more attractive. Solar cell technologies have been developed to improve efficiency, and the production cost of photovoltaic cells has been reduced. Solar cells are used today in many applications such as battery charging and satellite power systems. They have the advantages of being pollution free and having low maintenance cost. However, their installation cost is high and, in most applications, they require a power conditioner (dc/dc or dc/ac converter) for load interface. Since photovoltaic modules still have relatively low conversion efficiency, the overall system cost can be reduced by using high-efficiency power conditioners [1].
Power electronic inverters for renewable energy applications require both voltage buck and boost capabilities to ride through load current and supply voltage variations. A common way of implementing a buck-boost inverter is to cascade a dc-dc converter with either a buck voltage-source inverter or a boost current-source inverter to form a two-stage power conversion solution, but this cascaded topology usually gives rise to increased system complexity and reduced reliability [2]. Conventional VSI and CSI support only current-buck DC-AC power conversion and need a relatively complex modulator [3]. As an alternative, the single-stage Z-source inverter is proposed in [4], where it is explicitly shown that the Z-source inverter gains its voltage tuning flexibility by introducing a unique LC impedance network between its input source and the inverter circuitry. Besides flexible gain tuning, the inserted impedance network is stated to have the advantage of protecting the inverter phases from short-circuit damage even with no dead-time delay inserted [5].

The Z-source inverter is attractive for three main reasons. First, the traditional PWM inverter has only one control freedom, used to control the output AC voltage, whereas the Z-source inverter has two independent control freedoms, the shoot-through duty cycle and the modulation index, providing the ability to produce any desired output AC voltage [6]. Second, the Z-source inverter provides the same features as a DC-DC boosted inverter, yet its single stage is less complex and more cost effective. Third, the Z-source inverter has the benefit of enhanced reliability, because momentary shoot-through can no longer destroy the inverter [7].

#### II. TRADITIONAL Z-SOURCE INVERTER

The conventional Z-source inverter is shown in Fig. 1. It employs a symmetrical LC impedance network to replace the dc-link capacitor of the traditional VSI.
Furthermore, with the help of the series diode D embedded in the source side, the input dc source can be effectively disconnected from the Z-source network by naturally reverse-biasing the diode D during the unique shoot-through interval, which can be initiated by turning ON all switches of one phase leg simultaneously. The three-phase Z-source inverter bridge has nine permissible switching states, unlike the traditional three-phase voltage-source inverter, which has eight states.

Fig. 1: Z-source inverter

The traditional three-phase V-source inverter has six active vectors when the dc voltage is impressed across the load and two zero vectors when the load terminals are shorted through either the lower or upper three devices, respectively. The three-phase Z-source inverter bridge, however, has one extra zero state when the load terminals are shorted through both the upper and lower devices of any one phase leg (i.e., both devices are gated on), any two phase legs, or all three phase legs. This shoot-through zero state is forbidden in the traditional voltage-source inverter. Such special operation provides the ability of voltage boosting as well as the unidirectional power conversion desired in PV and fuel cell systems [8], [9].

### III. CIRCUIT ANALYSIS AND OBTAINABLE OUTPUT VOLTAGE

Assuming that the inductors L1 and L2 and the capacitors C1 and C2 have the same inductance (L) and capacitance (C) respectively, the Z-source network becomes symmetrical. From the symmetry and the equivalent circuits we have

$$V_{L1} = V_{L2} = V_{L}, \qquad V_{C1} = V_{C2} = V_{C} \qquad (1)$$

Fig. 2: Equivalent circuits of the Z-source inverter: (a) non-shoot-through and (b) shoot-through states

Shoot-through (Sx = Sx' = ON, x = A, B or C; D = OFF):

$$v_L = V_C, \quad v_i = 0, \quad v_d = 2V_C, \quad v_D = V_{dc} - 2V_C \qquad (2)$$

$$i_L = -i_C, \quad i_i = i_L - i_C, \quad i_{dc} = 0 \qquad (3)$$

Non-shoot-through (e.g. Sx ≠ Sx', x = A, B or C; D = ON):

$$v_L = V_{dc} - V_C, \quad v_i = 2V_C - V_{dc}, \quad v_d = V_{dc}, \quad v_D = 0 \qquad (4)$$

$$i_{dc} = i_L + i_C, \quad i_i = i_L - i_C, \quad i_{dc} \neq 0 \qquad (5)$$

Averaging the inductor voltage to zero, the capacitor voltage $V_C$, the peak DC-link voltage $v_{i1}$ and the peak ac output voltage $v_{x1}$ (x = a, b or c) can be derived as:

$$\begin{cases} V_{C} = \dfrac{1 - T_{0} / T}{1 - 2T_{0} / T}\, V_{dc} \\ v_{i1} = \dfrac{V_{dc}}{1 - 2T_{0} / T} = BV_{dc} \\ v_{x1} = \dfrac{MV_{dc}}{2(1 - 2T_{0} / T)} = B\left(\dfrac{MV_{dc}}{2}\right) \end{cases} \qquad (6)$$

where M is the conventional modulation index, B is the boost factor induced by shoot-through operation and $T_0/T < 0.5$ defines the shoot-through duty ratio.

#### IV. SIMULATION RESULT

The solar cell powered Z-source inverter is modelled and simulated using the MATLAB/SIMULINK package. The solar cell powered Z-source inverter system is shown in Fig. 3a. A Z-filter is introduced between the source and the inverter. The inverter feeds a three-phase motor load. The solar cell is modelled and its input voltage is shown in Fig. 3b. The driving pulses are shown in Fig. 3c. The diode voltage is shown in Fig. 3d. The pulse-width-modulated line voltages are shown in Fig. 3e; they are displaced by 120 degrees. The line currents are shown in Fig. 3f; they are also displaced by 120 degrees. Fig. 3g shows the rotor speed. Fig. 3h shows the FFT analysis of the voltage-source inverter, for which the THD is 12.933%. Fig. 3i shows the FFT analysis of the Z-source inverter, for which the THD is reduced to 8.12.933%.

Fig. 3(a) Three phase Z-source inverter circuit

Fig. 3(b) Solar input voltage

Fig. 3(c) Driving pulses M3 and M6

Fig. 3(d) Diode voltage

Fig. 3(e) Line voltage

Fig. 3(f) Line current

Fig. 3(g) Rotor speed

Fig. 3(h) FFT Analysis for current in Voltage-source inverter

Fig. 3(i) FFT Analysis for current in Z-source inverter

#### V. CONCLUSION

In this paper a solar cell powered Z-source inverter system is modelled and simulated. The single-stage Z-source inverter has both voltage buck and boost capabilities due to its unique impedance network. The Z-source network does not need dead time, which leads to improved performance. It also has a wide input voltage range, which results in low power losses. FFT analysis is done and the spectrum is obtained. The results of the digital simulation are presented. They confirm that the THD of the Z-source inverter system is much lower than that of its counterpart, and that it is a very promising power conversion concept for photovoltaic systems, increasing the overall system efficiency while reducing system complexity and cost.

#### REFERENCES

- [1] Eftichios Koutroulis, Kostas Kalaitzaki and Nicholas C. Voulgaris, "Development of a Microcontroller-Based, Photovoltaic Maximum Power Point Tracking Control System," IEEE Trans. Power Electron., vol. 16, no. 1, Jan. 2001, pp. 46–54.
- [2] H. Rostami and D. A. Khaburi, "Neural Networks Controlling for Both the DC Boost and AC Output Voltage of Z-Source Inverter," IEEE Conference, 2010, pp. 135–140.
- [3] S. Rathika, J. Kavitha, S. R. Paranjothi, "Embedded control Z-source inverter fed induction motor," IEEE Conference, INCACEC 2009, pp. 1–7.
- [4] F. Z. Peng, "Z-source inverter," IEEE Trans. Ind. Appl., vol. 39, no. 2, pp. 504–510, Mar./Apr. 2003.
- [5] Y. Tang, S. J. Xie, C. H. Zhang, Z. G. Xu, "Improved Z-source inverter with reduced Z-source capacitor voltage stress and soft-start capability," IEEE Trans. Power Electron., vol. 24, no. 2, pp. 409–415, 2009.
- [6] M. S. Shen, J. Wang, A. Joseph, F. Z. Peng, L. M. Tolbert, and D. J. Adams, "Constant boost control of the Z-source inverter to minimize current ripple and voltage stress," IEEE Transactions on Industry Applications, vol. 42, pp. 770–778, May-Jun. 2006.
- [7] N. Vidhyarubini, G. Rohini, "Z-source inverter for photovoltaic power generation system," Proceedings of ICETECT 2011.
- [8] F. Z. Peng, M. Shen, and K. Holland, "Application of Z-source inverter for traction drive of fuel cell-battery hybrid electric vehicles," IEEE Trans. Power Electron., vol. 22, no. 3, pp. 1054–1061, May 2007.
- [9] Y. Huang, M. Shen, F. Z. Peng, and J. Wang, "Z-source inverter for residential photovoltaic systems," IEEE Trans. Power Electron., vol. 21, no. 6, pp. 1776–1782, Nov. 2006.

#### Single Event Upset Correction for Satellite Images by using AES

#### Praveen.H.L<sup>1</sup>, H.S Jayaramu<sup>2</sup> & M.Z.Kurian<sup>3</sup>

<sup>1&3</sup>Department of Electronics & Communication, <sup>2</sup>Department of Telecommunication, <sup>1,2&3</sup>Sri Siddhartha Institute of Technology, Tumkur, Karnataka, India.

E-mail: praveee79@gmail.com, jramssit@yahoo.co.in, mzkurianvc@yahoo.com

Abstract - Data hiding has been used for thousands of years to transmit data without interception by unwanted viewers, and security is becoming increasingly important for many applications, such as confidential transmission, video surveillance, and military and medical applications. The Advanced Encryption Standard (AES) provides the highest level of security by using the newest and strongest 128-bit AES encryption algorithm to encrypt and authenticate the data. At the same time, the immunity of the encryption process to faults is taken into account. Five modes of AES have been used to secure satellite data. AES is a symmetric-key algorithm in which both the sender and the receiver use a single key for encryption and decryption.
An analysis of the propagation of faults that occur during transmission due to noise is carried out in order to avoid data corruption due to Single Event Upsets (SEUs); the faults are rectified by using the Hamming error correction code algorithm. This reduces data corruption and increases performance: errors can be identified, the image can be encrypted in color, and, by using the Advanced Encryption Standard algorithm on board small Earth observation satellites, throughput can be increased and data corruption reduced.

**Keywords** - Earth observation satellite, Output feedback mode, Single event upset, Hamming error correction code, Advanced encryption standard.

#### I. INTRODUCTION

Earth observation (EO) satellites are satellites specifically designed to observe the Earth from orbit; they take images of the Earth by using image sensors. Earth observation satellites have been used effectively in disaster management support. Today meteorological satellites are widely used to detect and track severe storms and to support other weather-driven events [12]. Security services are much needed to protect the data from unauthorized access while it is transmitted from satellites. On-board encryption is used to secure such valuable data, and it also prevents intrusion. A number of satellites use on-board encryption techniques to protect data during transmission to the ground. Cryptographic techniques are used to protect satellite images. To provide high security, the Advanced Encryption Standard (AES), which is approved by NIST, is used. AES is a block cipher. It is used in different applications since it provides simplicity, flexibility, ease of implementation and high throughput. AES is also suitable for hardware-based implementations. At the same time, the immunity of the encryption process is taken into account, i.e. the robustness of the encryption process against faults.
To overcome the above problem, a number of approaches have been made.

### II. SATELLITE IMAGES AND EXISTING SYSTEM

A digital image is defined as a two-dimensional rectangular array. The elements of this array are called pixels. Each pixel has an intensity value (digital number) and a location address (row, column). A satellite is an object that orbits another object; the term is often used to describe an artificial satellite [12]. Satellite images provide a variety of information. Satellites pass on information to the base center on the planet through telephonic messages, pictures from satellite TV and emergency snapshots retrieved from ships and aircraft. Satellite images are generated with the intent of creating an imaging network for even the most inhospitable regions on land and in the oceans [14].

Satellites that use on-board encryption processors operate in a harsh radiation environment and are susceptible to radiation-induced faults. Faults that occur in satellite on-board devices are called Single Event Upsets (SEUs) [1]. If faulty data occurs, the satellite has to wait a long time to receive the next data. To prevent this, an error-free on-board encryption scheme has been proposed. Its advantage is that it provides an error-free encryption system in which errors are greatly reduced even under radiation. Its disadvantage is that the data is further corrupted during transmission due to noise.

Reliability is very important in avionics design. SEUs must be detected and corrected while sending data to the ground. For this, the Triple Modular Redundancy (TMR) technique is used. TMR consists of three identical modules connected to a majority voting circuit. Its advantage is that an SEU is detected and rectified before the data is sent to the ground. Its disadvantages are computation overhead compared to the existing scheme and lower security [3].

#### III. PROPOSED SYSTEM

To overcome the drawbacks of the existing system, the proposed system uses a new fault-tolerant technique based on AES, which addresses the reliability issues of the AES algorithm and overcomes SEUs. Five modes are used in AES: Cipher Block Chaining (CBC), Electronic Code Book (ECB), Cipher Feedback (CFB), Counter (CTR) and Output Feedback (OFB). Cipher Block Chaining is not suitable for satellite images because data is corrupted by fault propagation. In Electronic Code Book mode, if a single bit is corrupted the entire block is corrupted. In Cipher Feedback mode the fault is propagated to the next blocks. No fault is propagated in Counter mode; however, Counter mode is not suitable for satellite image communications. Therefore, to rectify the faults that occur while data is transmitted from satellites in the presence of noise, on-board AES encryption based on OFB is used. The faults are rectified by using the Hamming error correction code algorithm. The proposed approach reduces SEUs during the transmission of data from satellites in the presence of noise.

#### IV. ADVANCED ENCRYPTION STANDARD

Public-key cryptography is a modern branch of cryptography in which the algorithms employ a pair of keys (a public key and a private key) and use a different component of the pair for different steps of the algorithm. In contrast, the AES algorithm is a symmetric-key cipher, in which both the sender and the receiver use a single key for encryption and decryption. The data block length is fixed at 128 bits, while the key length can be 128, 192, or 256 bits. In addition, the AES algorithm is an iterative algorithm. Each iteration is called a round, and the total number of rounds is 10, 12, or 14 when the key length is 128, 192, or 256 bits, respectively. The 128-bit data block is divided into 16 bytes. These bytes are mapped to a 4\*4 array called the State, and all the internal operations of the AES algorithm are performed on the State.
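The key-length-to-rounds relationship and the mapping of a 16-byte block onto the State can be sketched as follows. This is a minimal illustration, not the authors' implementation. The names `ROUNDS`, `block_to_state` and `state_to_block` are hypothetical. The column-by-column fill order is assumed from the AES specification (FIPS-197), since this paragraph does not state it.

```python
# Number of rounds for each AES key length in bits, as listed in the text.
ROUNDS = {128: 10, 192: 12, 256: 14}


def block_to_state(block):
    """Map a 16-byte block onto the 4x4 State.

    Assumption (from FIPS-197, not stated in this paragraph): the State is
    filled column by column, i.e. state[r][c] = block[r + 4*c].
    """
    if len(block) != 16:
        raise ValueError("AES operates on 128-bit (16-byte) blocks")
    return [[block[r + 4 * c] for c in range(4)] for r in range(4)]


def state_to_block(state):
    """Inverse mapping: read the State back out column by column."""
    return bytes(state[r][c] for c in range(4) for r in range(4))
```

With the input bytes numbered 0 to 15, the first row of the State holds bytes 0, 4, 8 and 12, so each input word of four consecutive bytes becomes one column.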
Each round in AES, except the final round, consists of four transformations: SubBytes, ShiftRows, MixColumns, and AddRoundKey. The final round does not have the MixColumns transformation. The decryption flow is simply the reverse of the encryption flow, and each operation is the inverse of the corresponding one in the encryption process.

Fig. 1: The AES Algorithm. (a) Encryption Structure. (b) Equivalent Decryption Structure

The round transformation of AES and its steps operate on intermediate results, called the state. The state can be visualized as a rectangular matrix with four rows. The number of columns in the state is denoted by Nb and is equal to the block length in bits divided by 32. For a 128-bit data block (16 bytes) the value of Nb is 4, hence the state is treated as a 4\*4 matrix in which each element represents a byte. For the sake of simplicity, in the rest of the paper both the data block and the key are considered to be 128 bits long. However, all the discussions and results hold true for 192-bit and 256-bit keys as well. Using the AES algorithm, the color image is encrypted and sent to the receiver side. The receiver decrypts the image and recovers the original color image. The image encrypted using AES is shown in Fig. 2 (b).

Fig. 2: (a) Original Image (b) Encrypted Image

The AES encryption algorithm accepts one data block and the key and produces the encrypted data block. The input and output data blocks are of identical size. The decryption algorithm accepts one encrypted data block and the key to produce the decrypted data block. Modes of operation have been defined to apply the AES block cipher to the encryption of more than one 128-bit block of data.

### A. Notations, Conventions and Mathematical Background

The input and output of the AES algorithm each consist of sequences of 128 bits. These sequences are referred to as blocks, and the number of bits they contain is referred to as their length.
The Cipher Key for the AES algorithm is a sequence of 128, 192 or 256 bits. The basic unit of processing in the AES algorithm is a byte, which is a sequence of eight bits treated as a single entity. A byte with bits $b_7 b_6 \ldots b_0$ is represented by the polynomial

$$b_7 x^7 + b_6 x^6 + b_5 x^5 + b_4 x^4 + b_3 x^3 + b_2 x^2 + b_1 x + b_0 = \sum_{i=0}^{7} b_i x^i$$

Internally, the operations of the AES algorithm are performed on a two-dimensional array of bytes called the State. The State consists of four rows of bytes. Each row of the State contains Nb bytes, where Nb is the block length divided by 32. In the State array, which is denoted by the symbol S, each individual byte has two indices. The first index is the row number r, which lies in the range $0 \le r \le 3$, and the second index is the column number c, which lies in the range $0 \le c \le Nb-1$. Such indexing allows an individual byte of the State to be referred to as $S_{r,c}$ or S[r,c]. At the beginning of encryption and decryption the input, which is the array of bytes symbolized by in<sub>0</sub>in<sub>1</sub>···in<sub>15</sub>, is copied into the State array. The encryption or decryption operations are then conducted on the State array [6].

Every byte in the AES algorithm is interpreted as a finite field element using the polynomial notation above. All finite field elements can be added and multiplied. The addition of two elements in a finite field is achieved by "adding" the coefficients of the corresponding powers in the polynomials of the two elements, which is performed with the XOR operation. Multiplying the binary polynomial defined in the equation above by the polynomial *x* can be implemented at the byte level as a left shift and a subsequent conditional bitwise XOR with {1b}. This operation on bytes is denoted by xtime(). Multiplication by higher powers of x can be implemented by repeated application of xtime().
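The byte-level arithmetic described here can be sketched in a few lines of Python. This is an illustrative sketch and not the paper's hardware implementation: `xtime` follows the shift and conditional XOR with {1b} described above, and `gf_mul` (a name assumed here) builds multiplication by an arbitrary constant from repeated `xtime` calls and XOR additions. The check values are the worked examples given in FIPS 197 [9].

```python
def xtime(b):
    """Multiply a field element (one byte) by x in GF(2^8)."""
    b <<= 1
    if b & 0x100:          # bit 7 was set before the shift, so reduce
        b ^= 0x11B         # clears bit 8 and XORs the low byte with {1b}
    return b & 0xFF


def gf_mul(a, b):
    """Multiply two bytes in GF(2^8) using xtime() and XOR additions."""
    result = 0
    while b:
        if b & 1:          # this power of x is present in b
            result ^= a    # field addition is XOR
        a = xtime(a)       # move on to the next power of x
        b >>= 1
    return result


if __name__ == "__main__":
    print(hex(xtime(0x57)))          # 0xae
    print(hex(gf_mul(0x57, 0x13)))   # 0xfe
```

The MixColumns constants {02} and {03} used later in the fault-tolerant model are special cases: `gf_mul(a, 2)` is `xtime(a)` and `gf_mul(a, 3)` is `xtime(a) ^ a`.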
Through the addition of intermediate results, multiplication by any constant can be implemented [9].

#### B. Output Feedback Mode

In OFB mode the output of the encryption is fed back into the input to generate a keystream, which is then XOR-ed with the plain data to generate the cipher data. If an SEU occurs during encryption in OFB mode, all the subsequent blocks are corrupted, starting from the point where the fault occurred. This is because the keystream required for encryption and decryption is independent of the plain and cipher data, so the feedback propagates the fault from one block to the next until the end of the encryption process. This is demonstrated by introducing an SEU during the encryption of a plain multispectral satellite image [6].

Fig. 3: SEU propagation during encryption in OFB mode

#### V. SYSTEM DESIGN

Initially the image is captured and converted into a text file using MATLAB, and the converted text file is used as the input. This file is encrypted using the Advanced Encryption Standard, where the fault tolerance is checked. AES is a symmetric-key algorithm, in which both the sender and the receiver use a single key for encryption and decryption, and it fixes the data block length at 128 bits. The error value should be such that it does not disturb the image. Fault tolerance is applied to check the error values: the lower the error, the higher the reliability and the performance. The error level has to be kept low from the initial stages, or else it will affect the quality of the images received at the receiver side. AES is carried out at the transmitter itself. The proposed fault-tolerant model is based on the single-error-correcting Hamming code. The Hamming code detects and corrects a single-bit fault in a byte, and it is a good choice for satellite applications, as the most frequently occurring faults in on-board electronics are bit flips induced by radiation.
However, the AES correction model can be extended to correct multiple-bit faults by using other error correcting codes such as the modified Hamming code. Thus, by using the AES algorithm it is possible to obtain an encrypted image that reduces errors and also prevents intrusion, and this image is then sent to the receiver side. The receiver decrypts the encrypted image and recovers the original image. At the receiver side, image decryption is also carried out using AES. The key used for encryption and decryption must be the same; if the keys differ, the image is lost. Decryption carries out the reverse of encryption. The input to decryption is the cipher text, which is the output of AES encryption. This is decrypted to obtain the output plain text, which is then converted back to the original image using MATLAB.

Fig. 4: Block diagram of AES

#### A. Design rationale

The three criteria taken into account in the design of AES are the following:

- Resistance against all known attacks.
- Speed and code compactness on a wide range of platforms.
- Design simplicity.

#### B. Implementation Aspects

VHDL is used as the hardware description language because of its portability between environments. The code is pure VHDL and could easily be implemented on other devices without changing the design. The software used for this work is the Xilinx 12.2 synthesis tool. It is used for writing, debugging and optimizing the design, and also for fitting, simulating and checking the performance results, together with the simulation tools ModelSim 6.3c and MATLAB.

#### VI. FAULT TOLERANT MODEL

A novel fault-tolerant model for the AES algorithm, which is immune to radiation-induced SEUs occurring during encryption, can be used in hardware implementations on board small EO (Earth observation) satellites [5]. The model is based on a self-repairing EDAC scheme, which is built into the AES algorithmic flow and utilizes the Hamming error correcting code [6].
The proposed Hamming-code-based fault-tolerant model of AES can be adapted to all five modes of AES to correct SEUs on board. Even though the calculation of the Hamming code is carried out within AES, it does not alter any of the transformations of the algorithm and does not affect the operation of AES in any way. Also, as the Hamming parity data are not sent to ground, they cannot leak any information about the AES computation. Therefore the fault-tolerant AES model does not require a new cryptanalysis.

Fig. 5: Fault detection and correction flow chart

#### A. Model Description

The proposed fault-tolerant model is based on the single-error-correcting Hamming code (12,8), the simplest of the available error correcting codes. The Hamming code (12,8) detects and corrects a single-bit fault in a byte, and it is a good choice for satellite applications, as the most frequently occurring faults in on-board electronics are bit flips induced by radiation [8]. However, the AES correction model can be extended to correct multiple-bit faults by using other error correcting codes such as the modified Hamming code.

1) Calculation of the Hamming Code: The parity check bits of each byte of the S-Box LUTs are precalculated. These Hamming code bits can be formally expressed as below:

$$\begin{aligned} h(S_{RD}[a]) &\rightarrow h_{RD}[a] \\ h(S_{RD}[a] \bullet \{02\}) &\rightarrow h2_{RD}[a] \\ h(S_{RD}[a] \bullet \{03\}) &\rightarrow h3_{RD}[a] \end{aligned} \tag{1}$$

where "a" is the state byte, "h" represents the calculation of the Hamming code, and • denotes finite field multiplication. As can be seen from (1), $h_{RD}$ is given by the parity check bits of the S-Box LUT $S_{RD}$, $h2_{RD}$ is given by the parity check bits of ($S_{RD}$ • {02}), and $h3_{RD}$ is given by the parity check bits of ($S_{RD}$ • {03}). The procedure to derive the $h_{RD}$ parity bits is described below by taking one state byte a, represented by bits (b7,b6,b5,b4,b3,b2,b1,b0), as an example.
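The four parity bits of the Hamming code (12,8), whose bit groups are listed in (2), can be sketched in Python as follows. This is a hedged illustration and not the paper's VHDL: even parity (a plain XOR over each group) is assumed, and the syndrome-to-bit table `SYNDROME_TO_BIT` is derived here from the bit groups because the paper does not list it explicitly. All function names are hypothetical.

```python
# Data-bit indices (b7..b0) covered by each parity bit, taken from (2).
PARITY_GROUPS = {
    3: (7, 6, 4, 3, 1),
    2: (7, 5, 4, 2, 1),
    1: (6, 5, 4, 0),
    0: (3, 2, 1, 0),
}


def hamming_parity(byte):
    """Return the 4-bit code (p3 p2 p1 p0) of one byte, assuming even parity."""
    code = 0
    for p, group in PARITY_GROUPS.items():
        bit = 0
        for i in group:
            bit ^= (byte >> i) & 1
        code |= bit << p
    return code


# A flip of data bit i changes exactly the parity bits whose group contains i,
# so each data bit has its own syndrome (derived table, an assumption).
SYNDROME_TO_BIT = {hamming_parity(1 << i): i for i in range(8)}


def correct_byte(byte, predicted_code):
    """Compare predicted and calculated check bits and repair a single data-bit flip."""
    syndrome = hamming_parity(byte) ^ predicted_code
    if syndrome in SYNDROME_TO_BIT:
        return byte ^ (1 << SYNDROME_TO_BIT[syndrome])
    return byte  # no fault, or a fault outside the eight data bits


if __name__ == "__main__":
    original = 0x3C
    faulty = original ^ (1 << 5)              # inject a single bit flip
    print(correct_byte(faulty, hamming_parity(original)) == original)
```

With these groups every data bit produces a distinct syndrome that involves at least two parity bits, which is what allows a single flipped data bit to be located and inverted.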
The Hamming code of the state byte a is a four-bit parity code, represented by bits (p3,p2,p1,p0), which are derived as follows:

$$\begin{aligned} p_3 &\rightarrow \text{parity of bit group } b_7, b_6, b_4, b_3, b_1 \\ p_2 &\rightarrow \text{parity of bit group } b_7, b_5, b_4, b_2, b_1 \\ p_1 &\rightarrow \text{parity of bit group } b_6, b_5, b_4, b_0 \\ p_0 &\rightarrow \text{parity of bit group } b_3, b_2, b_1, b_0 \end{aligned} \tag{2}$$

#### 2) Detection and Correction of Fault Using Hamming Code Bits:

The Hamming code matrix of the SubBytes transformation is predicted by referring to the $h_{RD}$ table. The Hamming code matrix prediction for ShiftRows involves a simple cyclic rotation of the SubBytes Hamming code bits [8]. The Hamming code state matrix for MixColumns is predicted with the help of the $h_{RD}$, $h2_{RD}$ and $h3_{RD}$ parity bits, and it is expressed by equations of the following form (shown for row 3 of the state):

$$h_{3,j} = h3_{RD}[a_{0,j}] \oplus h_{RD}[a_{1,j}] \oplus h_{RD}[a_{2,j}] \oplus h2_{RD}[a_{3,j}], \quad 0 \le j < 4 \tag{3}$$

The Hamming code is predicted from the input data state of the transformation by referring to the parity check bit tables, and the parity check bits are also calculated from the output of the transformation. The predicted and calculated check bits are compared to detect and correct the fault, as discussed below. Let the predicted check bits of the transformation input be represented by (x3,x2,x1,x0) and the calculated check bits of the transformation output be represented by (y3,y2,y1,y0). Once the faulty bit position is identified, the fault is corrected by simply flipping that bit. The encryption then continues without any interruption. Here we assume that the Hamming code tables are protected from SEUs by traditional memory protection techniques used in satellite applications, such as memory scrubbing and refreshing [7].

#### VII. CONCLUSION

An image encryption standard alone is not sufficient to protect the integrity of the image.
As a result, several issues arise regarding the images, such as the security of data from various image sources, single event upsets, and computation overhead. By using the Advanced Encryption Standard algorithm together with the proposed fault-tolerant model, the single event upset problem can be eliminated and fault tolerance can be achieved. The reliability and the integrity of the data can be ensured with high accuracy and image compatibility. The reliability achieved for images suggests that in future AES can also be applied to video processing and video security. The proposed fault detection and correction AES model targets the satellite application domain; however, it can also be used in other applications aimed at hostile environments such as nuclear reactors, interplanetary exploration, unmanned aerial vehicles, etc.

#### ACKNOWLEDGMENT

I am deeply indebted and would like to express my sincere thanks to our beloved Principal Dr. K.A.Krishnamurthy, for providing me an opportunity to do this project. My special gratitude to Dr. M.Z.Kurian, HOD, Department of E & C, S.S.I.T for his guidance, constant encouragement and wholehearted support. My sincere thanks to my guide Prof. H.S.Jayaramu, HOD, Department of Telecommunication for his guidance, constant encouragement and wholehearted support.

#### REFERENCES

- [1] A. M. Finn, "System effects of single event upset," Monterey, CA, October 3-5, 1989.
- [2] B. Subramanyan, Vivek M. Chhabria, and T. G. Sankar Babu, in International Conference on Emerging Applications of Information Technology, 978-0-7695-4329-1/11, 2011.
- [3] R. E. Lyons and W. Vanderkulk, "The Use of Triple-Modular Redundancy to Improve Computer Reliability," IBM Journal, April 1962.
- [4] Y. Bentoutou, World Academy of Science, Engineering and Technology, 77, 2011.
- [5] Kastensmidt, F. L., Carro, L., and Reis, R. Fault-Tolerance Techniques for SRAM-Based FPGAs.
- [6] Luby, M. G., Mitzenmacher, M., Shokrollahi, M. A., and Spielman, D. A.
Efficient erasure correcting codes. IEEE Transactions on Information Theory, 47, 2 (Feb. 2001), 569–583.
- [7] Daemen, J., and Rijmen, V. The Design of Rijndael: AES - The Advanced Encryption Standard. New York: Springer-Verlag, 2002.
- [8] Roohi Banu and Tanya Vladimirova, "Fault-Tolerant Encryption for space application," IEEE Transactions on Aerospace and Electronic Systems, Vol. 45, No. 1, Jan 2009.
- [9] FIPS 197, "Advanced Encryption Standard (AES)", November 26, 2001. http://csrc.nist.gov/publications/fips/fips197/fips-197.pdf
- [10] Marcelo B. de Barcelos Design Case, "Optimized performance and area implementation of Advanced Encryption Standard in Altera Devices". http://www.inf.ufrgs.br/~panato/artigos/designcon02.pdf
- [11] A. Menezes, P. van Oorschot, and S. Vanstone, Handbook of Applied Cryptography, CRC Press, New York, 1997, pp. 81-83.
- [12] http://www.wordiq.com/definition/Satellite
- [13] http://www.spacestationinfo.com/satellitestypes.htm
- [14] www.buzzle.com/articles/nasa/-satellite-images.html

### Memory-Based Realization of FIR Digital Filter using Look up Table

#### G.Srinivasarao, K.venkateswarlu & Ch.Ravi Kumar

Dept. of ECE, Prakasam Engineering College, Kandukur, India

E-mail: gsrinivasarao443@gmail.com, kolagotlavenkat@gmail.com, ravi\_ece99@yahoo.com

Abstract - FIR filters find a wide variety of signal processing applications. With the demand for higher quality electronics increasing tremendously and many traditional electronic appliances now going digital, there is considerable pressure to increase the quality of the FIR filters used in these products. FIR filter outputs are computed as the product of input sample vectors and filter coefficients. FIR filters mainly consist of multipliers, and higher quality FIR filters need long multipliers, which consume a lot of area and power and introduce high delays, thus becoming the performance bottleneck of the devices in which these filters are used.
In the LUT-multiplier-based approach, memory elements store all the values of the product terms. In this paper we study and analyze different architectures for implementing multipliers using look-up tables and their effect on system performance measures such as delay and area. Our first architecture is a simple implementation of a multiplier using look-up tables, whereas the second architecture reduces the number of memory elements required to half the original count but adds a small amount of glue logic. The third approach uses the dual-port capabilities of memories to reduce the required memory size to less than 12.5% of the original size, thereby achieving lower delays while multiplying large numbers. The LUT-multiplier-based design of a 16-tap FIR filter has been synthesized, and it is found that the proposed LUT-multiplier-based design involves less delay and area than the conventional LUT-based design for the same throughput, with lower latency of implementation.

Keywords - LUT (Look Up Tables), FIR (Finite Impulse Response), DSP, VLSI.

#### I. INTRODUCTION

A finite impulse response (FIR) filter is a type of signal processing filter whose impulse response (or response to any finite-length input) is of finite duration, because it settles to zero in finite time. This is in contrast to infinite impulse response (IIR) filters, which have internal feedback and may continue to respond indefinitely. The FIR digital filter is widely used as a basic tool in various signal processing and image processing applications [1], such as channel equalization in digital communication, noise elimination in signal processing and adaptive noise cancellation in speech processing. Several attempts have been made, and continue to be made, to develop VLSI systems for these filters [2]-[5]. As scaling in silicon devices has progressed over the last four decades, semiconductor memory has become cheaper, faster and more power-efficient.
It has also been found that the transistor packing density of SRAM is not only high, but also increasing much faster than the transistor density of logic devices, as shown in Figure 1.

Fig. 1: Transistor Density in Logic Elements and SRAM

Memory technology has advanced in a wide and diverse manner according to the requirements of different application environments. Radiation-hardened memories for space applications, wide-temperature memories for automotive use, high-reliability memories for biomedical instrumentation, low-power memories for consumer products, and high-speed memories for multimedia applications are under continued development to meet these special needs [6], [7]. Memory-based structures have many other advantages, such as greater potential for high-throughput and reduced-latency implementation, and they are expected to have lower dynamic power consumption than conventional multipliers. The main theme of this work is to present a new approach to LUT-based multiplication, known as the odd-multiple-storage scheme, which reduces the LUT size over that of the conventional design. The look-up-table (LUT)-multiplier-based approach, in which the memory elements store all the possible values of the products of the filter coefficients, could be an area-efficient alternative to the conventional design. Several attempts have been made to reduce the memory space in distributed arithmetic (DA)-based architectures using offset binary coding (OBC) and group distributed techniques. A decomposition scheme is suggested in a recent paper for reducing the memory size of the DA-based implementation of an FIR filter. However, it is observed that the reduction in memory size achieved by such decompositions is accompanied by an increase in latency as well as in the number of adders and latches. Significant work has been done on efficient DA-based computation of filters.
In this paper, we aim at presenting two new approaches for designing the LUT for LUT-multiplier-based implementation, where the memory size is reduced to nearly half of that of the conventional approach. Besides, we find that, instead of direct-form realization, transposed-form realization of the FIR filter is more efficient for the LUT-multiplier-based implementation. In the transposed form, a single segmented-memory core can be used instead of separate memory modules for the individual multiplications, in order to avoid the use of individual encoders for each of those separate modules.

The paper is organized as follows. Section II gives a basic description of the FIR filter. Section III presents the proposed LUT design for memory-based multiplication. Section IV presents the implementation of memory-based structures for the FIR filter using LUT multipliers. Section V presents the simulation and synthesis results. Section VI presents the conclusion.

### II. BASIC DESCRIPTION ABOUT THE FIR FILTER

The output y of a linear time-invariant system is determined by convolving its input signal x with its impulse response b. For a discrete-time FIR filter, the output is a weighted sum of the current and a finite number of previous values of the input. The operation is described by the following equation, which defines the output sequence y[n] in terms of the input sequence x[n]:

$$y[n] = \sum_{i=0}^{N} b_i\, x[n-i]$$

where x[n] is the input signal, y[n] is the output signal, $b_i$ are the filter coefficients that make up the impulse response, and N is the filter order.

The important properties of the FIR filter are as follows:

1. Requires no feedback.
2. Inherently stable.
3. Easy to design.

### III. PROPOSED LUT DESIGN FOR MEMORY BASED MULTIPLICATION

A lookup table is a data structure, usually an array or associative array, often used to replace a runtime computation with a simpler array indexing operation.
The tables may be pre-calculated and stored in static program storage, or calculated as part of a program's initialization phase (memoization). Such a simple application, with definite outputs for every input, is called a look-up table because the memory device simply "looks up" what the output(s) should be for any given combination of input states. The basic principle of memory-based multiplication is depicted in Fig. 2.

Fig. 2: Conventional Memory Based Multiplier

Let A be a fixed coefficient, X an input word to be multiplied by A, and L the word length of X. There are $2^L$ possible values of X; for example, if L is 3 there are 8 possible input values, and for each of them the output is the product of the input and the fixed coefficient. The product word $(A \times X_i)$, for $0 \le X_i \le 2^L - 1$, is stored at the memory location whose address is the same as the binary value of $X_i$, such that if the L-bit binary value of $X_i$ is used as the address for the memory unit, then the corresponding product value is read out from the memory.
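The conventional scheme amounts to a table of $2^L$ precomputed products indexed directly by the input word. A minimal Python sketch is given below; the function names are illustrative assumptions and a Python list stands in for the memory unit.

```python
def build_conventional_lut(A, L):
    """Precompute the product A * X for every L-bit input X (2**L words)."""
    return [A * x for x in range(2 ** L)]


def conventional_multiply(lut, x):
    """Multiplication becomes a single memory read at address x."""
    return lut[x]


if __name__ == "__main__":
    lut = build_conventional_lut(A=13, L=3)
    print(len(lut))                       # number of stored words
    print(conventional_multiply(lut, 5))  # product 13 * 5
```

The table size doubles with every extra input bit, which is the cost that the odd-multiple-storage scheme is intended to reduce.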
TABLE 1 LUT WORDS AND PRODUCT VALUES FOR INPUT WORD LENGTH L=4

| Address $d_2 d_1 d_0$ | Word symbol | Stored value | Input $x_3 x_2 x_1 x_0$ | Product value | # of shifts | Control $s_1$ | Control $s_0$ |
|---|---|---|---|---|---|---|---|
| 000 | P0 | A | 0 0 0 1 | A | 0 | 0 | 0 |
| | | | 0 0 1 0 | $2^1 \times A$ | 1 | 0 | 1 |
| | | | 0 1 0 0 | $2^2 \times A$ | 2 | 1 | 0 |
| | | | 1 0 0 0 | $2^3 \times A$ | 3 | 1 | 1 |
| 001 | P1 | 3A | 0 0 1 1 | 3A | 0 | 0 | 0 |
| | | | 0 1 1 0 | $2^1 \times 3A$ | 1 | 0 | 1 |
| | | | 1 1 0 0 | $2^2 \times 3A$ | 2 | 1 | 0 |
| 010 | P2 | 5A | 0 1 0 1 | 5A | 0 | 0 | 0 |
| | | | 1 0 1 0 | $2^1 \times 5A$ | 1 | 0 | 1 |
| 011 | P3 | 7A | 0 1 1 1 | 7A | 0 | 0 | 0 |
| | | | 1 1 1 0 | $2^1 \times 7A$ | 1 | 0 | 1 |
| 100 | P4 | 9A | 1 0 0 1 | 9A | 0 | 0 | 0 |
| 101 | P5 | 11A | 1 0 1 1 | 11A | 0 | 0 | 0 |
| 110 | P6 | 13A | 1 1 0 1 | 13A | 0 | 0 | 0 |
| 111 | P7 | 15A | 1 1 1 1 | 15A | 0 | 0 | 0 |

In the conventional implementation of memory-based multiplication, a look-up table of $2^L$ words is needed, holding the precomputed product values corresponding to every value of X. It has been shown in [1] that only $2^L/2$ words, corresponding to the odd multiples of A, need to be stored in the LUT.

Fig. 3: The Proposed LUT-based Multiplier

One of the remaining product words is zero, and the other $(2^L/2 - 1)$ are even multiples of A, which are derived by left-shift operations on the corresponding odd multiples of A, as shown in Table 1. $s_0$ and $s_1$ are the control bits of the logarithmic barrel shifter. The barrel shifter produces a maximum of (L-1) left shifts. An encoder maps the L-bit input words to (L-1)-bit LUT addresses, and the barrel shifter is controlled by a control circuit.

### A) PROPOSED LUT-BASED MULTIPLIER FOR 4-BIT INPUT

The proposed LUT-based multiplier for L=4 is shown in the figure below.
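The odd-multiple-storage scheme of Table 1 can be sketched in software as follows. This is a hedged illustration: the address and shift count are obtained arithmetically by stripping trailing zero bits, standing in for the encoder and control circuit of the hardware, and a zero input is handled separately in the way the RESET signal is described as doing. All names are assumptions made for the sketch.

```python
def build_odd_lut(A, L=4):
    """Store only the odd multiples A, 3A, ..., (2**L - 1)A: 2**L / 2 words."""
    return [A * (2 * k + 1) for k in range(2 ** (L - 1))]


def encode(x):
    """Return (address, shifts) for a nonzero input word x, as in Table 1."""
    shifts = 0
    while x % 2 == 0:        # each trailing zero bit is one left shift
        x //= 2
        shifts += 1
    return (x - 1) // 2, shifts   # odd value 2k+1 is stored at address k


def odd_lut_multiply(lut, x):
    """Multiply by the fixed coefficient using the half-size LUT and a shift."""
    if x == 0:               # zero input: output forced to zero (RESET case)
        return 0
    address, shifts = encode(x)
    return lut[address] << shifts   # barrel-shifter step


if __name__ == "__main__":
    lut = build_odd_lut(A=25)
    print(encode(0b1100))                  # address and shift count for input 1100
    print(odd_lut_multiply(lut, 0b1100))   # product 25 * 12
```

For L = 4 the table holds 8 words instead of 16, at the cost of the encoder, the shifter and the zero-detection logic.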
The multiplier consists of a memory array of eight words of (W+4) bits each, a 3-to-8 line address decoder, an AND cell, a barrel shifter, a 4-to-3 bit input encoder, and a control circuit that generates the control word ($s_0$, $s_1$) for the barrel shifter and the RESET signal for the AND cell. The 4-to-3 bit input encoder is shown in Fig. 4. It takes the four-bit input word ($x_3 x_2 x_1 x_0$) and produces the 3-bit address word ($d_2 d_1 d_0$) according to the following relations, where ×, + and ~ denote logical AND, OR and NOT, respectively:

$$d_0 = \sim ((\sim(x_0 \times x_1)) \times (\sim(x_1 \times x_2)) \times (x_0 + (\sim(x_2 \times x_3)))) \quad (1)$$

$$d_1 = \sim ((\sim(x_0 \times x_2)) \times (x_0 + (\sim(x_1 \times x_3)))) \quad (2)$$

$$d_2 = x_0 \times x_3 \quad (3)$$

The precomputed odd multiples are stored as $P_i$, for i = 0, 1, 2, ..., 7, at consecutive locations of the memory array. The decoder takes the 3-bit address from the input encoder and generates the 8 word-select signals of the memory array. When the input is even, the LUT output has to be shifted to the left by one or more locations. From the control circuit shown in Fig. 5 we have

$$s_0 = \sim (x_0 + (\sim (x_1 + (\sim x_2)))) \quad (4)$$

$$s_1 = \sim (x_0 + x_1) \quad (5)$$

The number of shifts to be performed depends on the control circuit bits. By observing the control bits and the corresponding number of shifts in Table 1, we can conclude that whenever $s_0 = 0$ no shift is contributed by $s_0$, and whenever $s_0 = 1$ one left shift is required. Similarly, whenever $s_1 = 0$ no shift is contributed by $s_1$, and whenever $s_1 = 1$ two left shifts are required. Hence the required number of shifts is obtained from the combination of these control bits, which are given to the barrel shifter.

Fig. 4: The 4-to-3 bits input Encoder

Fig. 5: Control Circuit

#### B) PROPOSED LUT-BASED MULTIPLIER FOR 8-BIT INPUT

Multiplication of an 8-bit input by a W-bit fixed coefficient can be performed through a pair of multiplications using a dual-port memory of 8 words (or two single-port memory units) along with a pair of encoders, AND cells and barrel shifters, as shown in Fig. 6. The shift-adder left-shifts the output of the barrel shifter corresponding to the more significant half of the input by four bit locations, and adds it to the output of the other barrel shifter.

Fig. 6: Memory Based Multiplier using Dual Port Memory Array.

## IV. IMPLEMENTATION OF MEMORY BASED STRUCTURES FOR FIR FILTERS USING MULTIPLIERS

This section describes the structure of the memory-based realization of an N-tap filter. For an N-tap filter the input-output relationship is given by

$$y(n) = h(0)\,x(n) + h(1)\,x(n-1) + h(2)\,x(n-2) + \cdots + h(N-1)\,x(n-N+1) \quad (6)$$

where h(n) is the filter coefficient for n = 0, 1, ..., N-1, x(n-i) is the input sample for i = 0, 1, ..., N-1, and y(n) is the current output. The filter is implemented for both signed and unsigned operands. If x(n) is unsigned and the filter coefficient h(n) is signed, then

$$y(n) = \mathrm{sign}(0)\,|h(0)|\,x(n) + \mathrm{sign}(1)\,|h(1)|\,x(n-1) + \cdots + \mathrm{sign}(N-1)\,|h(N-1)|\,x(n-N+1) \quad (7)$$

Here h(n) = sign(n)·|h(n)|, for n = 0, 1, ..., N-1. The above equation may then be written in a recursive form as

$$y(n) = \mathrm{sign}(0)\,|h(0)|\,x(n) + D\big(\mathrm{sign}(1)\,|h(1)|\,x(n) + D\big(\mathrm{sign}(2)\,|h(2)|\,x(n) + \cdots + D\big(\mathrm{sign}(N-1)\,|h(N-1)|\,x(n)\big)\cdots\big)\big)$$

D stands for the delay operator, such that D x(n-i) = x(n-i-1), for i = 1, 2, ..., N-1. The recursive computation of the FIR filter output according to the above equation is represented by the transposed-form data-flow graph (DFG) shown in Fig. 7.

Fig. 7: The DFG of recursive computation of FIR filter

#### A. MEMORY-BASED FIR FILTER USING CONVENTIONAL LUT

The FIR filter output is represented by the transposed-form data-flow graph shown in Fig. 7. It consists of N multiplication nodes and (N-1) add-subtract nodes.
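The recursive (transposed-form) computation, together with the 8-bit multiplication through two 4-bit halves and a shift-add step, can be sketched in Python as follows. This is an illustrative model and not the synthesized design: a plain 16-word table per coefficient is assumed for simplicity (rebuilt on each call purely for clarity), the input samples are assumed to be unsigned 8-bit values, and all names are hypothetical.

```python
def lut_multiply_8bit(magnitude, x):
    """Multiply an unsigned 8-bit x by |h| using 4-bit look-ups and a shift-add."""
    lut = [magnitude * v for v in range(16)]   # products for one 4-bit sub-word
    low, high = x & 0xF, (x >> 4) & 0xF
    return (lut[high] << 4) + lut[low]         # shift-add cell


def fir_transposed(h, samples):
    """Transposed-form FIR: every tap multiplies the current sample x(n)."""
    n_taps = len(h)
    regs = [0] * (n_taps - 1)                  # delay registers between nodes
    outputs = []
    for x in samples:
        # sign(k) * |h(k)| * x(n): add or subtract according to the sign
        prods = [(-1 if c < 0 else 1) * lut_multiply_8bit(abs(c), x) for c in h]
        chain = regs + [0]                     # delayed value reaching each node
        sums = [prods[k] + chain[k] for k in range(n_taps)]
        outputs.append(sums[0])                # y(n) leaves the first node
        regs = sums[1:]                        # the rest are delayed by one cycle
    return outputs


if __name__ == "__main__":
    print(fir_transposed([3, -2, 5, 1], [10, 200, 0, 255, 17, 1]))
```

Because every tap sees the same current sample, all look-ups in a cycle share one pair of 4-bit addresses, which is the property that allows a single segmented memory core to serve all the multiplications.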
Each multiplication node performs the multiplication of an input sample by a filter coefficient. Each add-subtract node adds its input from the top to, or subtracts it from, its input from the left, depending on whether the corresponding filter coefficient is positive or negative. A fully pipelined structure of an N-tap FIR filter for input word length L=8 is shown in Fig. 8. It consists of N memory units, (N-1) add-subtract cells and delay registers. In each cycle all 8 bits of the current input sample are fed to all the LUT multipliers as a pair of 4-bit addresses X1 and X2. Each shift-add cell shifts its right input to the left by four bit locations and adds the shifted value to its other input to produce its output. The outputs of the multipliers are fed to the add-subtract cells in parallel, and each cell performs the same operation.

Fig. 8: Conventional LUT Multiplier Based Structure of an N tap Transposed

### B. MEMORY-BASED FIR FILTER USING PROPOSED LUT DESIGN

The main difference from the conventional structure is that the conventional LUTs are replaced by the proposed odd-multiple-storage LUT and all the multiplications are implemented by a single memory module, so that the hardware complexity of individual decoder circuits can be eliminated. The proposed structure consists of a single memory module, an array of N shift-add cells, (N-1) add-subtract cells and delay registers.

Fig. 9: The Dual Port Segmented Memory Core for the N<sup>th</sup> order FIR Filter

The dual-port memory core of Fig. 9 consists of a [8×(W+4)]×N array of bit-level memory elements arranged in 8 rows of (W+4)N-bit width. Each row consists of N segments, where each segment is (W+4) bits wide. During each cycle a pair of 4-bit sub-words X1 and X2 is derived from the input sample x(n) and fed through the 4-to-3 bit encoders and control circuits, which produce two sets of word-select signals, a pair of control signals and two pairs of RESET signals.
All these signals are fed to the dual-port memory, which produces N pairs of outputs. These are fed to N pairs of barrel shifters through the AND cells, and the required FIR filter output is then obtained.

#### V. SIMULATION AND SYNTHESIS RESULTS

Simulation is used to verify the design, and it is the first step after design and coding are done. It is purely a software activity in which the design is verified using a simulator such as ModelSim. This step is also called functional simulation. In other words, simulation checks the expected logical functionality of the hardware without considering actual timing issues, i.e. net delays and circuit delays. The simulation results and the netlist simulation are verified for each module. The ModelSim simulator is used to simulate the design. The simulation results for each block are shown below along with the signal description. The difference between the conventional method and the proposed method is shown in Table 2.

TABLE 2 COMPARISON BETWEEN THE TWO METHODS

| Component | Conventional LUT | Proposed LUT |
|---|---|---|
| Number of slices | 1894 | 1444 |
| Number of flip-flops | 1335 | 999 |
| Number of LUTs | 3232 | 2510 |

#### VI. CONCLUSION

These approaches to LUT-based multiplication are suggested to reduce the LUT size over that of the conventional design. With the odd-multiple-storage scheme, for address length 4, the LUT size is reduced to half by using a barrel shifter and (W+4) AND gates, where W is the word length of the fixed multiplying coefficients. The throughput is the same for both methods. However, the proposed LUT-multiplier-based design involves half the memory of the conventional LUT-based design. The LUT-multiplier-based design of the FIR filter could therefore be more efficient than the conventional approach in terms of area complexity for a given throughput, with lower latency of implementation.
The LUT multipliers could also be used for the implementation of cyclic and linear convolutions, sinusoidal transforms, and inner-product computation.

#### SIMULATION AND SYNTHESIS RESULTS OF CONVENTIONAL LUT AND PROPOSED LUT FIR FILTERS

Fig. 10: Simulation Results of conventional LUT FIR Filter

Fig. 11: Synthesis Report of conventional LUT FIR Filter

Fig. 12: Simulation Results of proposed LUT FIR Filter

Fig. 13: Synthesis Report of proposed LUT FIR Filter

#### REFERENCES

- [1] P. K. Meher, "New approach to LUT implementation and accumulation for memory-based multiplication," in Proc. 2009 IEEE Int. Symp. Circuits Syst., ISCAS'09, May 2009, pp. 453–456.
- [2] K. K. Parhi, VLSI Digital Signal Processing Systems: Design and Implementation. New York: Wiley, 1999.
- [3] H. H. Kha, H. D. Tuan, B.-N. Vo, and T. Q. Nguyen, "Symmetric orthogonal complex-valued filter bank design by semidefinite programming," IEEE Trans. Signal Process., vol. 55, no. 9, pp. 4405–4414, Sep. 2007.
- [4] H. H. Dam, A. Cantoni, K. L. Teo, and S. Nordholm, "FIR variable digital filter with signed power-of-two coefficients," IEEE Trans. Circuits Syst. I, Reg. Papers, vol. 54, no. 6, pp. 1348–1357, Jun. 2007.
- [5] R. Mahesh and A. P. Vinod, "A new common subexpression elimination algorithm for realizing low-complexity higher order digital filters," IEEE Trans. Computer-Aided Des. Integr. Circuits Syst., vol. 27, no. 2, pp. 217–229, Feb. 2008.
- [6] B. Prince, "Trends in scaled and nanotechnology memories," in Proc. IEEE Conf. Custom Integr. Circuits, Nov. 2005.
- [7] K. Itoh, S. Kimura, and T. Sakata, "VLSI memory technology: Current status and future trends," in Proc. 25th Eur. Solid-State Circuits Conference (ESSCIRC'99), Sep. 1999, pp. 3–10.
- [8] J. G. Proakis and D. G. Manolakis, Digital Signal Processing: Principles, Algorithms and Applications. Upper Saddle River, NJ: Prentice-Hall, 1996.