Distributed Computing in Sensor Systems: 5th IEEE International Conference, DCOSS 2009, Marina del Rey, CA, USA, June 8-10, 2009, Proceedings (Lecture ... Networks and Telecommunications)


Lecture Notes in Computer Science Commenced Publication in 1973 Founding and Former Series Editors: Gerhard Goos, Juris Hartmanis, and Jan van Leeuwen

Editorial Board

David Hutchison, Lancaster University, UK
Takeo Kanade, Carnegie Mellon University, Pittsburgh, PA, USA
Josef Kittler, University of Surrey, Guildford, UK
Jon M. Kleinberg, Cornell University, Ithaca, NY, USA
Alfred Kobsa, University of California, Irvine, CA, USA
Friedemann Mattern, ETH Zurich, Switzerland
John C. Mitchell, Stanford University, CA, USA
Moni Naor, Weizmann Institute of Science, Rehovot, Israel
Oscar Nierstrasz, University of Bern, Switzerland
C. Pandu Rangan, Indian Institute of Technology, Madras, India
Bernhard Steffen, University of Dortmund, Germany
Madhu Sudan, Massachusetts Institute of Technology, MA, USA
Demetri Terzopoulos, University of California, Los Angeles, CA, USA
Doug Tygar, University of California, Berkeley, CA, USA
Gerhard Weikum, Max-Planck Institute of Computer Science, Saarbruecken, Germany

5516

Bhaskar Krishnamachari Subhash Suri Wendi Heinzelman Urbashi Mitra (Eds.)

Distributed Computing in Sensor Systems 5th IEEE International Conference, DCOSS 2009 Marina del Rey, CA, USA, June 8-10, 2009 Proceedings


Volume Editors

Bhaskar Krishnamachari
Urbashi Mitra
University of Southern California
Ming Hsieh Department of Electrical Engineering
3740 McClintock Avenue, Los Angeles, CA 90089, USA
E-mail: {bkrishna, ubli}@usc.edu

Subhash Suri
University of California, Department of Computer Science
Santa Barbara, CA 93106, USA
E-mail: [email protected]

Wendi Heinzelman
University of Rochester
Department of Electrical and Computer Engineering
Hopeman 307, Rochester, NY 14627, USA
E-mail: [email protected]

Library of Congress Control Number: Applied for
CR Subject Classification (1998): C.2, H.3.4, I.2.9, C.3, D.4.7
LNCS Sublibrary: SL 5 – Computer Communication Networks and Telecommunications

ISSN 0302-9743
ISBN-10 3-642-02084-4 Springer Berlin Heidelberg New York
ISBN-13 978-3-642-02084-1 Springer Berlin Heidelberg New York

This work is subject to copyright. All rights are reserved, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, re-use of illustrations, recitation, broadcasting, reproduction on microfilms or in any other way, and storage in data banks. Duplication of this publication or parts thereof is permitted only under the provisions of the German Copyright Law of September 9, 1965, in its current version, and permission for use must always be obtained from Springer. Violations are liable to prosecution under the German Copyright Law.

springer.com

© Springer-Verlag Berlin Heidelberg 2009
Printed in Germany

Typesetting: Camera-ready by author, data conversion by Scientific Publishing Services, Chennai, India
Printed on acid-free paper
SPIN: 12689422 06/3180 5 4 3 2 1 0

Message from the General Co-chairs and Vice General Co-chair

We are pleased to present the proceedings of DCOSS 2009, the IEEE International Conference on Distributed Computing in Sensor Systems, the fifth event in this annual conference series. The DCOSS meeting series covers the key aspects of distributed computing in sensor systems, such as high-level abstractions, computational models, systematic design methodologies, algorithms, tools and applications. This meeting would not be possible without the tireless efforts of many volunteers. We are indebted to the DCOSS 2009 Program Chair, Bhaskar Krishnamachari, for overseeing the review process, composing the technical program and making the local arrangements. We appreciate his leadership in putting together a strong and diverse Program Committee covering various aspects of this multidisciplinary research area. We would like to thank the Program Committee Vice Chairs, Subhash Suri, Wendi Heinzelman, and Urbashi Mitra, as well as the members of the Program Committee (PC), the external referees consulted by the PC, and all of the authors who submitted their work to DCOSS 2009. We also wish to thank the keynote speakers for their participation in the meeting. Special thanks go to Anita La Salle, NSF, for organizing a session on funding opportunities relevant to the broader community. Several volunteers contributed significantly to the realization of the meeting. We wish to thank the organizers of the workshops collocated with DCOSS 2009 as well as the DCOSS Workshops Chair, Sotiris Nikoletseas, for coordinating workshop activities. We would like to thank Stefano Basagni and Srdjan Krco for their efforts in organizing the Poster Session and Demo Session, respectively. Special thanks go to Qun Li for organizing the Work-in-Progress Session, to Neal Patwari for handling conference publicity, to Animesh Pathak for maintaining the website, and to Zachary Baker for his assistance in putting together this proceedings volume.
Many thanks also go to Germaine Gusthiot for handling the conference finances. We would like to especially thank Jose Rolim, DCOSS Steering Committee Chair, for inviting us to be the General Chairs. His invaluable input in shaping this conference series and his timely contributions in resolving meeting-related issues are gratefully acknowledged. Finally, we would like to acknowledge the sponsors of DCOSS 2009. Their contributions are a key enabler of a successful conference. The research area of sensor networks is rapidly evolving, influenced by fascinating advances in supporting technologies. We sincerely hope that this conference series will serve as a forum for researchers working in different, complementary aspects of this multidisciplinary field to exchange ideas, interact, and cross-fertilize research in the algorithmic and foundational aspects, high-level approaches as well as more applied and technology-related issues regarding tools and applications of wireless sensor networks.

June 2009

Jie Wu
Viktor K. Prasanna
Ivan Stojmenovic

Message from the Program Chair

This proceedings volume includes the accepted papers of the 5th IEEE International Conference on Distributed Computing in Sensor Systems. This year we introduced some changes in the composition of the three tracks to increase cross-disciplinary interactions. The Algorithms track was enhanced to include topics pertaining to performance analysis and network optimization and renamed “Algorithms and Analysis.” The Systems and Applications tracks, previously separate, were combined into a single track. A new track was also introduced on “Signal Processing and Information Theory.” DCOSS 2009 received 116 submissions for the three tracks. After a thorough review process, in which at least three reviews were solicited for all papers, a total of 26 papers were accepted. The research contributions in this volume span many aspects of sensor systems, including energy-efficient mechanisms, tracking and surveillance, activity recognition, simulation, query optimization, network coding, localization, application development, and data and code dissemination. Based on the reviews, we also identified the best paper from each track, as follows: Best paper in the Algorithms and Analysis track: “Efficient Sensor Placement for Surveillance Problems” by Pankaj Agarwal, Esther Ezra and Shashidhara Ganjugunte. Best paper in the Applications and Systems track: “Optimal Allocation of Time-Resources for Multihypothesis Activity-Level Detection,” by Gautam Thatte, Viktor Rozgic, Ming Li, Sabyasachi Ghosh, Urbashi Mitra, Shri Narayanan, Murali Annavaram and Donna Spruijt-Metz. Best paper in the Signal Processing and Information Theory track: “Distributed Computation of Likelihood Maps for Target Tracking” by Jonathan Gallagher, Randolph Moses and Emre Ertin.
I would like to offer sincere thanks to the three Program Vice Chairs, Subhash Suri (Algorithms and Analysis), Wendi Heinzelman (Applications and Systems), and Urbashi Mitra (Signal Processing and Information Theory) for recruiting an amazing collection of first-rate researchers to the Technical Program Committee (TPC) and for working hard to lead the review process for each track. My heartfelt thanks also to all TPC members and the external reviewers who worked with them. I wish to thank the Steering Committee Chair Jose Rolim, the DCOSS 2009 General Chairs Jie Wu and Viktor Prasanna, and Vice General Chair Ivan Stojmenovic for their trust and assistance. I would also very much like to thank the Publicity Chair, Neal Patwari, and the Web Chair, Animesh Pathak, for their invaluable help with publicity, and the Proceedings Chair, Zachary Baker, for his tireless efforts in preparing these proceedings. June 2009

Bhaskar Krishnamachari

Organization

General Co-chairs
Viktor K. Prasanna, University of Southern California, USA
Jie Wu, Florida Atlantic University, USA

Vice General Chair
Ivan Stojmenovic, University of Ottawa, Canada

Program Chair
Bhaskar Krishnamachari, University of Southern California, USA

Program Vice Chairs
Algorithms and Performance Analysis:
Subhash Suri, University of California at Santa Barbara, USA
Applications and Systems:
Wendi Heinzelman, University of Rochester, USA
Signal Processing and Information Theory:
Urbashi Mitra, University of Southern California, USA

Steering Committee Chair
Jose Rolim, University of Geneva, Switzerland

Steering Committee
Sajal Das, University of Texas at Arlington, USA
Josep Diaz, UPC Barcelona, Spain
Deborah Estrin, University of California, Los Angeles, USA
Phillip B. Gibbons, Intel Research, Pittsburgh, USA
Sotiris Nikoletseas, University of Patras and CTI, Greece
Christos Papadimitriou, University of California, Berkeley, USA
Kris Pister, University of California, Berkeley, and Dust, Inc., USA
Viktor Prasanna, University of Southern California, Los Angeles, USA

Poster Chair
Stefano Basagni, Northeastern University, USA

Workshops Chair
Sotiris Nikoletseas, University of Patras and CTI, Greece

Proceedings Chair
Zachary Baker, Los Alamos National Lab, USA

Publicity Chair
Neal Patwari, University of Utah, USA

Web Publicity Chair
Animesh Pathak, University of Southern California, USA

Finance Chair
Germaine Gusthiot, University of Geneva, Switzerland

Work in Progress Chair
Qun Li, College of William and Mary, USA

Demo Chair
Srdjan Krco, Ericsson, Ireland

Sponsoring Organizations
IEEE Computer Society Technical Committee on Parallel Processing (TCPP)
IEEE Computer Society Technical Committee on Distributed Processing (TCDP)

Held in Cooperation with
ACM Special Interest Group on Computer Architecture (SIGARCH)
ACM Special Interest Group on Embedded Systems (SIGBED)
European Association for Theoretical Computer Science (EATCS)
IFIP WG 10.3


Program Committee

Algorithms
Pankaj Agarwal, Duke University, USA
Jim Aspnes, Yale University, USA
Jit Bose, Carleton University, Canada
Chiranjeeb Buragohain, Amazon, USA
Carla Chiasserini, Politecnico di Torino, Italy
Sandor Fekete, TU Braunschweig, Germany
Jie Gao, Stony Brook, USA
Martin Haenggi, University of Notre Dame, USA
Ravi Kannan, Microsoft, India
Samir Khuller, University of Maryland, USA
Steve LaValle, UIUC, USA
Mingyan Liu, University of Michigan, USA
Paolo Santi, University of Pisa, Italy
Sanjay Shakkottai, University of Texas at Austin, USA
Peter Widmayer, ETH, Switzerland
Michele Zorzi, University of Padova, Italy
Gil Zussman, Columbia University, USA

Applications and Systems
Tarek Abdelzaher, UIUC, USA
Nael Abu-Ghazaleh, Binghamton University, USA
Stefano Basagni, Northeastern University, USA
Nirupama Bulusu, Portland State University, USA
Tracy Camp, Colorado School of Mines, USA
Andrew Campbell, Dartmouth College, USA
Mihaela Cardei, Florida Atlantic University, USA
Eylem Ekici, Ohio State University, USA
Salil Kanhere, University of New South Wales, Australia
Holger Karl, University of Paderborn, Germany
Dilip Krishnaswamy, Qualcomm, USA
Cecilia Mascolo, Cambridge University, England
Chiara Petrioli, University of Rome “La Sapienza”, Italy
Joe Polastre, Sentilla, USA
Andreas Savvides, Yale University, USA
Curt Schurgers, UCSD, USA
Rajeev Shorey, GM Research, India
Krishna Sivalingam, UMBC, USA
Violet Syrotiuk, Arizona State University, USA
Sameer Tilak, San Diego Supercomputer Center, UCSD, USA
Cedric Westphal, DoCoMo Labs, USA
Ying Zhang, Xerox PARC, USA


Signal Processing and Information Theory
Jean-Francois Chamberland, Texas A&M University, USA
Alex Dimakis, California Institute of Technology, USA
Stark Draper, University of Wisconsin, Madison, USA
Emre Ertin, Ohio State University, USA
John Fisher, Massachusetts Institute of Technology, USA
Massimo Franceschetti, University of California, San Diego, USA
Michael Gastpar, University of California, Berkeley, USA
Dennis Goeckel, University of Massachusetts, Amherst, USA
Prakash Ishwar, Boston University, USA
Tom Luo, University of Minnesota, USA
Ivana Maric, Stanford University, USA
Robert Nowak, University of Wisconsin, Madison, USA
Antonio Ortega, University of Southern California, USA
Ashu Sabharwal, Rice University, USA
Anna Scaglione, University of California, Davis, USA
Aaron Wagner, Cornell University, USA
Qing Zhao, University of California, Davis, USA

Additional Reviewers
Shuchin Aeron, Ebad Ahmed, Michele Albano, Ahmed Badi, Suman Banerjee, Burchan Bayazit, Lorenzo Bergamini, Piotr Berman, Jeff Boleng, Nicola Bui, Paolo Casari, Riccardo Crepaldi, Bastian Degener, Sarang Deshpande, Vida Dujmovic, Alon Efrat, Lars Erickson, Seth Gilbert, Leana Golubchik, Navin Goyal, Chris Gray, Bo Han, Bracha Hod, Bastian Katz, Yoo-Ah Kim, Athanasios Kinalis, Arun Kumar, Stuart Kurkowski, Nicholas D. Lane, Sol Lederer, Marco Levorato, Feng Li, Jian Li, Hong Lu, Nan Ma, Chris Merlin, Emiliano Miluzzo, David Mount, Rafael Murrieta, Gireesan Namboothiri, Michele Nati, Calvin Newport, Aris Papadopoulos, Svetlana Radosavac, Balaji Raghavachari, Utz Roedig, Jose Rolim, Michele Rossi, Rik Sarkar, Luca Schenato, Christiane Schmidt, Yuan Shen, Hanan Shpungin, Sreekanth Sreekumaran, Jukka Suomela, Kamilah Taylor, Benjamin Tovar, Anitha Varghese, Andrea Vitaletti, Ye Wang, Yawen Wei, Zixiang Xiong, Yinying Yang, Shuhui Yang, Yuan Yao, Jingjin Yu, Dengpan Zhou, Xianjin Zhu, Artur Ziviani, Francesco Zorzi

Table of Contents

Speed Dating Despite Jammers . . . . . . . . . . 1
Dominic Meier, Yvonne Anne Pignolet, Stefan Schmid, and Roger Wattenhofer

Fast Self-stabilization for Gradients . . . . . . . . . . 15
Jacob Beal, Jonathan Bachrach, Dan Vickery, and Mark Tobenkin

Detection and Localization Sensor Assignment with Exact and Fuzzy Locations . . . . . . . . . . 28
Hosam Rowaihy, Matthew P. Johnson, Diego Pizzocaro, Amotz Bar-Noy, Lance Kaplan, Thomas La Porta, and Alun Preece

Minimum Variance Energy Allocation for a Solar-Powered Sensor System . . . . . . . . . . 44
Dong Kun Noh, Lili Wang, Yong Yang, Hieu Khac Le, and Tarek Abdelzaher

Optimal Rate Allocation of Compressed Data Streams in Multihop Sensor Networks . . . . . . . . . . 58
Chun Lung Lin and Jia Shung Wang

Mote-Based Online Anomaly Detection Using Echo State Networks . . . . . . . . . . 72
Marcus Chang, Andreas Terzis, and Philippe Bonnet

Adaptive In-Network Processing for Bandwidth and Energy Constrained Mission-Oriented Multi-hop Wireless Networks . . . . . . . . . . 87
Sharanya Eswaran, Matthew Johnson, Archan Misra, and Thomas La Porta

LazySync: A New Synchronization Scheme for Distributed Simulation of Sensor Networks . . . . . . . . . . 103
Zhong-Yi Jin and Rajesh Gupta

Similarity Based Optimization for Multiple Query Processing in Wireless Sensor Networks . . . . . . . . . . 117
Hui Ling and Taieb Znati

Finding Symbolic Bug Patterns in Sensor Networks . . . . . . . . . . 131
Mohammad Maifi Hasan Khan, Tarek Abdelzaher, Jiawei Han, and Hossein Ahmadi

Distributed Continuous Action Recognition Using a Hidden Markov Model in Body Sensor Networks . . . . . . . . . . 145
Eric Guenterberg, Hassan Ghasemzadeh, Vitali Loseu, and Roozbeh Jafari

Online Coding for Reliable Data Transfer in Lossy Wireless Sensor Networks . . . . . . . . . . 159
Anthony D. Wood and John A. Stankovic

Compressed RF Tomography for Wireless Sensor Networks: Centralized and Decentralized Approaches . . . . . . . . . . 173
Mohammad A. Kanso and Michael G. Rabbat

Energy Adaptive Sensor Scheduling for Noisy Sensor Measurements . . . . . . . . . . 187
Suhinthan Maheswararajah, Siddeswara Mayura Guru, Yanfeng Shu, and Saman Halgamuge

Route in Mobile WSN and Get Self-deployment for Free . . . . . . . . . . 201
Kévin Huguenin, Anne-Marie Kermarrec, and Eric Fleury

A Sensor Network System for Measuring Traffic in Short-Term Construction Work Zones . . . . . . . . . . 216
Manohar Bathula, Mehrdad Ramezanali, Ishu Pradhan, Nilesh Patel, Joe Gotschall, and Nigamanth Sridhar

Empirical Evaluation of Wireless Underground-to-Underground Communication in Wireless Underground Sensor Networks . . . . . . . . . . 231
Agnelo R. Silva and Mehmet C. Vuran

Cheap or Flexible Sensor Coverage . . . . . . . . . . 245
Amotz Bar-Noy, Theodore Brown, Matthew P. Johnson, and Ou Liu

MCP: An Energy-Efficient Code Distribution Protocol for Multi-Application WSNs . . . . . . . . . . 259
Weijia Li, Youtao Zhang, and Bruce Childers

Optimal Allocation of Time-Resources for Multihypothesis Activity-Level Detection . . . . . . . . . . 273
Gautam Thatte, Viktor Rozgic, Ming Li, Sabyasachi Ghosh, Urbashi Mitra, Shri Narayanan, Murali Annavaram, and Donna Spruijt-Metz

Distributed Computation of Likelihood Maps for Target Tracking . . . . . . . . . . 287
Jonathan Gallagher, Randolph Moses, and Emre Ertin

Efficient Sensor Placement for Surveillance Problems . . . . . . . . . . 301
Pankaj K. Agarwal, Esther Ezra, and Shashidhara K. Ganjugunte

Local Construction of Spanners in the 3-D Space . . . . . . . . . . 315
Iyad A. Kanj, Ge Xia, and Fenghui Zhang

Combining Positioning and Communication Using UWB Transceivers . . . . . . . . . . 329
Paul Alcock, Utz Roedig, and Mike Hazas

Distributed Generation of a Family of Connected Dominating Sets in Wireless Sensor Networks . . . . . . . . . . 343
Kamrul Islam, Selim G. Akl, and Henk Meijer

Performance of Bulk Data Dissemination in Wireless Sensor Networks . . . . . . . . . . 356
Wei Dong, Chun Chen, Xue Liu, Jiajun Bu, and Yunhao Liu

Author Index . . . . . . . . . . 371

Speed Dating Despite Jammers

Dominic Meier¹, Yvonne Anne Pignolet¹, Stefan Schmid², and Roger Wattenhofer¹

¹ Computer Engineering and Networks Laboratory, ETH Zurich, Switzerland
[email protected], [email protected], [email protected]
² Chair for Efficient Algorithms, Technical University of Munich, Germany
[email protected]

Abstract. Many wireless standards and protocols today, such as WLAN and Bluetooth, operate on similar frequency bands. While this permits an efficient usage of the limited medium capacity, transmissions of nodes running different protocols can interfere. This paper studies how to design node discovery algorithms for wireless multichannel networks which are robust against contending protocols on the shared medium. We pursue a conservative approach and consider a Byzantine adversary who prevents the communication of our protocol on t channels in a worst-case fashion. Our model also captures disruptions controlled by an adversarial jammer. This paper presents algorithms for scenarios where t is not known. The analytical findings are complemented by simulations providing evidence that the proposed protocols perform well in practice.

1 Introduction

Wireless networks are ubiquitous and have become indispensable for many tasks of our daily lives. Due to the limited range of frequencies available for communication between wireless nodes such as laptops, PDAs or mobile phones, many wireless standards and protocols today operate on the same frequency bands, e.g., the ISM bands. One well-known and widely discussed example is WLAN and Bluetooth (i.e., IEEE 802.15.2), but there are many others. Such contending access of different protocols to the shared medium leads to collisions. While ultra wide band technology may mitigate this problem and reduce interference, it is not always available or desirable. This raises the question of how to devise protocols which are robust against transmissions of other protocols by design. In this paper, we seek to shed light onto this question. We adopt a conservative approach and assume that a Byzantine adversary can disturb our algorithms in an arbitrary manner. This model comprises scenarios where an adversarial jammer seeks to slow down communication or even to stop it completely. Such jamming attacks are a particularly cumbersome problem today: typically, a jamming attack does not require any special hardware and is hence simple and cheap.

This paper focuses on networks without a fixed infrastructure, such as MANETs or sensor networks, which are organized in an ad hoc manner. A fundamental operation in dynamic ad hoc networks is the search for potential communication partners. In some sense, this operation is more difficult than other communication tasks, as the nodes do not have any information about each other a priori. Besides the lack of information on either hardware or medium access addresses of other nodes, concurrent transmissions (either of other nodes running the same protocol or other radio transmissions) lead to collisions and interference. In addition, by injecting a high level of noise, a jammer can slow down wireless communication significantly. Once two nodes have met, they may agree on certain communication or channel-hopping patterns (e.g., based on their medium access addresses) facilitating efficient interactions in the future. Thus, it is of utmost importance to solve this task as fast as possible. A well-known existing protocol dealing with this problem is Bluetooth. It specifies an asymmetric way to connect and exchange information between devices such as mobile phones, cameras, GPS receivers or video game consoles. As a consequence, Bluetooth can be used to synchronize two devices as soon as they are within each other's transmission range, or to display the availability of a printer. Clearly, the device discovery time is highly relevant in these situations.

We study the problem of discovering communication partners in multi-channel networks, despite the presence of a Byzantine adversary. Concretely, we assume that the adversary corrupts t out of m available channels. We say that two nodes have successfully discovered each other if and only if the two nodes are on the same channel, one transmitting, one receiving, there is no other node transmitting on this channel, and the channel is not jammed. In reality, nodes typically do not know whether, and how many, channels are corrupted. The goal of this paper is to devise algorithms solving the discovery problem efficiently without knowledge of t. We require that nodes are discovered very fast if t is small, and that the performance of the discovery algorithm degrades gracefully with increasing t. In other words, we want algorithms (oblivious to t) that are competitive with a discovery algorithm knowing t. Our main contributions are fast discovery algorithms performing well without knowledge of t and despite Byzantine disruptions. In particular, we describe a randomized algorithm which, in expectation, is at most a factor of O(log² m) slower than the best algorithm knowing t, for any t. We prove this to be optimal in the sense that this is the best ratio an algorithm that can be described by a probability distribution over the available channels can achieve. In addition, we study a scenario where the jammer chooses t according to a probability density function (PDF) which is known to the device discovery algorithm. Furthermore, our paper discusses how to extend our results to a multiplayer setting. In order to complement our formal analysis, we investigate the performance of our algorithms by in silico experiments.

B. Krishnamachari et al. (Eds.): DCOSS 2009, LNCS 5516, pp. 1–14, 2009. © Springer-Verlag Berlin Heidelberg 2009

2 Related Work

With the increasing popularity of wireless networks such as WLANs or sensor networks, security aspects and quality of service become more relevant. One important source of disruptions is the transmissions of other devices using different protocols. One widely studied example is WLAN and Bluetooth. In this case, several possible solutions have been discussed by the IEEE task force 802.15.2 [12], e.g., a non-cooperative coexistence mechanism based on adaptive frequency hopping. The model we study is quite general and comprises many types of disruptions, such as interference [21] or jamming attacks. Resilience to jamming is crucial as jamming attacks can often be performed at low cost, as there is no need for special hardware [4]. For these reasons, the jamming problem in wireless networks is intensively discussed both in practice and in theory [8,16,18,22,23]. While some researchers focus on how such attacks can be performed [24], others concentrate on countermeasures [1]. In [17], it has been shown that using methods based on signal strength and carrier sensing, detecting sophisticated jammers is difficult. Moreover, a method based on packet delivery ratios cannot decide unambiguously whether link problems are due to mobility, congestion or jamming.

The threat of jamming attacks can be mitigated by appropriate physical layer technologies. E.g., spread spectrum techniques can be used, rendering it more difficult to detect the start of a packet fast enough in order to jam it. Unfortunately, one of the most widely deployed protocols, 802.11, has only small spreading factors [4]. In fact, it has recently been shown that the MAC protocol of 802.11 can be attacked by simple and oblivious jammers [5]. Many research projects deal with jammers on the MAC layer. For instance, in [7] a coding scheme for fast frequency hopping is presented. If the adversary does not know the hopping sequence, it can disturb only a subset of transmissions due to energy constraints. Alternative solutions include channel surfing and spatial retreat [24], or mechanisms to hide messages [22].

The jamming problem also raises interesting algorithmic questions. Gilbert et al. [11] investigate the efficiency of an adversary. They find that even the uncertainty introduced by the possibility of adversarial broadcasts is sufficient to slow down many protocols. In [13], a model where the adversary has a limited energy budget is considered; the paper studies how to achieve global broadcasts if the adversary is allowed to spoof addresses. In [19], fault-tolerant broadcasting under probabilistic failures is studied. [9] presents tight bounds for the running time of the ε-gossip problem on multi-channel networks. In [10], Dolev et al. describe a randomized protocol that allows nodes to exchange authenticated messages despite a malicious adversary that can cause collisions and spoof messages. Awerbuch et al. [4] present a MAC protocol for single-hop networks that is provably robust to adaptive adversarial jamming. The jammer can block a (1 − ε) fraction of the time steps, but it has to make decisions before knowing the actions of the nodes for this step. Several algorithms are presented which, e.g., allow to elect a leader in an energy-efficient manner.

In contrast to the work discussed above, we focus on the bootstrap problem where a node has to find other nodes in its range. This device discovery problem has been intensively studied in the literature. In [6], randomized backoff protocols are proposed for a single broadcast channel. However, their algorithms are not directly applicable in wireless networks where, unlike in traditional broadcast systems such as the Ethernet, collisions may not be detectable. In [15], probabilistic protocols for Bluetooth node discovery are investigated, where the nodes seek to establish one-to-one connections. In [2] and [3], protocols for single and multi channel ad hoc networks are described. However, none of these papers attend to (adversarial) disruptions.

3 Model

Suppose we are given a shared medium consisting of m channels c_1, ..., c_m. There may be an adversary with access to the medium. We adopt a worst-case perspective assuming that an adversary always blocks those t < m channels which minimize the discovery time of a given algorithm. We aim at devising discovery protocols that are efficient despite these circumstances. Typically, the number of jammed channels t is not known to the discovery algorithm. Consequently, our main objective is to devise algorithms which are optimal with respect to all t. In other words, an algorithm ALG should solve the node discovery problem efficiently if t is small, and "degrade gracefully" for larger t.

For the analysis of the algorithms we investigate a slotted model where time is divided into synchronized time slots. However, note that all our results hold up to a factor of two in unslotted scenarios as well, due to the standard trick introduced in [20] for the study of slotted vs. unslotted ALOHA. In each time slot, every node can choose one channel and decide whether it wants to listen on this channel or to transmit some information (e.g., its ID or a seed for its hopping pattern sequence) on the channel. We say that two nodes v_1 and v_2 have discovered each other successfully if and only if the three following conditions are met:

1. v_1 and v_2 are on the same channel c,
2. v_1 is in listening mode and v_2 transmits its contact information on c, or vice versa, and
3. channel c is not jammed.

Since nodes cannot know whether there are other nodes in their transmission area, we count the number of time slots until a successful discovery from the point in time when all of them are around (discovery time). In this paper, we mainly constrain ourselves to the two-node case.

The node discovery problem turns out to be difficult to solve if we restrict ourselves to deterministic algorithms. In a scenario where all nodes are identical and do not have anything (e.g., IDs) to break the symmetry, two problems arise even in the absence of a jammer: (1) if two nodes follow a deterministic hopping pattern, they may never be on the same channel in the same slot; (2) even if the nodes meet, choosing deterministically whether to send or listen for announcements in this slot may always yield situations where both nodes send or both nodes listen. One way to break the symmetry is to allow nodes to generate random bits.
Alternatively, one may assume that the two nodes which want to discover each other already share a certain number of bits which are unknown to the jammer. Due to these problems, we focus on randomized algorithms. We assume that every node runs the same algorithm; only decisions based on random experiments differ. We investigate the class of randomized algorithms that can be described by a probability distribution over the channels, i.e., in each round, a channel is selected for communication according to a probability distribution. We strive to find algorithms that perform well for every possible number of jammed channels. To this end, we define a measure that captures the loss of discovery speed due to the lack of knowledge of the number of channels the adversary decides to jam.

Definition 1 (Competitiveness). In a setting with t jammed channels, let T_REF^t be the expected discovery time until two nodes discover each other for an optimal randomized algorithm REF which has complete knowledge of t. Let T_ALG^t be the corresponding expected discovery time of a given algorithm ALG. We define the competitive ratio

ρ := max_{0 ≤ t ≤ m−1} T_ALG^t / T_REF^t.

The smaller the achieved competitive ratio ρ, the more efficient the discovery algorithm.

Speed Dating Despite Jammers


4 Algorithms for Device Discovery

To initiate our analysis, we first consider device discovery algorithms for the case where the total number of jammed channels t is known. Subsequently, our main results are presented together with several optimal algorithms.

4.1 Known t

In our model, a node has to select a channel c and decide whether to send or listen on c in each round. Let us determine the best strategy if t is known. As we will compare our algorithms which do not know t to this strategy, we will call this reference point algorithm REF. For two nodes which have never communicated before, it is best to send or listen with probability 0.5. The following lemma derives the optimal distribution over the channels.

Lemma 1. Let m denote the total number of channels and assume t, the number of jammed channels, is known. If t = 0, the best strategy is to use one designated channel for discovery. If 0 < t ≤ m/2, then the expected discovery time is minimized for an algorithm REF choosing one of the first 2t channels uniformly at random. In all other cases, the best strategy for REF is to choose each channel with probability 1/m. Thus, REF has an expected discovery time of 2 if t = 0, 8t if t ≤ m/2, and 2m²/(m − t) if t > m/2.

Proof. Let p_i denote REF's probability of choosing channel c_i. Without loss of generality, assume that the channels are ordered with decreasing probability of REF, i.e., 1 ≥ p_1 ≥ p_2 ≥ ... ≥ p_m ≥ 0. Let λ be the largest i for which p_i > 0; in other words, REF uses λ channels for discovery. Clearly, if λ < t + 1, the expected discovery time is infinite, and hence, we can concentrate on algorithms for which λ ≥ t + 1. According to our worst-case model, the jammer blocks the channels c_1, ..., c_t. It holds that Σ_{i=t+1}^{λ} p_i ≤ (λ − t)/λ, where (λ − t)/λ is equal to the sum of the channel probabilities of the channels c_{t+1}, ..., c_λ when the probability distribution over the first λ channels is uniform. That is, by cutting some probability from those channels with probability greater than 1/λ and distributing it over the other channels, the total probability of success will increase. Therefore, the expected discovery time is minimized for uniform probability distributions. As soon as Σ_{i=t+1}^{λ} p_i = (λ − t)/λ, we cannot further redistribute the probabilities without decreasing the overall probability of success, since the jammer always blocks the t most probable channels. It remains to show that λ = min(2t, m) maximizes the probability of success. As the first t channels are jammed and the probability to be chosen is p_i = 1/λ for each channel, the probability for a successful meeting is

P[success|t] = (1/2) · Σ_{i=t+1}^{λ} 1/λ² = (λ − t)/(2λ²).

This probability is maximized for λ = 2t. If fewer channels are available, i.e., 2t > m, the best decision is to pick any of the m channels with probability 1/m.
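The case distinction of Lemma 1 can be cross-checked against the per-slot success probability (λ − t)/(2λ²) derived in the proof. The following sketch is our own code (the function names are hypothetical):

```python
def ref_expected_time(m, t):
    # Lemma 1: 2 if t = 0, 8t if 0 < t <= m/2, and 2m^2/(m - t) otherwise.
    if t == 0:
        return 2.0
    return 8.0 * t if t <= m / 2 else 2.0 * m * m / (m - t)

def ref_expected_time_via_lambda(m, t):
    # Same quantity via the proof: REF spreads uniformly over
    # lambda = min(2t, m) channels (a single designated channel if t = 0),
    # so the per-slot success probability is (lambda - t) / (2 lambda^2)
    # and the expected discovery time is its inverse.
    lam = 1 if t == 0 else min(2 * t, m)
    return 2.0 * lam * lam / (lam - t)

for t in (0, 1, 10, 64, 100):
    assert ref_expected_time(128, t) == ref_expected_time_via_lambda(128, t)
```

For example, with m = 128 and t = 10 both formulations give 8t = 80 expected slots.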


Since the execution in one time slot is independent from the execution in all other time slots, the expected discovery time is then given by the inverse of the success probability.

4.2 Uniform Algorithm

The simplest randomized algorithm chooses one of the available m channels uniformly at random in each round. The expected discovery time of this algorithm UNI is 2m²/(m − t). Hence the competitiveness of UNI is ρ_UNI = m, reached when t = 0. In other words, if there are no blocked channels, the performance of this algorithm is poor.

4.3 Class Algorithms

Since we aim at being competitive to REF for any number of jammed channels t, we examine more sophisticated algorithms. Observe that for small t, selecting a channel out of a small subset of channels is beneficial, since this increases the probability that another node is using the same channel. On the other hand, for large t, using few channels is harmful, as most of them are jammed. One intuitive way to tackle the device discovery problem is to use a small number of estimators for t. In each round, we choose one of the estimators according to a probability distribution and then apply the optimal algorithm for this “known” t̂, namely algorithm REF. In the following, we will refer to the set of channels for such a t̂, i.e., channels c_1, ..., c_{2t̂}, as a class of channels. Note that any such algorithm has to include the class t̂ = m/2, otherwise the expected discovery time is infinite. We investigate the optimal number of classes for the family of algorithms selecting the estimator for the next round uniformly at random among k guess classes t_1 ≤ ... ≤ t_i ≤ ... ≤ t_k for t, for k ≤ m/2. The algorithm chooses each such class i with uniform probability, and subsequently selects a channel uniformly at random from a given set of 2t_i channels.
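One way the class selection can be implemented is sketched below, using guess classes that grow geometrically (t̂ = m^{i/k}, as in the construction of ALG_k that follows). The code is our illustration, not the paper's:

```python
import random

def class_algorithm_pick(m, k, rng):
    # Choose a guess class i in {1, ..., k} uniformly at random; the class
    # estimates t_hat = m^(i/k) jammed channels and therefore uses the
    # first min(2 * t_hat, m) channels, one of which is picked uniformly.
    i = rng.randint(1, k)
    span = min(int(2 * m ** (i / k)), m)
    return rng.randrange(span)

rng = random.Random(42)
picks = [class_algorithm_pick(128, 7, rng) for _ in range(1000)]
assert all(0 <= c < 128 for c in picks)
```

Note that the largest class (i = k) always covers all m channels, so the class covering t̂ = m/2 is included and the expected discovery time stays finite.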
We concentrate on algorithms ALG_k where the guesses grow by constant factors, i.e., whose estimations for t are of the following magnitudes: t̂ = m^{1/k}, ..., m^{i/k}, ..., m. We begin by deriving a bound on the expected discovery time of ALG_k.

Theorem 1. Let m denote the number of channels and let t < m be the number of jammed channels. Let β_1 = ⌊k · ln(t) / ln m⌋, β_2 = m^{−1/k}, and β_3 = 2β_1 − 2k − 1 for some integer value k ≤ m/2. The expected discovery time of ALG_k is

2k² m (m^{1/k} − 1)² / ( m^{1/k} β_3 − (t/m) · (β_3 + β_2 m^{(k+1)/k} + 2t + m − t β_2 m) ).

Proof. Consider a time slot where node v1 chooses class i_1 and assume node v2 chooses class i_2 in the same round. If a discovery is possible in this time slot, we have i_1 ≥ i_2 > k · ln(t) / ln m. The second inequality is due to our requirement that m^{i_2/k} > t; otherwise the devices cannot find each other, since the estimator t̂ of at least one device is smaller than t/2. The probability that the two nodes successfully meet in this round is

p(i_1, i_2, t) = (m^{i_2/k} − t) / (m^{i_1/k} · m^{i_2/k}) = 1/m^{i_1/k} − t/m^{(i_1+i_2)/k}.


The overall success probability in some round is given by

P[success|t] = (1/(2k²)) · Σ_{i_1=β_1+1}^{k} ( Σ_{i_2=β_1+1}^{i_1−1} p(i_1, i_2, t) + Σ_{i_2=i_1}^{k} p(i_2, i_1, t) ).

Expanding the sums leads to

P[success|t] = ( m^{1/k} β_3 − (t/m) · (β_3 + β_2 m^{(k+1)/k} + 2t + m − t β_2 m) ) / ( 2k² m (m^{1/k} − 1)² ),

from which the expected discovery time can be derived as its inverse.

Since we are particularly interested in an algorithm's competitiveness, we can examine the ratio achieved by this algorithm for k = log m. We give a brief sketch of the derivation. We distinguish three cases for t. If t = 0, the ratio is Θ(log m), verifiable by calculating 2/P[success|t = 0]. For t ∈ [1, ..., m/2], the expected running time is O(log² m · mt/(m − log m)), hence the ratio is O(log² m). It remains to consider t > m/2. The expected discovery time of ALG_{log m} is 2 log² m · m²/(m − t), compared to REF needing 2m²/(m − t) time slots in expectation. Thus the ratio is at most O(log² m) for ALG_{log m}. Interestingly, as we will see later, this is asymptotically optimal even for general randomized algorithms.

Corollary 1. ALG_{log m} has a competitive ratio of at most O(log² m).

4.4 Optimal Competitiveness

We have studied how to combine algorithms tailored to a certain estimated t in order to construct efficient node discovery protocols. In particular, we have derived the execution time for a general class of algorithms ALG_k. This raises two questions: What is the best competitive ratio achieved by ALG_k with the best choice of k? How much do we lose compared to any algorithm solving the device discovery problem by focusing on such class estimation algorithms ALG_k only? In the following, we adopt a more direct approach and construct an optimal algorithm using a probability distribution p = (p_1, ..., p_m), i.e., choosing a channel i with probability p_i, where p_1 ≥ p_2 ≥ ... ≥ p_m ≥ 0. In other words, we have to find the p yielding the lowest possible competitiveness. From this analysis, we can conclude that no loss is incurred when relying on class algorithms ALG_k, i.e., there is a class algorithm, ALG_{log m}, with an asymptotically optimal competitive ratio. Recall the best possible expected discovery time (cf. Lemma 1) if t channels are jammed and t is known.
Thus, in order to devise an optimal algorithm OPT, we need to solve the following optimization problem:

min_p ρ = min_p max_{0 ≤ t ≤ m−1} T_ALG^t / T_REF^t,

where, by Lemma 1, T_REF^t = 2 if t = 0, T_REF^t = 8t if 0 < t ≤ m/2, and T_REF^t = 2m²/(m − t) if t > m/2. In addition, it must hold that

Σ_{i=1}^{m} p_i = 1, and p_1 ≥ p_2 ≥ ... ≥ p_m ≥ 0.


We simplify the min-max objective function to min ρ by generating the following optimization system:

min ρ such that

t = 0:          1 / ( Σ_{i=1}^{m} p_i² ) ≤ ρ                  (1)
1 ≤ t ≤ m/2:    1 / ( 4t · Σ_{i=t+1}^{m} p_i² ) ≤ ρ           (2)
t > m/2:        (m − t) / ( m² · Σ_{i=t+1}^{m} p_i² ) ≤ ρ      (3)

and

Σ_{i=1}^{m} p_i = 1,  p_1 ≥ p_2 ≥ ... ≥ p_m ≥ 0.

Observe that ρ is minimal if equality holds for all inequations in (1), (2) and (3). This yields an equation system allowing us to compute the values p_i. Thus the optimal channel selection probabilities are

p_1 = √(7/(8ρ)),   p_i = √(1/(4i(i−1)ρ)) for i ∈ [2, m/2],   and p_j = (1/m) · √(1/ρ) for j > m/2.

Due to the constraint Σ_{i=1}^{m} p_i = 1, the competitiveness is

ρ = (1/4) · ( 1 + √(7/2) + Σ_{i=2}^{m/2} 1/√(i(i−1)) )².

Since H_{m/2−1} = Σ_{i=2}^{m/2} 1/√((i−1)²) > Σ_{i=2}^{m/2} 1/√(i(i−1)) > Σ_{i=2}^{m/2} 1/√(i²) = H_{m/2} − 1, where H_i is the ith harmonic number, it holds that ρ ∈ Θ(log² m). Thus, we have derived the following result.

Theorem 2. Algorithm OPT solves the device discovery problem with optimal competitiveness

(1/4) · ( 1 + √(7/2) + Σ_{i=2}^{m/2} 1/√(i(i−1)) )² ∈ Θ(log² m).

As mentioned above, the class algorithm ALG_{log m} features an asymptotically optimal competitiveness of Θ(log² m) as well.

4.5 Optimality for Known Probability Distribution of t

In the previous section, we have described an algorithm which solves the discovery problem optimally for unknown t. In the following, we continue our investigations in a slightly different model where the algorithm has a rough estimate of the total number of jammed channels. Concretely, we assume that the algorithm has a priori knowledge of the probability distribution of the total number of jammed channels: Let p(0), p(1), ..., p(i), ..., p(m) be the probability that i channels are jammed. We


know from Section 4.1 that if t = i ≤ m/2 is known, the optimal discovery time is 8i in expectation. We want to devise an algorithm ALG_PDF which estimates t using the distribution x_0, x_1, ..., x_{m/2} over the classes estimating t̂ = i, minimizing the expected total execution time. Let p_i denote the success probability for t = i ≤ m/2, i.e., i channels are jammed. For the two classes j and l used by the two nodes, we have a success probability of max{min{2j − i, 2l − i}, 0}/(2 · 2j · 2l), since the nodes can only meet on unjammed channels. In order to compute p_i, we need to sum over all possible pairs of classes, multiplied with the probability of selecting them:

p_i = P[success|t = i]
    = Σ_{j>i/2}^{m/2} Σ_{l>i/2}^{m/2} x_j x_l · min(2j − i, 2l − i)/(8jl)
    = Σ_{j>i/2}^{m/2−1} Σ_{l=j+1}^{m/2} 2 x_j x_l · (2j − i)/(8jl) + Σ_{j>i/2}^{m/2} x_j² · (2j − i)/(8j²).

For t > m/2, the expected discovery time is m² / ( x_{m/2}² · (m − t) ). This leaves us with the following optimization problem:

min  Σ_{i=0}^{m/2} p(i)/p_i + Σ_{i=m/2+1}^{m} p(i) · m² / ( x_{m/2}² · (m − i) )

subject to Σ_{i=0}^{m/2} x_i = 1.
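To make the objective concrete, the success probability p_i can be evaluated directly from a candidate distribution x. The sketch below is our own code (it omits a class for t̂ = 0 for simplicity) and could be plugged into any numerical optimizer:

```python
def success_prob(x, m, i):
    # p_i: probability that the two nodes meet in one slot when i <= m/2
    # channels are jammed; each node independently draws a class j
    # (using the first 2j channels) from the distribution x[1..m/2].
    total = 0.0
    for j in range(1, m // 2 + 1):
        for l in range(1, m // 2 + 1):
            overlap = max(min(2 * j - i, 2 * l - i), 0)
            total += x[j] * x[l] * overlap / (2.0 * (2 * j) * (2 * l))
    return total

m = 8
x = [0.0] * (m // 2 + 1)
x[m // 2] = 1.0  # all mass on the largest class: uniform over all m channels
assert abs(success_prob(x, m, 0) - 1.0 / (2 * m)) < 1e-12
```

With all mass on the largest class, the per-slot success probability for t = 0 is 1/(2m), matching the uniform algorithm UNI.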

Unfortunately, this formulation is still non-linear. However, there are tools available that can compute the optimal x_i's of ALG_PDF numerically using this formulation.

4.6 Multi-player Settings

Scenarios with more than two nodes raise many interesting questions. One could try to minimize the time until the first two nodes have met, the time until a given node has found another given node, or the time until all nodes have had at least one direct encounter with all other nodes in the vicinity. In practice, instead of computing a complete graph where each pair of nodes has interacted directly, it might be more important to simply guarantee connectivity, i.e., to ensure the existence of acquaintance paths between all pairs of nodes. In some of these models, it is beneficial to coordinate the nodes and divide the work when they meet. We leave the study of node coordination strategies for future research. However, in the following, we want to initiate the multi-player analysis with a scenario where the total number of nodes n and the total number of jammed channels t are known, and where a node u wants to find a specific other node v while other nodes are performing similar searches concurrently. Again, we assume a symmetric situation where all nodes execute the same randomized algorithm.


Theorem 3. Let n be the number of nodes and assume t, the number of jammed channels, is known. If there are Ω(n + t) channels available, the asymptotically best expected discovery time is Θ(n + t). The algorithm selecting one of the first max(2t, 2n) channels uniformly at random and sending with probability 1/2 achieves this bound.

Proof. Let the ith channel be selected with probability p_i, and assume a given node sends on the channel with probability p_s (or listens with probability 1 − p_s). By the same argument as presented in the previous section, there exists a randomized algorithm minimizing the expected node discovery time by selecting p_i = 1/k for all i < k, for some variable k. The discovery probability if k channels are used is given by (k − t) k^{−2} · 2 p_s (1 − p_s) · (1 − p_s/k)^{n−2}. Thus, it remains to compute p_s and k. Let us start with the last factor of the success probability. Using the fact that (1 − x/n)^n converges to e^{−x}, we can guarantee that the term (1 − p_s/k)^{n−2} is asymptotically constant if p_s/k ∈ O(1/n) (Condition (1)). Clearly, we have to choose k > t to ensure that a meeting can happen (Condition (2)). Asymptotically, the expected discovery time is in Θ(t + n), regardless of the precise choice of k and p_s, as long as Conditions (1) and (2) are satisfied. Concretely, setting p_s = 1/2 and k = max(2t, 2(n − 2)) leads to an asymptotically optimal expected discovery time.

In reality, nodes typically do not know the number of nodes that are active in the same area simultaneously. What happens if we apply the optimal strategy for two nodes devised in Section 4.4, even though there might be several other nodes? Using the same arguments as in Section 4.4, we can derive that every node executing algorithm OPT is asymptotically optimal as well.

Corollary 2. Let n be the number of nodes and t the number of jammed channels. Assume that n and t are unknown to the nodes. Algorithm OPT from Section 4.4 achieves an asymptotically optimal competitiveness.
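The per-slot success probability from the proof of Theorem 3 can be evaluated numerically. The following sketch (our code and parameter choices, not the paper's) checks that with p_s = 1/2 and k = max(2t, 2(n − 2)) the expected discovery time stays within a constant factor of n + t:

```python
def multi_success_prob(n, t, k, p_s):
    # Probability (cf. the proof of Theorem 3) that a designated pair meets
    # in one slot: both land on the same one of the k - t unjammed channels,
    # exactly one of them sends, and no other node disturbs that channel.
    return (k - t) / k**2 * 2 * p_s * (1 - p_s) * (1 - p_s / k) ** (n - 2)

n, t = 10, 10
k, p_s = max(2 * t, 2 * (n - 2)), 0.5
expected_time = 1.0 / multi_success_prob(n, t, k, p_s)
# Theta(n + t): the expected time is within a constant factor of n + t.
assert (n + t) < expected_time < 20 * (n + t)
```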

5 Simulations

In order to complement our formal results, we conducted several in silico experiments to study the behavior of our algorithms in different settings. In this section, we discuss our main simulation results. If not mentioned otherwise, we examine a system with 128 channels (Bluetooth uses 79 channels, 32 of them for discovery) and we report the average discovery time over 1,000 experiments.

5.1 Device Discovery

In a first set of experiments, we studied the average discovery time of the optimal algorithm OPT and the algorithm using a logarithmic number of estimators or classes (ALG_{log m}), see also Section 4. A simple solution to the device discovery problem typically used in practice is to select the available channels uniformly at random. Therefore, we include in our plots the algorithm UNI which has a balanced distribution over the channels.


Fig. 1. Left: Average discovery time of OP T , ALGlog m , and U N I as a function of the total number of jammed channels t. Right: Competitive ratios of OP T , ALGlog m , and U N I as a function of the total number of jammed channels t.
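For comparison with these plots, the closed form of Theorem 2 can be evaluated directly. The sketch below (our code; m is assumed even) computes OPT's channel distribution for m = 128 and yields a competitiveness close to the value of about 12 reported for Figure 1:

```python
import math

def opt_distribution(m):
    # Closed form from Theorem 2: p_1 = sqrt(7 / (8 rho)),
    # p_i = sqrt(1 / (4 i (i - 1) rho)) for 2 <= i <= m/2, and
    # p_j = (1/m) * sqrt(1 / rho) for j > m/2.
    s = sum(1.0 / math.sqrt(i * (i - 1)) for i in range(2, m // 2 + 1))
    rho = 0.25 * (1.0 + math.sqrt(7.0 / 2.0) + s) ** 2
    p = [math.sqrt(7.0 / (8.0 * rho))]
    p += [math.sqrt(1.0 / (4.0 * i * (i - 1) * rho)) for i in range(2, m // 2 + 1)]
    p += [math.sqrt(1.0 / rho) / m] * (m // 2)
    return p, rho

p, rho = opt_distribution(128)
assert abs(sum(p) - 1.0) < 1e-9      # the probabilities form a distribution
assert p == sorted(p, reverse=True)  # p_1 >= p_2 >= ... >= p_m
```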

Figure 1 (left) shows that in case only a small number of channels is jammed, OPT and ALG_{log m} yield a much shorter discovery time (around a factor of ten for t = 0). However, as expected, the uniform algorithm UNI is much faster if a large fraction of the channels is jammed. The study of the algorithms' competitive ratio is more interesting. Figure 1 (right) plots the different algorithms' discovery times divided by the optimal running time achieved by REF when t is known (cf. Section 4.1). The figure shows that our optimal algorithm OPT indeed has a perfectly balanced competitiveness of around 12, independently of the number of jammed channels t. The uniform algorithm UNI is particularly inefficient for small t, but improves quickly with increasing t. However, over all possible values of t, UNI's ratio is much worse than that of OPT and ALG_{log m}. Note that ALG_{log m} is never more than a constant factor off from the optimal algorithm OPT (a factor of around four in this example). The competitive ratio of ALG_{log m} reaches its maximum at t > m/2.
So far, we have assumed a rather pessimistic point of view in our analysis and considered a worst-case adversarial jammer only. Figure 2 (left) studies the algorithms in a setting where a random set of t channels is jammed. Clearly, OPT and ALG_{log m} perform much better than UNI even for quite a large number of jammed channels. Only when the number of jammed channels exceeds 100 does the average discovery time become worse.

5.2 Microwave Case Study

Besides adversarial jamming attacks, a reason for collisions during the discovery phase is interference from other radio sources. It is well-known that microwave ovens interfere with Bluetooth channels (e.g., [14]), especially Bluetooth Channels 60-70. These channels are among the 32 channels that the Bluetooth protocol uses for discovery (called inquiry in Bluetooth speak). In other words, the Bluetooth protocol does not exploit the full range of available channels for discovery.
The Bluetooth protocol is asymmetric, i.e., nodes either scan the inquiry channels from time to time, or they try to find nodes nearby.


Fig. 2. Left: Average discovery time of OP T , ALGlog m , and U N I as a function of the total number of jammed channels t. For this plot, the jammed channels are chosen uniformly at random. Right: Multiplayer: Comparison of the average discovery time of OP T (once with randomly and once with worst-case jammed channels) to U N I if t = 10 channels are jammed.

We have conducted a case study modelling the presence of other nodes and a microwave oven. To this end, we simplified the Bluetooth inquiry protocol to its core device discovery algorithm. One node scans the channels constantly and the other node performs the Bluetooth inquiry frequency hopping pattern until they meet. Since Bluetooth only uses 32 out of the 79 available channels for discovery, our optimal algorithm is clearly at an advantage by exploiting the whole range of frequencies. We ignore this advantage and consider the following setup: two nodes applying the Bluetooth inquiry protocol and two nodes executing the optimal algorithm for 32 channels seek to meet the node following the same protocol. We have counted the number of time slots Bluetooth and our optimal algorithm need until this meeting happens, with and without interference by a microwave oven. We obtained the following results (average number of time slots until discovery):

Microwave   BT      OPT
off         34.49   15.16
on          45.76   15.70

There is a substantial difference between the performance of the two protocols, especially when considering that the Bluetooth protocol is asymmetric, and hence no collisions occur on the same channels in our setting with two Bluetooth nodes. In other words, our setting is punishing the optimal algorithm for being symmetric. We believe that there are many interesting scenarios where symmetry is required and protocols following a Bluetooth approach are not suitable.

5.3 Multi-player Settings

The algorithms described in Section 4 are tailored to settings where two nodes want to meet efficiently despite an adversarial jammer. However, our analysis and our experiments (cf. Figure 2, right) show that the number of time slots until two designated nodes meet increases linearly in the number of nodes in the vicinity. In large networks or at times of high contention, the UNI algorithm performs much better. Thus, in these scenarios, it is beneficial to use this algorithm.


6 Conclusion

The fast and robust discovery of other devices is one of the most fundamental problems in wireless computing. Consequently, a prerequisite for efficient networking are algorithms with the twofold objective of allowing devices to find each other quickly in the absence of any interference while degrading gracefully under increasing disturbance. In other words, discovery algorithms that work well in different settings and under various conditions are necessary. This paper has presented optimal algorithms for a very general, Byzantine model of communication disruptions. This implies that our algorithms can be used in many other scenarios with stronger assumptions on the nature of such disruptions. In other words, our algorithms can cope with incidental as well as with malicious interference. Furthermore, our algorithms are ideal candidates for energy- and memory-constrained sensor nodes as they are simple and fully distributed. Other approaches, e.g., based on exponential search techniques, can outperform our protocols if the adversary is static, i.e., does not change the number of blocked channels. Another disadvantage of the exponential search technique is the fact that, in contrast to our algorithm, it requires the nodes to start the discovery protocol at the same time. Our results open many directions for future research. It is important to reason about how the first successful contact between two nodes can be used for more efficient future communication (e.g., by establishing a shared secret key), and how, subsequently, more complex tasks can be performed over the multi-channel system.

References

1. Alnifie, G., Simon, R.: A Multi-channel Defense Against Jamming Attacks in Wireless Sensor Networks. In: Proc. 3rd ACM Workshop on QoS and Security for Wireless and Mobile Networks (Q2SWinet) (2007)
2. Alonso, G., Kranakis, E., Sawchuk, C., Wattenhofer, R., Widmayer, P.: Randomized Protocols for Node Discovery in Ad-hoc Multichannel Broadcast Networks. In: Proc. 2nd Conference on Adhoc Networks and Wireless (ADHOCNOW) (2003)
3. Alonso, G., Kranakis, E., Wattenhofer, R., Widmayer, P.: Probabilistic Protocols for Node Discovery in Ad-Hoc, Single Broadcast Channel Networks. In: Proc. 17th International Symposium on Parallel and Distributed Processing (IPDPS) (2003)
4. Awerbuch, B., Richa, A., Scheideler, C.: A Jamming-Resistant MAC Protocol for Single-Hop Wireless Networks. In: Proc. 27th Symposium on Principles of Distributed Computing (PODC) (2008)
5. Bayraktaroglu, E., King, C., Liu, X., Noubir, G., Rajaraman, R., Thapa, B.: On the Performance of IEEE 802.11 under Jamming. In: Proc. 27th Joint Conference of the IEEE Computer Communication Societies (INFOCOM) (2008)
6. Bertsekas, D., Gallager, R.: Data Networks. Prentice-Hall, Englewood Cliffs (1992)
7. Chiang, J.T., Hu, Y.-C.: Cross-layer Jamming Detection and Mitigation in Wireless Broadcast Networks. In: Proc. 13th ACM Conference on Mobile Computing and Networking (MobiCom) (2007)
8. Commander, C.W., Pardalos, P.M., Ryabchenko, V., Uryasev, S., Zrazhevsky, G.: The Wireless Network Jamming Problem. Air Force Research Laboratory, Tech. Report 07-11-06332 (2007)


9. Dolev, S., Gilbert, S., Guerraoui, R., Newport, C.: Gossiping in a Multi-Channel Radio Network (An Oblivious Approach to Coping With Malicious Interference). In: Pelc, A. (ed.) DISC 2007. LNCS, vol. 4731, pp. 208–222. Springer, Heidelberg (2007)
10. Dolev, S., Gilbert, S., Guerraoui, R., Newport, C.: Secure Communication over Radio Channels. In: Proc. 27th ACM Symposium on Principles of Distributed Computing (PODC), pp. 105–114 (2008)
11. Gilbert, S., Guerraoui, R., Newport, C.: Of Malicious Motes and Suspicious Sensors. In: Shvartsman, M.M.A.A. (ed.) OPODIS 2006. LNCS, vol. 4305, pp. 215–229. Springer, Heidelberg (2006)
12. IEEE 802.15.2 Taskforce: Coexistence Mechanisms (2008), http://www.ieee802.org/15/pub/TG2-Coexistence-Mechanisms.html
13. Koo, C.-Y., Bhandari, V., Katz, J., Vaidya, N.H.: Reliable Broadcast in Radio Networks: the Bounded Collision Case. In: Proc. 25th ACM Symposium on Principles of Distributed Computing (PODC) (2006)
14. Krishnamoorthy, S., Robert, M., Srikanteswara, S., Valenti, M.C., Anderson, C.R., Reed, J.H.: Channel Frame Error Rate for Bluetooth in the Presence of Microwave Ovens. In: Proc. Vehicular Technology Conference (2002)
15. Law, C., Mehta, A., Siu, K.-Y.: Performance of a Bluetooth Scatternet Formation Protocol. In: Proc. 2nd ACM Workshop on Mobile Ad Hoc Networking and Computing (MobiHoc) (2001)
16. Law, Y.W., van Hoesel, L., Doumen, J., Hartel, P., Havinga, P.: Energy-efficient Link-layer Jamming Attacks Against Wireless Sensor Network MAC Protocols. In: Proc. 3rd ACM Workshop on Security of Ad hoc and Sensor Networks (SASN) (2005)
17. Li, M., Koutsopoulos, I., Poovendran, R.: Optimal Jamming Attacks and Network Defense Policies in Wireless Sensor Networks. In: Proc. 26th Joint Conference of the IEEE Computer Communication Societies (INFOCOM) (2007)
18. Noubir, G.: On Connectivity in Ad Hoc Networks under Jamming Using Directional Antennas and Mobility. In: Langendoerfer, P., Liu, M., Matta, I., Tsaoussidis, V. (eds.) WWIC 2004. LNCS, vol. 2957, pp. 186–200. Springer, Heidelberg (2004)
19. Pelc, A., Peleg, D.: Feasibility and Complexity of Broadcasting with Random Transmission Failures. Theoretical Computer Science 370(1-3), 279–292 (2007)
20. Roberts, L.G.: ALOHA Packet System with and without Slots and Capture. SIGCOMM Computer Communication Review 5(2), 28–42 (1975)
21. Tay, Y.C., Jamieson, K., Balakrishnan, H.: Collision-Minimizing CSMA and Its Applications to Wireless Sensor Networks. IEEE Journal on Selected Areas in Communications 22(6) (2004)
22. Wood, A.D., Stankovic, J.A., Zhou, G.: DEEJAM: Defeating Energy-Efficient Jamming in IEEE 802.15.4-based Wireless Networks. In: Proc. 4th IEEE Conference on Sensor, Mesh and Ad Hoc Communications and Networks (SECON) (2007)
23. Xu, W., Ma, K., Trappe, W., Zhang, Y.: Jamming Sensor Networks: Attack and Defense Strategies. IEEE Network (2006)
24. Xu, W., Wood, T., Trappe, W., Zhang, Y.: Channel Surfing and Spatial Retreats: Defenses against Wireless Denial of Service. In: Proc. 3rd ACM Workshop on Wireless Security (WiSe) (2004)

Fast Self-stabilization for Gradients

Jacob Beal¹, Jonathan Bachrach², Dan Vickery², and Mark Tobenkin²

¹ BBN Technologies, Cambridge, MA 02138, USA, [email protected]
² MIT CSAIL, Cambridge, MA 02139, USA

Abstract. Gradients are distributed distance estimates used as a building block in many sensor network applications. In large or long-lived deployments, it is important for the estimate to self-stabilize in response to changes in the network or ongoing computations, but existing algorithms may repair very slowly, produce distorted estimates, or suffer large transients. The CRF-Gradient algorithm [1] addresses these shortcomings, and in this paper we prove that it self-stabilizes in O(diameter) time, more specifically in 4 · diameter/c + k seconds, where k is a small constant and c is the minimum speed of multi-hop message propagation.

1 Context

A common building block for distributed computing systems is a gradient: a biologically inspired operation in which each device estimates its distance to the closest device designated as a source of the gradient (Figure 1).¹ Gradients are commonly used in systems with multi-hop wireless communication, where the network diameter is likely to be high. Applications include data harvesting (e.g., Directed Diffusion [2]), routing (e.g., GLIDER [3]), distributed control (e.g., co-fields [4]) and coordinate system formation (e.g., [5]), to name just a few. In a long-lived system, the set of sources may change over time, as may the set of devices and their distribution through space. It is therefore important that the gradient be able to self-heal, shifting the distance estimates toward the new correct values as the system evolves. Self-healing gradients are subject to the rising value problem, in which local variation in effective message speed leads to a self-healing rate constrained by the shortest neighbor-to-neighbor distance in the network. As a result, self-healing gradient algorithms have suffered from potentially very slow repair, distorted estimates, or large transients during repair. The CRF-Gradient algorithm [1,6] addresses these problems using a metaphor of “constraint and restoring force.” In this paper, we prove that the CRF-Gradient algorithm self-stabilizes in O(diameter) time, more specifically in 4·diameter/c + k time, where k is a small

¹ Note that “gradient” can mean either the vector operator or a value that changes over space (e.g., chemical concentration in a developing embryo). Historically, the operation we discuss has inherited its name from the latter use, due to its common use in biologically-inspired systems.

B. Krishnamachari et al. (Eds.): DCOSS 2009, LNCS 5516, pp. 15–27, 2009. c Springer-Verlag Berlin Heidelberg 2009 


Fig. 1. A gradient is a scalar field where the value at each device is the shortest distance to a source region (blue). The value of a gradient on a network approximates shortest path distances in the continuous space containing the network.

constant and c is the minimum speed of multi-hop information propagation. Section 2 and Section 3 review gradients, the rising value problem, and the CRF-Gradient algorithm. Section 4, the bulk of the paper, is devoted to formal analysis of the CRF-Gradient algorithm.

2 Gradients and the Rising Value Problem

Gradients are generally calculated through iterative application of a triangle inequality constraint. In its most basic form, the calculation of the gradient value g_x of a device x is simply

g_x = 0                                if x ∈ S
g_x = min{ g_y + d(x, y) | y ∈ N_x }   if x ∉ S

where S is the set of source devices, N_x is the neighborhood of x (excluding itself) and d(x, y) the estimated distance between neighboring devices x and y. Whenever the set of sources S is non-empty, repeated fair application of this calculation will converge to the correct value at every device.

2.1 Network Model

The gradient value of a device is not, however, instantaneously available to its neighbors, but must be conveyed by a message, which adds lag. We will use the following wireless network model:
– The network of devices D may contain anywhere from a handful of devices to tens of thousands. Devices are immobile and are distributed arbitrarily through space (generalization to mobile devices is relatively straightforward, but beyond the scope of this paper). The diameter of this network is the maximum graph distance between devices.
– Memory and processing power are not limiting resources.
– Every device has a copy of the same program, which is executed periodically to update the state of the device. Execution happens in partially synchronous

Fast Self-stabilization for Gradients

17

rounds, once every Δt seconds; each device has a clock that ticks regularly, but frequency may vary slightly and clocks have an arbitrary initial time and phase. – Devices communicate via unreliable broadcasts to all other devices within r meters distance. These devices within r are called neighbors. Broadcasts are sent once per round, halfway between executions. – Devices are provided with estimates of the distance to their neighbors, but naming, routing, and global coordinate services are not provided. – Devices may fail, leave, or join the network at any time, which may change the connectedness of the network. Note that, although we use the simplistic unit disc model for communication and assume no measurement error, the it is straightforward to extend the results in this paper to more realistic models. The results derive from the relationship between the speed at which information propagates through the network and the rate at which distance estimates increase as it propagates. Adjusting the model will only change that constants (in the algorithm and in its convergence time) that are derived from this relationship. 2.2

Separation in Space and Time

We can reformulate the gradient calculation to take our network model into account. Let the triangle inequality constraint cx(y, t) from device y to device x at time t be expressed as

    cx(y, t) = gy(t − λx(y, t)) + d(x, y)

where λx(y, t) is the time-lag in the information about y that is available to its neighbor x. The time-lag is itself time-varying (though generally bounded) due to dropped messages, differences in execution rate, and other sources of variability. The gradient calculation is then

    gx(t) = 0                                if x ∈ S(t)
    gx(t) = min{cx(y, t) | y ∈ Nx(t)}        if x ∉ S(t)

Our definitions of the set of sources S(t) and the neighborhood Nx(t) have also changed to reflect the fact that both may vary over time.

The most important thing to notice in this calculation is that the rate of convergence depends on the effective speed at which messages propagate through space. Over many hops, this speed may be assumed to be close to r/Δt (cf. [7]) for networks where transmission and propagation delay are short compared to Δt, as is often the case in wireless networks. Over a single hop, however, messages may move arbitrarily slowly: the time separation of two neighbors x and y is always on the order of Δt, while the spatial separation d(x, y) may be any arbitrary distance less than r.

A device and its neighbor constrain one another. Thus, when the value of a device rises from a previously correct value, it can rise no more than twice the distance to its closest neighbor in one round; if it rises higher, then it is constrained by the neighbor's value. This applies to the neighbor as well, so after each round of rising the constraints are no looser. Since successive round trips between neighbors must take at least Δt seconds, a pair of neighbors constrain one another's distance estimates to rise at a rate no greater than 2d(x, y)/Δt meters per second. When a device x has a value less than the correct value ḡx, its time to converge is at least

    max{ (ḡx − gx(t)) · Δt / (2d(x, y)) | y ∈ Nx(t) }

which means that close neighbors can only converge slowly. Worse, the dependency chains from this retarded convergence can reach arbitrarily far across the network, so that the entire network is limited in its convergence rate by the closest pair of devices. We will call this phenomenon the rising value problem (illustrated in Figure 2). This can be very bad indeed, particularly given that many proposals for large networks involve some randomness in device placement (e.g. aerial dispersal).

Fig. 2. The rising value problem causes repair to be limited by the shortest edge in the network, as when the left-most device in Figure 1 stops being a source. Here updates are synchronous and unconstrained devices (black edges) attempt to rise at 2 units per round. (Panels (a)–(d) show the network after 1, 2, 3, and 9 rounds; diagrams not reproduced.)

Consider, for example, a randomly distributed sensor network with 100 devices arranged in a 10-hop network with an average of 50 meters separation between devices that transmit once per second. Let us assume that the random distribution results in one pair of devices ending up only 50 cm apart. If the source moves one hop farther from this pair, increasing the correct distance estimate by 50 meters, then the close pair and every device further in the network will take at least 50/(2 · 0.5) = 50 seconds to converge to the new value. If they had landed 5 cm apart rather than 50 cm, it would take over 500 seconds to converge—nearly 10 minutes!
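The numbers in this example follow directly from the 2d(x, y)/Δt rate bound; a throwaway helper (our own naming, not from the paper) makes the arithmetic explicit:

```python
# Lower bound on convergence time under the rising value problem:
# a pair of neighbors d meters apart can raise their estimates at most
# 2*d meters per round of length dt seconds.

def rising_time(delta_g, d, dt=1.0):
    """Seconds needed for a value to rise by delta_g meters when the
    closest pair of devices is d meters apart."""
    return delta_g * dt / (2 * d)

print(rising_time(50, 0.50))  # devices 50 cm apart: 50.0 seconds
print(rising_time(50, 0.05))  # devices 5 cm apart: ~500 seconds
```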

3 The CRF-Gradient Algorithm

The CRF-Gradient algorithm avoids the rising value problem by splitting the calculation into constraint and restoring force behaviors (hence the acronym CRF). When constraint is dominant, the value of a device gx(t) stays fixed or decreases, set by the triangle inequality from its neighbors' values. When restoring force is dominant, gx(t) rises at a fixed velocity v0. The behavior mode is indicated by the "velocity" vx(t) of a device's value, and the switch between behaviors is made with hysteresis, such that a device's rising value is not constrained by a neighbor that might still be constrained by the device's old value. This switch is implemented by defining the subset of neighbors N′x(t) ⊆ Nx(t) allowed to exert constraint as:

    N′x(t) = {y ∈ Nx(t) | cx(y, t) + (λx(y, t) + Δt) · vx(t − Δt) ≤ gx(t − Δt)}

The hysteresis comes from the vx term: when rising, vx(t − Δt) is positive and the constraint is loosened by the amount a device's value might rise while information is making a round trip between the device and its neighbor. Then CRF-Gradient may be formulated

    gx(t) = 0                                 if x ∈ S(t)
    gx(t) = min{cx(y, t) | y ∈ N′x(t)}        if x ∉ S(t), N′x(t) ≠ ∅
    gx(t) = gx(t − Δt) + v0·Δt                if x ∉ S(t), N′x(t) = ∅

    vx(t) = 0     if x ∈ S(t)
    vx(t) = 0     if x ∉ S(t), N′x(t) ≠ ∅
    vx(t) = v0    if x ∉ S(t), N′x(t) = ∅

These update equations avoid the rising value problem: the value of a device rises smoothly, overshoots by a small amount, then snaps down to its correct value.

3.1 Other Self-healing Gradients

Self-healing gradients can be categorized into two general approaches: incremental repair, or invalidate and rebuild. CRF-Gradient is an example of incremental repair: at each step, devices attempt to move their values up or down towards the correct value. Other work on incremental repair (by Clement and Nagpal [8] and Butera [9]) has measured distance using hop-count—effectively setting d(x, y) to a fixed value and therefore producing a consistent message speed through the network—and suffers from the rising value problem if generalized to use distance instead of hop-count. A hybrid solution in [10] adds a fixed amount of distortion at each hop, exchanging the rising value problem for inaccurate values.

An invalidate-and-rebuild gradient discards previous values and recalculates sections of the network from scratch, avoiding the rising value problem by only allowing values to decrease. For example, GRAB [11] uses a single source and rebuilds when its error estimate is too high, and TTDD [12] builds the gradient on a static subgraph, which is rebuilt in case of delivery failure. These approaches work well in small networks and are typically tuned for a particular use case, but the lack of incremental maintenance means that there are generally conditions that will cause unnecessary rebuilding, persistent incorrectness, or both.
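To make the constraint/restoring-force split of Section 3 concrete, the update equations can be sketched as a single-round Python function; the function signature and neighbor-tuple layout are our own assumptions, since the paper gives only the mathematics.

```python
# One CRF-Gradient step for a single device (illustrative sketch).
# nbrs: list of (g_y, lag, dist) tuples: the neighbor's last reported
# value g_y, its age lag = lambda_x(y,t), and the distance d(x,y).

def crf_update(g_prev, v_prev, is_source, nbrs, dt, v0):
    if is_source:
        return 0.0, 0.0
    # N'_x(t): neighbors whose hysteresis-loosened triangle-inequality
    # bound is at or below our previous value g_prev.
    candidates = [g_y + dist for (g_y, lag, dist) in nbrs
                  if g_y + dist + (lag + dt) * v_prev <= g_prev]
    if candidates:                       # constraint dominates
        return min(candidates), 0.0
    return g_prev + v0 * dt, v0         # restoring force: rise at v0

# A rising device at 10.0 with a neighbor reporting 3.0 at distance 2.0
# snaps down to the constrained value 5.0:
print(crf_update(10.0, 1.0, False, [(3.0, 0.5, 2.0)], dt=1.0, v0=1.0))  # (5.0, 0.0)
```

Note how the `(lag + dt) * v_prev` term implements the hysteresis: a device that was rising demands a strictly looser bound before accepting a neighbor's constraint.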

4 Analysis

We show that CRF-Gradient converges in O(diameter) time by proving self-stabilization, where the network converges to correct behavior from an arbitrary starting state. Self-stabilization also gives an upper bound on the rate at which the algorithm can adapt to changes in the network or the source region. In other work [1], we have verified the expected behavior of CRF-Gradient both in simulation and on a network of Mica2 Motes. A technical report [6] outlines a proof of self-stabilization under a continuous space/time abstraction.

4.1 Algorithm State

In order to prove self-stabilization, we must first make explicit what state is stored at devices—the mathematical formulation leaves this implicit. There are a total of nine variables used by CRF-Gradient, plus the phase timer used to schedule the next event on a device, all summarized in Table 1.

Table 1. Variables and ranges used by CRF-Gradient: self-stabilization begins with arbitrary values in all state variables

    Variable             Type      Range
    Δt                   Constant  (0, ∞)
    v0                   Constant  (0, ∞)
    S(t)                 Input     {true, false} per device
    gx(t − Δt)           State     [0, ∞)
    vx(t − Δt)           State     {0, v0}
    Nx(t)                State     [see below]
    λx(y, t)             State     [0, ∞)
    gy(t − λx(y, t))     State     [0, ∞)
    d(x, y)              State     (0, r]
    phase                State     [0, Δt)
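As a sketch of how this state might be laid out in an implementation (the field names are ours; the paper only lists the variables), Table 1 maps naturally onto a per-device record:

```python
# Per-device CRF-Gradient state mirroring Table 1 (illustrative only).
from dataclasses import dataclass, field

@dataclass
class NeighborRecord:
    g: float      # g_y(t - lambda_x(y,t)): last reported value, in [0, inf)
    lag: float    # lambda_x(y,t): age of that report, in [0, inf)
    dist: float   # d(x,y): estimated distance, in (0, r]

@dataclass
class DeviceState:
    g_prev: float = 0.0   # g_x(t - dt), in [0, inf)
    v_prev: float = 0.0   # v_x(t - dt), either 0 or v0
    phase: float = 0.0    # schedules the next execution, in [0, dt)
    nbrs: dict = field(default_factory=dict)  # neighbor ID -> NeighborRecord
```

The constants Δt and v0 and the source flag S(t) would live outside this record, matching their Constant/Input types in the table.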


These state variables may be considered in three different categories: local, neighborhood, and algorithmic. Since the phase just determines the relative order of execution, any possible value of phase is consistent with the algorithm.

The neighborhood variables can be implemented several different ways. For this discussion, we assume that the messages broadcast by each device contain its unique ID, gx(t), and whatever localization information is needed to allow the receiver to compute d(x, y). Nx(t) is then the set of unique IDs in a device's record of its neighborhood. When a message arrives from a neighbor, a device adds the information to its neighborhood with λx(y, t) = −phase + Δt/2, replacing any previous information about that neighbor. Before each execution, Δt is added to the λx(y, t) of each neighbor. If λx(y, t) goes above a fixed timeout T, the neighbor is deleted. Neighborhood state is correct if the values for each neighbor reflect the history and physical location of that device, and this simple mechanism guarantees that, from arbitrary state, this will become the case once T seconds have elapsed: neighbors refresh their state each round and entries for non-existent neighbors are flushed after T seconds. We shall assume that T is a small constant and neglect it henceforth.

This leaves only the algorithmic variables, gx(t − Δt) and vx(t − Δt). The latter is correct whenever it reflects the amount that gx changed in the last update, and thus becomes correct after a single round. The correct behavior of gx(t − Δt) depends on the source region S(t). If the source region is non-empty, then its value must be equal to d(x, S(t)). If the source region is empty, then every value in the network must float upwards: formally, there must exist a time tf such that for every x ∈ D, the gradient value rises at v0 thereafter: gx(t) ≥ gx(tf) + v0·(t − tf − Δt). Subtracting v0Δt in the expression allows the rise to occur in discrete steps.

The next section is devoted to showing the self-stabilization of gx.

4.2 Proof of Self-stabilization

From any arbitrary starting state, the network of devices converges to correct behavior in O(diameter) time—specifically, in time less than 4·diameter/c + k, where c is the minimum speed of message propagation in meters per second and k is a small constant. We prove this by first finding upper and lower bounds on how quickly information propagates through the network. We use these bounds to show that the minimum values of a network quickly constrain the values of all other devices, and that without a constraining source the minimum values rise steadily. Together, these lead to a proof that the network rapidly converges to either correct values or a steady rise, depending on whether any sources exist.

For the purposes of this proof, we will assume that the source region remains fixed, that there are no failures, that clock frequency does not vary between devices, and that neighbor distance estimates have no error. A reminder of terms from the network model in Section 2.1: D is the network of devices that execute once every Δt seconds, broadcasting to all neighboring devices within r meters on the half-round. We will additionally augment our network model with the following definitions and assumptions:


– Messages are assumed to arrive instantly, with overhead time absorbed into the 1/2 round delay between execution and transmission.
– The distance between non-neighbors is defined recursively, through the network: d(x, y) = min({d(x, z) + d(z, y) | z ∈ Nx(t)}). The distance between regions will be the minimum of the distances between pairs of devices in each region.
– No device has two neighbors on any ray emanating from itself (such a network can be produced by adding a small amount of randomness to location). This ensures that the release of constraint propagates quickly across multiple hops.
– gX(t) is the set of gradient values in a set of devices X ⊆ D at time t. Likewise, dX(Y, Z) is the minimum distance between devices in Y and Z on paths confined to X.
– We will define the forward lag Lx(y, t) to be the time-lag between an event at device x at time t and the next equivalent event at device y where the value gx(t) can constrain gy(t) along a path equal to d(x, y). For neighboring devices, Lx(y, t) is always in the range (1/2)Δt to (3/2)Δt, and Lx(y, t) = λy(x, t + Lx(y, t)). Across multiple hops, we define Lx(y, t) recursively as

    Lx(y, t) = min({Lx(z, t) + Lz(y, t + Lx(z, t)) | z ∈ Nx(t) s.t. d(x, z) + d(z, y) = d(x, y)})

– The restoring velocity v0 is bounded by v0 ≤ c/4.

Given these definitions, we can begin the proofs, starting with a bounding of speeds over multi-hop distances.

Lemma 4.1 (Multi-Hop Speed). Given devices x, y ∈ D at time t, where x and y are not neighbors, the speed at which information propagates across the shortest path between them is bounded below by c = (1/3)·r/Δt and above by C = 2·r/Δt.

Proof. Assume without loss of generality that information is propagating from x to y, and consider the chain of hops between x and y along the shortest path. Each pair of successive hops must move more than r distance, or else the first element of the pair could be omitted. Thus the total number of hops is strictly less than 2d(x, y)/r. The time-lag across a single hop is at most (3/2)Δt, between two devices with phases in the worst alignment. Multiplying time per hop by number of hops, we see that the total time to propagate across distance d(x, y) is strictly less than 3Δt·d(x, y)/r. Speed is distance divided by time, so we may establish a lower bound:

    c = d(x, y) / (3Δt·d(x, y)/r) = (1/3)·r/Δt

The upper bound proceeds similarly, with the maximum distance per hop r and the lowest time-lag across a single hop (1/2)Δt, yielding a bound of:

    C = r / ((1/2)Δt) = 2·r/Δt


We now show a loose bound for how the least values in a region bound the values of the whole region over time:

Lemma 4.2. Let R ⊆ D. At time t0, let g0 = min(gR(t0)), and define the minimum region M as the set of devices with minimal value, M = {x | x ∈ R, gx(t0) = g0}. Then at time t > t0 + (3/2)Δt, every device z ∈ R with dR(z, M) < c·(t − t0) has value

    gz(t) < g0 + v0Δt + dR(M, z) + v0·(4Δt·dR(M, z)/r + t − t0)

Proof. Consider a pair of neighboring devices x, y ∈ D. If x executes at time tx, producing value gx(tx), then at time tx + Lx(y, tx) y executes, producing a value

    gy(tx + Lx(y, tx)) < gx(tx) + d(x, y) + v0·(Lx(y, tx) + 2Δt)

This bound is the decision threshold for constraint, raised by one round of restoring force. Now consider an arbitrary pair of non-neighboring devices, m ∈ M and z ∈ D. By Lemma 4.1, we know that if dR(m, z) < c·(t − t0) (and the elapsed time is enough to go at least one hop), then values from m will have time to propagate constraint to z along a shortest path between them. Because m has value g0 at time t0, we know that at a time tm ∈ (t0, t0 + Δt] it must compute a value gm(tm) ≤ g0 + v0Δt. The first execution at z that can be constrained along the shortest path by the value gm(tm) occurs at time tm + Lm(z, tm), which we will call tz. Accumulating the neighbor constraint across at most 2·dR(m, z)/r hops, we thus have the following constraint on the value of z:

    gz(tz) < gm(tm) + dR(m, z) + v0·(Lm(z, tm) + 4Δt·dR(m, z)/r)

For an arbitrary t > tz, this can have risen to at most:

    gz(t) < gm(tm) + dR(m, z) + v0·(Lm(z, tm) + 4Δt·dR(m, z)/r + Δt·⌊(t − tz)/Δt⌋)

where the floor is due to the fact that tz is the time of an execution. Eliminating the floor and substituting for tz we have:

    gz(t) < gm(tm) + dR(m, z) + v0·(Lm(z, tm) + 4Δt·dR(m, z)/r + t − Lm(z, tm) − tm)
    gz(t) < gm(tm) + dR(m, z) + v0·(4Δt·dR(m, z)/r + t − tm)

Substituting in the definitions for tm and gm(tm) gives

    gz(t) < g0 + v0Δt + dR(m, z) + v0·(4Δt·dR(m, z)/r + t − t0)

Conversely, we can show how quickly values will rise when there are no constraints.


Lemma 4.3 (Floating Island Lemma). Given a region R ⊆ D − S(t0) with no sources at time t0, let g0 be the minimum value of gR(t0). Unless acted on by a constraint from a source outside of R, the gradient value for every device x ∈ R at time t > t0 is

    gx(t) ≥ g0 + v0·(t − t0 − Δt)

Proof. Assume for contradiction that this is false: then there must be some device x ∈ R that executes at time t > t0 such that gx(t) < g0 + v0·(t − t0 − Δt). Since there are no sources in R, the value gx(t) must have been calculated using either gx(t − Δt) (if unconstrained) or gy(t − λx(y, t)) (if constrained). Iterating this, we can construct a dependency chain for gx(t) of constrained and unconstrained steps going backward to time t0, grounding in an execution (real or apparent from phase) that occurs in the range (t0 − Δt, t0].

Assume this chain consists entirely of unconstrained steps. Each step goes backward in time Δt, so the number of steps backward is ⌈(t − t0)/Δt⌉. Each of these steps decreases the value by v0Δt, so we have

    gx(t0) = gx(t) − v0Δt·⌈(t − t0)/Δt⌉

Since the ceiling operator may raise the value of (t − t0)/Δt by as little as zero, we know that

    gx(t0) ≤ gx(t) − v0Δt·(t − t0)/Δt
    gx(t0) ≤ gx(t) − v0·(t − t0)

and substituting in our assumption for gx(t) produces

    gx(t0) < g0 + v0·(t − t0 − Δt) − v0·(t − t0)
    gx(t0) < g0 − v0Δt

which is a contradiction since g0 is the minimum value. Since each unconstrained step lowers the value by v0Δt, at least two unconstrained steps must be replaced by constrained steps. Steps need not be replaced at a 1:1 ratio, but the replacement steps must cover the same time-span—in this case at least 2Δt. Each constrained step can take a maximum of (3/2)Δt seconds, so there must be at least two such steps. Between them, these steps must cover a distance of at least r (otherwise the first and last devices would be neighbors, and since there are no collinear 3-cliques, the dependency chain could not visit the middle device). Replacing unconstrained steps with constrained steps thus can only decrease the distance if r < 2v0Δt, which is false by assumption.

Theorem 4.4. The CRF-Gradient algorithm self-stabilizes in 4·diameter/c + k time, where k is a small constant.

Proof. First, note that once a device is constrained by the source, it will always be constrained by the source—it can only relax towards a shorter path. The relaxation is finished within the transit time of information along the shortest path to the source.


Let t0 be the time when self-stabilization begins. If the source region is not empty, then by Lemma 4.3 we know that every device x not constrained by a source has a value at time t of gx(t) ≥ v0·(t − t0 − Δt). No device in the network needs a value greater than diameter, so they must rise to at most diameter + (5/2)v0Δt in order to become constrained by a source. Setting gx(t) to this target value,

    diameter + (5/2)v0Δt = v0·(t − t0 − Δt)

we solve for t − t0:

    diameter + (7/2)v0Δt = v0·(t − t0)
    (t − t0) = diameter/v0 + (7/2)Δt

We thus have a race between two processes, the outward flow of constraint from the source region and the upward rise of gx(t) values which are below their ultimate level. By Lemma 4.1, we know that constraint flows outward across multiple hops at a minimum speed of c = (1/3)·r/Δt. Since this propagation rate is much faster than v0, we may expect that any distant device x will be constrained immediately after it rises above d(x, S) + (7/2)v0Δt, bringing the total time for stabilization to at most

    diameter/v0 + (11/2)Δt

(adding in a round to go above the threshold and another to snap down to the constraint). Given our assumption that v0 ≤ c/4, we take v0 = c/4 to yield a bound of 4·diameter/c + (11/2)Δt.

Now consider the case when the source region is empty. Let tf = t0 + 4·diameter/c. Assume that tf is not an acceptable time: this means there is some device x ∈ D and time t such that gx(t) < gx(tf) + v0·(t − tf − Δt). Constructing dependency chains, as in Lemma 4.3, the same logic shows that gx(t) must have at least one constraint step going through a neighbor y that occurs after time tf. Since y is a neighbor, either y is also violating the bound (and thus must be constrained by a neighbor of its own at an earlier time), or else the execution at which the constraint is applied happens precisely once, at the time tx during the period (tf, tf + Δt]. Assume the latter case (the former reduces to it by switching which device is under consideration). Because y constrains x at tx, y must not have been rising in its previous execution—otherwise the difference between y and x must have shrunk, meaning x is not constrained by y, or stayed the same, meaning x was rising also and will still not be constrained by y. Thus the next step of the dependency chain must be a constraint step to some neighbor z of y. We can apply the same argument iteratively to show the dependency chain must be all constraint steps back to the base time t0.

There must be at least 4·diameter/(cΔt) of these steps, each pair moving at least r distance (by the same argument as in Lemma 4.3). This gives us a bound on the post-constraint value of x:

    gx(tx) ≥ g0 + 2r·diameter/(cΔt)

where g0 is the minimum gradient value in the network at time t0. Substituting in c = (1/3)·r/Δt, we can simplify this:

    gx(tx) ≥ g0 + 2r·diameter/((1/3)·(r/Δt)·Δt)
    gx(tx) ≥ g0 + 6·diameter

By Lemma 4.2, we know that because there is a minimum g0 in the network at time t0, at time tf the value of x is bounded by

    gx(tf) < g0 + v0Δt + diameter + v0·(4Δt·diameter/r + tf − t0)

Since we know that gx(tf) > gx(tx) + v0Δt (or else it's not a violation), we can connect these two equations together to get

    g0 + 6·diameter + v0Δt < g0 + v0Δt + diameter + v0·(4Δt·diameter/r + tf − t0)
    5·diameter < v0·(4Δt·diameter/r + tf − t0)
    5·diameter < v0·16·diameter/c
    v0 > (5/16)·c

which is false by assumption. Thus the no-source case has a convergence time bounded above by 4·diameter/c.

From the structure of these proofs, it appears that they should generalize to show self-stabilization for CRF-Gradient on devices with error in distance measurements and clock rate, as well as with non-unit disc communication, albeit with a great deal more proof complexity and producing slightly worse bounds. The permissiveness of the algorithm's constraints will need to be increased slightly to allow for error as well, however.

5 Contributions

We have proved that the CRF-Gradient algorithm self-stabilizes in O(diameter) time—more specifically, in 4 · diameter/c + k time, where k is a small constant, c is the minimum speed of multi-hop information propagation, and the restoring velocity is bounded v0 ≤ c/4. This result also implies fast self-healing following changes in network structure or source region, and the incremental nature of the repair means that there will often be useful values even while repair is going on. In other work[1], we have verified that CRF-Gradient exhibits the predicted behavior both in simulation and on a network of Mica2 motes. The algorithm can also be generalized and applied to create other self-healing calculations, such as cumulative probability fields. This approach may be applicable to a wide variety of problems, potentially creating more robust versions of existing algorithms and serving as a building block for many distributed computing applications.


References

1. Beal, J., Bachrach, J., Vickery, D., Tobenkin, M.: Fast self-healing gradients. In: ACM Symposium on Applied Computing (March 2008)
2. Intanagonwiwat, C., Govindan, R., Estrin, D.: Directed diffusion: A scalable and robust communication paradigm for sensor networks. In: Sixth Annual International Conference on Mobile Computing and Networking (MobiCom 2000) (August 2000)
3. Fang, Q., Gao, J., Guibas, L., de Silva, V., Zhang, L.: GLIDER: Gradient landmark-based distributed routing for sensor networks. In: INFOCOM 2005 (March 2005)
4. Mamei, M., Zambonelli, F., Leonardi, L.: Co-fields: an adaptive approach for motion coordination. Technical Report 5-2002, University of Modena and Reggio Emilia (2002)
5. Bachrach, J., Nagpal, R., Salib, M., Shrobe, H.: Experimental results and theoretical analysis of a self-organizing global coordinate system for ad hoc sensor networks. Telecommunications Systems Journal, Special Issue on Wireless System Networks (2003)
6. Beal, J., Bachrach, J., Tobenkin, M.: Constraint and restoring force. Technical Report MIT-CSAIL-TR-2007-042, MIT (August 2007)
7. Kleinrock, L., Silvester, J.: Optimum transmission radii for packet radio networks or why six is a magic number. In: Natl. Telecomm. Conf., pp. 4.3.1–4.3.5 (1978)
8. Clement, L., Nagpal, R.: Self-assembly and self-repairing topologies. In: Workshop on Adaptability in Multi-Agent Systems, RoboCup Australian Open (January 2003)
9. Butera, W.: Programming a Paintable Computer. PhD thesis, MIT (2002)
10. Bachrach, J., Beal, J.: Programming a sensor network as an amorphous medium. In: Distributed Computing in Sensor Systems (DCOSS) 2006 Poster (June 2006)
11. Ye, F., Zhong, G., Lu, S., Zhang, L.: Gradient broadcast: a robust data delivery protocol for large scale sensor networks. ACM Wireless Networks (WINET) 11(3), 285–298 (2005)
12. Luo, H., Ye, F., Cheng, J., Lu, S., Zhang, L.: TTDD: A two-tier data dissemination model for large-scale wireless sensor networks. Journal of Mobile Networks and Applications (MONET) (2003)

Detection and Localization Sensor Assignment with Exact and Fuzzy Locations

Hosam Rowaihy¹, Matthew P. Johnson², Diego Pizzocaro³, Amotz Bar-Noy², Lance Kaplan⁴, Thomas La Porta¹, and Alun Preece³

¹ Dept. of Computer Science and Engineering, Pennsylvania State University, USA
² Dept. of Computer Science, Graduate Center, City University of New York, USA
³ School of Computer Science, Cardiff University, UK
⁴ U.S. Army Research Laboratory, USA

Abstract. Sensor networks introduce new resource allocation problems in which sensors need to be assigned to the tasks they best help. Such problems have been previously studied in simplified models in which utility from multiple sensors is assumed to combine additively. In this paper we study more complex utility models, focusing on two particular applications: event detection and target localization. We develop distributed algorithms to assign directional sensors of different types to multiple simultaneous tasks using exact location information. We extend our algorithms by introducing the concept of fuzzy location, which may be desirable to reduce computational overhead and/or to preserve location privacy. We show that our schemes perform well using either exact or fuzzy location information.

1 Introduction

Mission-centric sensor networks present many research challenges. One such challenge is how to best assign sensors to tasks, considering that there may be multiple tasks, of different priorities and information needs, running concurrently in the network, and sensors of multiple types available to meet those needs. Tasks may require one or more sensors, possibly of different types. Given this multiplicity of task types and needs, our goal is to assign specific sensors to the tasks in order to maximize the utility of the sensor network. This is especially challenging in environments that use directional sensors, as each sensor in this case can be assigned to at most one task.

In this paper, we consider the problem of assigning directional sensors to tasks. In particular, we focus on two applications, event detection and target localization. We propose distributed algorithms for assigning specific sensors to tasks of both types. For the two problems we consider a case in which the exact location of the sensors is known, and one in which only an approximation of the location is disclosed (we term this fuzzy location). Assignment algorithms based on the exact location lead to better solutions and higher overall performance. In certain cases, however, such schemes are not feasible, for two reasons. First, exact location creates a large problem instance in which each sensor is considered on its own, which leads to a higher computational cost. This can be impractical due to the limited computational capabilities of sensors. When fuzzy location is used, however, nearby sensors can be clustered based on their fuzzy location, thus coarsening the problem instance and requiring the consideration of fewer assignment choices. Second, exact location may not be disclosed for privacy reasons. Consider a scenario in which a sensor network is deployed for a sensitive task (e.g. monitoring national borders). The owners of the network might like to share the sensors with other entities (e.g. researchers collecting environmental data) but at the same time be reluctant to reveal precise information about the location of their assets. By accommodating fuzzy location, we enable such sharing of resources. Different location granularity levels provide trade-offs between performance and efficiency/privacy.

Contributions. We provide a formal definition of the event detection and localization problems. We propose two distributed algorithms, one for the event detection task and one for the localization task, when exact sensor locations are disclosed. The first is guaranteed to provide a 2-approximation; in simulation they both achieve close to optimal performance. We extend the algorithms to cases in which only fuzzy locations of sensors are used. This entails defining the notion of fuzzy location with respect to detection and localization. We show through simulation that, as the granularity of fuzzy location is refined, performance improves to a point after which the gain is insignificant.

B. Krishnamachari et al. (Eds.): DCOSS 2009, LNCS 5516, pp. 28–43, 2009. © Springer-Verlag Berlin Heidelberg 2009

2 Related Work

In the past, sensor-task assignment problems in wireless sensor networks have been studied mainly using simplified models in which utility from multiple sensors is assumed to combine additively [8, 3, 11]. [8] uses distributed approaches to assign individual sensors to tasks, assuming additive utility and no competition between tasks for the same sensing resources. A problem variant motivated by frugality and conservation of resources is addressed in [11]. In this paper, we consider more complex models to evaluate the utility of a bundle of sensors, and show how such problems can be solved, even based on inexact sensor location information.

Directional sensors with tunable orientations have recently been addressed for coverage [2] and target tracking [6] problems separately. For non-directional sensors, both [1] and [10] propose algorithms to provide a certain level of (cumulative) detection probability over an area. Target localization problems have also been previously considered, e.g. in [25], which develops a solution using a prior distribution of target location and exact sensor locations. Their solution, however, is centralized. A distributed solution for the localization problem is proposed in [13], but it does not consider competition for resources between multiple simultaneous tasks.

Our problem is analogous to the well-known Multi-Robot Task Allocation (MRTA) problem described in [9]. A sensor can be seen as a resource-constrained robot as suggested in [19], specifically the problem ST-MR-IA of [9], i.e. Single-Task robots (ST) performing Multi-Robot tasks (MR) using Instantaneous Assignment (IA). The MRTA taxonomy solutions, however, do not scale well to large numbers of sensors and tasks. Our MAXCDP problem (defined below) lies within a family of submodular Combinatorial Auctions. Guaranteed approximation algorithms are known for this class of problems (see for example [18] and references therein). Our focus here, however, is on designing algorithms that provide near-optimal performance in an efficient, distributed manner. Some related problems involving cumulative probabilities are considered in [7], but those problems involve the product of task success probabilities instead of the sum.

To our knowledge, we are the first to introduce the concept of fuzzy sensor location for sensor-task assignment problems. Related works in this area include [23], which addresses the issue of privacy when fusing data coming from sensors that are assigned to multiple event detection tasks, and [20], which describes a data dissemination technique to ensure that the locations of sensors in the network are not learned by an enemy.

3 Overview

In this section we provide an overview of our network model. Then we discuss the different task types that can be present in the network.

Network Model. The network consists of static sensors of different types. The deployed sensors are directional in nature. Examples of such sensors include imaging sensors, which can be used for event detection, and directional acoustic sensor arrays. Thus, we assume that a sensor or a bundle of sensors can be assigned to at most one task at a time. We also assume that sensors know their locations. In our model, a task is specified by a geographic location and a task type, for example, detecting events occurring at location (x, y), or accurately localizing a target within a small area known to contain the target's estimated location. A larger-scale mission, such as field coverage or perimeter monitoring, can be divided into a set of tasks, each having its own location. Because tasks can vary in importance, we allow a sensor to be reassigned from a task with lower profit (which is used to represent importance) to a task with higher profit. However, since some tasks are more sensitive to interruption in service, preemption should be limited to tasks that can tolerate such interruption. For example, localization is very sensitive to interruption, whereas long-term detection is less so.

Task Types. In this paper, we focus on the sensor-task assignment algorithms, and we assume that the process of matching the capabilities of a sensor or a bundle of sensors with the requirements of a task is carried out by a Knowledge Base System [21]. In the network, there may be multiple types of tasks, each having different sensing requirements. Some task types may only require that the assigned sensors be close to the target. Others may require that the collection of sensors form a specific shape, as in localization.
The specific characteristics of a given task’s requirements naturally allow us to restrict our attention to just the applicable sensors. These characteristics are: (1) type of data required, (2) distance from the target, and (3) relative angles between sensors. Together the second and third properties allow the creation of any polygon shape out of the selected sensors to satisfy the requirements of complex tasks. We consider below two types of tasks incorporating the three requirements. The first task we consider is an event detection task in which the goal is to detect activity in a specific location. This task can be accomplished using one or more sensors. Each sensor has a detection probability that depends on its type and distance from the target. A collection of sensors can be combined together to improve the detection probability.

Detection and Localization Sensor Assignment with Exact and Fuzzy Locations

31

The second task type we consider is a target localization task, whose goal is to accurately localize a target within the small area where it is expected, perhaps prompted by the detection of an event in this area or by some prior knowledge. This type of task requires at least two sensors. An interesting property of this task type is that assignment quality depends not only on sensor type and separating distance but also on the angle between the selected sensors. In the model of [16], for example, two sensors perform optimally if they are separated by a 90◦ angle and are as close to the target as possible.

4 Problem Definition

In this section, we formulate the two sensor-task assignment problems, both of which involve attempts to assign sensor bundles to tasks in the best possible way. In general, we consider the state of the network at an instant of time at which multiple simultaneous tasks can be ongoing. We note that the generic problem of assigning sensor bundles to tasks is a generalized version of the Semi-Matching with Demands (SMD) problem [3], and is NP-hard, even to approximate.

4.1 Event Detection Tasks

The goal of the detection task is to detect events in a specific location with the highest probability. [24] gives a complex model of sensor assignment, with an objective function based on the probability of detecting certain kinds of events, conditioned on the events occurring and the number of sensors assigned to detect the event in a given location. We extract the kernel of this problem as follows. Given are collections of sensors and tasks. Each task is to monitor and detect events, if they occur, in a certain location. The utility of a sensor to a task is the detection probability when the event occurs. Let Si → Tj indicate that sensor Si is assigned to task Tj and let pj denote Tj's profit. The objective is then to maximize the sum of cumulative detection probabilities for tasks (weighted by task profits), given the probability eij that a single sensor Si detects an event for Tj:

\[ \sum_j p_j \Big( 1 - \prod_{S_i \to T_j} (1 - e_{ij}) \Big) \tag{1} \]
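As a concrete reference, objective (1), the profit-weighted sum of cumulative detection probabilities, can be evaluated for a candidate assignment with a few lines. This is a sketch of our own; the data layout (nested lists, task-indexed assignment) is illustrative and not from the paper:

```python
from math import prod

def weighted_cdp(profits, probs, assignment):
    """Objective (1): sum over tasks j of p_j * (1 - prod_{i -> j} (1 - e_ij)).

    profits[j]    : profit p_j of task j
    probs[i][j]   : detection probability e_ij of sensor i for task j
    assignment[j] : list of sensor indices assigned to task j
    """
    return sum(
        p_j * (1.0 - prod(1.0 - probs[i][j] for i in assignment[j]))
        for j, p_j in enumerate(profits)
    )
```

With two sensors of individual probability 0.5 on one unit-profit task, the cumulative detection probability is 1 - 0.5 * 0.5 = 0.75, illustrating the diminishing (submodular) returns of adding sensors.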

We call this the Cumulative Detection Probability maximization problem (MAXCDP). Here the utilities are monotonically increasing as sensors are assigned, but nonlinear.

Proposition 1. MAXCDP is strongly NP-hard.

Proof. We reduce from PRODUCT-PARTITION [17], which is strongly NP-hard (unlike the ordinary PARTITION problem). The input instance is a set A of n positive integers. The decision problem is to decide whether it is possible to partition them into two subsets S, S′ = A − S of equal products, i.e., \(\prod_{a_i \in S} a_i = \prod_{a_i \in S'} a_i\). Given the input instance, we produce a MAXCDP instance as follows. There are n sensors and 2 tasks, both with pj = 1. Each sensor Si has success probability ui = 1 − 1/ai for both tasks. Then maximizing

\[ \sum_{j=1,2} p_j \Big( 1 - \prod_{S_i \to T_j} (1 - e_{ij}) \Big) = \sum_{j=1,2} \Big( 1 - \prod_{S_i \to T_j} \frac{1}{a_i} \Big) \]


is the same as minimizing:

\[ \prod_{S_i \to T_1} \frac{1}{a_i} + \prod_{S_i \to T_2} \frac{1}{a_i} \tag{2} \]

We claim that in an optimal solution, these two products are as close as possible. To see why, consider two values \(t_1 = \frac{1}{Ax}\) and \(t_2 = \frac{1}{By}\) for the two terms added in Eq. 2, where x, y are particular \(a_i\) values with A < B and x < y. Then such a solution must be suboptimal, since a "local move" bringing the products closer to equality will strictly reduce their sum:

\[ \frac{1}{Ax} + \frac{1}{By} = \frac{Ax + By}{ABxy} > \frac{Ay + Bx}{ABxy} = \frac{1}{Ay} + \frac{1}{Bx} \]

But the terms in Eq. 2 are equal iff \(\prod_{S_i \to T_1} a_i = \prod_{S_i \to T_2} a_i\). Therefore, by solving MAXCDP we can decide PRODUCT-PARTITION. □

We emphasize that the hardness result remains even for geometric instances; indeed, even if sensors and tasks lie on a line. If the detection probability depends on distance, then the instances of the reduction can be constructed by placing the two tasks (of different types) at the same point and placing the sensors at distances that yield the desired probabilities.

4.2 Target Localization Tasks

For target localization through triangulation of bearing measurements, two or more sensors that are not collinear with the target are necessary to ensure full observability of the target's location. The expected mean squared error when incorporating imperfect bearing measurements is well understood [12, 14]. Specifically, it can be shown that when the bearing measurements are modeled as the true bearings embedded in additive white Gaussian noise (AWGN) of mean zero and variance σ², the error covariance of the (x, y) location of the target is approximately:

\[ R = \left( \sum_{i=1}^{n} \frac{1}{\sigma^2 d_i^2} \begin{bmatrix} \cos^2\theta_i & -\cos\theta_i \sin\theta_i \\ -\cos\theta_i \sin\theta_i & \sin^2\theta_i \end{bmatrix} \right)^{-1} \]

where di and θi are the distance and bearing, respectively, from the target event to the i-th sensor. We choose to model the uncertainty in the calculated target location, U, as a function of the expected mean squared error (MSE), which is simply U = trace{R}. Alternatively, the uncertainty could be U = det{R}, as described in [16]. We prefer the trace because of its physical interpretation as the MSE and because it bounds the determinant. In this paper, we consider the case in which only two sensors are used for localization, which in most cases provides enough accuracy. More sensors lead to better accuracy, but for the purpose of testing our algorithms with exact and fuzzy locations, two sensors are sufficient. For the case of two sensors, the uncertainty is given by:

\[ U = \sigma \, \frac{\sqrt{d_1^2 + d_2^2}}{|\sin(\theta_1 - \theta_2)|} \tag{3} \]
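As a consistency check, trace{R} can be computed numerically and compared against the two-sensor closed form trace{R} = σ²(d1² + d2²)/sin²(θ1 − θ2). The following is a sketch of our own (function names are illustrative), not code from the paper:

```python
import math

def error_covariance(sensors, sigma=1.0):
    """2x2 error covariance R = M^{-1} for bearing-only triangulation.

    sensors: list of (d_i, theta_i) pairs, distance and bearing (radians)
    from the target event to each sensor.
    """
    m00 = m01 = m11 = 0.0
    for d, th in sensors:
        w = 1.0 / (sigma**2 * d**2)
        m00 += w * math.cos(th) ** 2
        m01 += w * (-math.cos(th) * math.sin(th))
        m11 += w * math.sin(th) ** 2
    det = m00 * m11 - m01 * m01          # assumes non-collinear sensors
    # explicit 2x2 inverse
    return [[m11 / det, -m01 / det], [-m01 / det, m00 / det]]

def mse(sensors, sigma=1.0):
    """Expected mean squared error U' = trace{R}."""
    R = error_covariance(sensors, sigma)
    return R[0][0] + R[1][1]
```

For two sensors the trace reduces algebraically to σ²(d1² + d2²)/sin²(θ1 − θ2), so minimizing the square root of the MSE is the same as minimizing the quantity in Eq. 3.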

Note that σ is simply a scaling constant, which without loss of generality we set to 1. With U thus defined, quality is maximized when the separating angle is 90° and the distances are minimal. Of course, the weights described are just one model. More generally, allowing arbitrary utility weights on triples t = (si, mj, sk), the following is easily obtained by reduction from MAXIMUM 3D MATCHING: Proposition 2. Choosing a max-weight set of disjoint sensor/task triples is NP-hard.

5 Distributed Algorithms

In this section we introduce our algorithms for assigning sensors to tasks. We start by discussing the basic operation of the proposed solutions. Then we discuss in detail how to solve the detection and localization problems in the case in which exact sensor locations are known and the case in which only fuzzy locations are known.

5.1 Basic Operation

The solutions we propose are distributed in nature and do not require a central node to make all the assignment decisions. This allows the leveraging of real-time status information about sensors: which are operational and which are currently assigned to other tasks. Such solutions are also scalable and more efficient in terms of communication cost than centralized approaches. We assume a dynamic system, in which the tasks constituting the problem instances described above arrive and depart over time. At any instant, there may be tasks of both types present in the network. For each task, a leader is chosen, which is a sensor close to the task's location. The leader can be found using geographic-based routing techniques [5, 15]. If location privacy is a concern, then schemes such as [22] can be used. The task leaders are informed about their tasks' types, locations and profits by a base station. Each task leader runs a local process to match nearby sensors to the requirements of the task. Since the utility a sensor can provide to a task is limited by a finite sensing range Rs, only nearby sensors are considered. The leader advertises its task information to the nearby sensors (e.g. sensors within two hops). The ad message contains the task type, its location, its profit and its location requirement (i.e. exact or fuzzy). Nearby sensors hearing this ad message propose to the task with their locations, which may be exact or fuzzy depending on the algorithm used.
A sensor assigned to an interruptible task may be reassigned to another task if doing so will increase the total profit of the network. This is determined as follows: if the utility sensor Si provides to the incoming task Tj weighted by Tj ’s profit pj is greater than that of the current task Tk then Si should be reassigned. More formally, if eij pj > eik pk then Si is reassigned. We allow both localization tasks and detection tasks to preempt detection tasks; neither type can preempt a localization task. To reduce both the interruption of ongoing tasks and the communication overhead, no cascading preemption is allowed. That is, if task Tj preempts task Tk , Tk will try to satisfy its demand only with available sensors rather than by preempting a third task. When a task ends, the leader sends out a message to advertise that the task has ended and all its assigned sensors are released. Because the system is dynamic, tasks that are

Algorithm 1. Exact location algorithm for event detection

  initialize each e′ij ← eij, the detection probability of Si for Tj
  initialize each task's cumulative detection probability uj ← 0
  initialize the number of sensors assigned to Tj, nj ← 0

  For Task Leader (Tj):
    advertise presence of Tj to each neighboring sensor Si
    for round = 0 to R do
      if nj ≤ N then
        among responding sensors G, choose i ← arg max_i {e′ij : Si ∈ G}
        update uj ← uj + e′ij
        send accept message and advertise new uj
      else done

  For Sensor (Si):
    wait for task requests
    among requesting tasks Q, choose j ← arg max_j {e′ij pj : Tj ∈ Q}
    send proposal to Tj including exact location
    if accepted then Si is assigned to Tj; done
    else
      listen to current uj values for requesting tasks; if no more tasks then done
      update detection probabilities based on new uj's: e′ij ← 1 − (1 − uj)(1 − eij) − uj
      repeat
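A centralized sketch of this round-based protocol may clarify the bookkeeping. It is our own simulation of the message exchange, not the paper's implementation; the data layout and function name are illustrative:

```python
def greedy_assign(e, p, N, R):
    """Simulate Algorithm 1 centrally.

    e[i][j] : detection probability of sensor i for task j
    p[j]    : profit of task j
    N       : max sensors per task;  R : number of rounds
    Returns the cumulative detection probability u[j] per task.
    """
    n_sensors, n_tasks = len(e), len(p)
    u = [0.0] * n_tasks        # cumulative detection probability per task
    count = [0] * n_tasks      # sensors assigned per task
    assigned = set()
    for _ in range(R):
        proposals = {}         # task j -> (best residual gain, sensor i)
        for i in range(n_sensors):
            if i in assigned:
                continue
            # residual gain e'_ij = e_ij * (1 - u_j); propose to the task
            # maximizing the profit-weighted gain
            best = max(range(n_tasks),
                       key=lambda j: e[i][j] * (1 - u[j]) * p[j])
            gain = e[i][best] * (1 - u[best])
            if count[best] < N and gain > proposals.get(best, (0.0, None))[0]:
                proposals[best] = (gain, i)
        if not proposals:
            break
        for j, (gain, i) in proposals.items():  # each leader accepts its best
            u[j] += gain
            count[j] += 1
            assigned.add(i)
    return u
```

For instance, two sensors of probability 0.5 competing for one task are accepted in successive rounds, yielding u = 0.5 and then 0.5 + 0.5 · 0.5 = 0.75, matching the cumulative detection probability 1 − 0.5².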

not satisfied after the first assignment process will try to obtain more sensors once they learn that more may be available. This information can be obtained either from the base station or by overhearing the message announcing the end of a task. The remainder of this section describes the algorithms that the task leaders use to determine which sensors should be assigned to each task. As both problems defined above are NP-hard, our algorithms are heuristics.

5.2 Exact Location Algorithms

In this subsection we propose algorithms to solve the sensor-task assignment problems, for detection and localization, when the exact locations of sensors are known.

Event Detection Tasks. In order to conserve energy, we limit the number of sensors that can be assigned to a task to N, an application parameter. A higher value of N may yield a higher cumulative detection probability for an individual task; between tasks, however, there will also be greater contention for sensors. Due to the competition that can occur between tasks, we propose an algorithm that runs in rounds to allow sensors to be assigned to their best match. When a task arrives in the network, the task leader advertises the presence of the task and its profit to nearby sensors. The ad message is propagated to ensure that all tasks within twice the sensing range (2Rs) receive it. Since these tasks compete for the same sensors as the arriving task, their leaders need to participate in the process.

In the first round, each leader informs the nearby sensors of the details of its task (location and profit). A sensor, which may hear ad messages from one or more tasks, proposes to its current best match: the task for which it provides the highest detection probability weighted by the task's profit. More formally, Si proposes to the task Tj that maximizes eij pj. From the set of proposing sensors, each task leader selects the sensor with the maximum detection probability and updates its current cumulative detection probability (CDP). In the next round, each leader sends out an update on the status of its task's CDP, taking into account the currently assigned sensors. Sensors that were not selected in the first round recalculate e′ij, the amount by which they can increase the current CDP of each remaining task (shown in the step before last in Algorithm 1). Again each unassigned sensor proposes to its best fit. This process continues for R rounds, until all tasks have N assigned sensors or no more sensors are available. R is an application parameter and should be set to at least N to give tasks a chance to assign enough sensors. Algorithm 1 summarizes the steps followed; all competing leaders go through the steps shown for the task leader. During this process sensors can continue detection for the tasks to which they were initially assigned. A change only happens if a sensor chooses a different task, in which case it is directed towards the new task's location and starts detecting events. When N = ∞ and full preemption is allowed, this algorithm provides a 2-approximation guarantee, which can be proved by adapting the proof of [18]. It is easy to construct examples showing that the guarantee is tight. Proposition 3. Algorithm 1 is a 2-approximation. Proof.
For a given problem instance I, consider the first time that a sensor proposal is accepted by a task, say Tj accepting Si, with value p = vj(Si) = pj eij. Let I′ be the problem instance after this assignment is chosen, with the other sensors' proposal values for Tj and its remaining profit (that is, Tj's valuation vj(·)) updated appropriately. Suppose Si is assigned to Tk in the optimal solution OPT(I). We lower-bound the value of OPT(I′): one feasible solution is the same as OPT(I) except that neither Tj nor Tk receives Si. Tj's potential profit is reduced by exactly p. Since Si greedily proposed to Tj rather than to Tk, in the solution to instance I′ the profit from Tk is reduced by at most p. Thus OPT(I) ≤ OPT(I′) + 2p. Applying the argument inductively yields the result. □

Target Localization Tasks. We propose a simple distributed solution (Algorithm 2) to the exact-location localization problem. The goal in localization is to minimize the uncertainty achieved by the assigned sensor pair. Because localization tasks are sensitive to preemption, only nearby sensors that are not assigned to any other localization task propose to the leader with their exact locations. If a sensor is assigned to a task that is less sensitive to preemption, such as detection in our case, it will also propose. Among the proposing sensors, the leader chooses the pair that provides the lowest uncertainty according to Eq. 3. To do this, each sensor must provide its angle from a predetermined axis. The angle of a sensor can be measured from the y-axis that passes through the estimated

Algorithm 2. Exact location algorithm for target localization

  For Task Leader (Tj):
    advertise presence of Tj
    receive sensor proposals
    among responding sensors G, choose (i, k) ← arg min_{i,k} { √(d_i² + d_k²) / |sin θ| : (Si, Sk) ∈ G² }
    send accept messages

  For Sensor (Si):
    receive task request
    if Si is not assigned to a localization task then
      send proposal to Tj including exact location
    else ignore request
    if accepted then Si is assigned to Tj; done

target position. For a pair of sensors, the separating angle (θ) can then be determined by calculating the absolute difference between their respective sensor angles. A task's number of neighboring sensors (of the needed type) will typically be small, so considering all sensor pairs should be feasible. If there are many proposing sensors, the leader can set a distance threshold and ignore any sensors beyond it. After making the assignment decision, the leader sends messages to the selected sensors. If they were previously assigned to other tasks, the leaders of those tasks are informed that they should search for replacements.

5.3 Fuzzy Location Algorithms

In the previous subsection we proposed algorithms to assign sensors to tasks based on their exact locations. In some situations, however, these schemes might not be feasible, whether due to computational cost or to location privacy concerns. In this subsection, we propose algorithms to assign sensors based only on their fuzzy locations. Instead of having the assignment algorithms consider each sensor on its own, fuzzy location allows sensors to be grouped into classes. We use the distance and angle requirements introduced in Section 3 to make assignments at different granularities. Recall that in this case sensors know their exact locations but do not disclose them.

Event Detection - Fuzzy Distance. In event detection, the probability that a sensor detects an event depends heavily on the distance between them. Here we define fuzzy distance, based on different distance granularities, as a measure of a sensor's location. Clearly, only sensors within sensing range of a task (location) should be considered. This area can be represented as a circle of radius Rs centered at the task location. If no distance granularity (DG) is specified (i.e. DG = 0), then all sensors within this circle are considered to be in the same class and equivalent.
A solution based on DG = 0 will provide almost no guarantee on the solution quality. When DG is increased to 1, the distance from the target to the edge of the circle is divided to create two rings or annuli of equal area, which partitions the sensors into two classes. In Fig. 1(a) we see an example of fuzzy distance based on DG = 1.
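A sensor can compute its ring index locally. The sketch below is ours and assumes DG + 1 equal-area annuli, so the ring boundaries lie at radii Rs·√(k/(DG+1)); the function name is illustrative:

```python
import math

def distance_class(d, Rs=40.0, DG=1):
    """Equal-area ring index (1 = innermost ring) for a sensor at
    distance d from the task location; DG + 1 rings partition the
    disk of radius Rs into annuli of equal area."""
    if d > Rs:
        return None                       # beyond sensing range: ignored
    k = math.ceil((DG + 1) * (d / Rs) ** 2)
    return max(k, 1)                      # d == 0 falls in ring 1
```

With DG = 1 the ring boundary sits at Rs/√2 ≈ 28.3 m for Rs = 40 m, so a sensor at 20 m is in class 1 and a sensor at 30 m is in class 2.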

[Fig. 1. Fuzzy Location: (a) Fuzzy Distance; (b) Fuzzy Angle; (c) Fuzzy Distance and Angle]
A sensor of class 1 will provide a higher detection probability than a sensor of class 2. DG = 2 divides the circle into three equal-area rings, and so on. The algorithm used for detection is similar to Algorithm 1 above, with the change that sensors report their classes rather than their exact detection probabilities. After the task leader sends out the task advertisement message, nearby sensors hear the message and classify themselves based on the distance granularity specified in the leader's message. If a sensor is currently assigned to another task, it can decide, based on the preemption rules discussed above, to propose to the new task. The leader then chooses the best sensors for its task, which in this case are the ones that lie within the closest rings. The detection probability of a sensor is estimated from the expected distance between a point in the sensor's ring and the center of the circle. This process not only provides location privacy by hiding the exact location of sensors, but also reduces the computation time required to choose the assignments: the leader needs to consider only DG + 1 classes of sensors, instead of all individual sensors. Clearly, the higher the DG value, the better the selection becomes, which leads to a higher cumulative detection probability. On the other hand, a larger DG yields more sensor classes, and hence more computation and weaker sensor location privacy. The granularity level is a system parameter, which we study below. We note that even if the exact distance from the sensor to the target is known, the task leader cannot accurately locate the sensor, since it can be anywhere on a circle around the target. Therefore, fuzzy distance is more protective of sensor privacy than fuzzy angle, which we consider next.

Target Localization - Fuzzy Angle. To accurately localize a target, the task leader should pick not only sensors close to the target but also sensors with a separating angle as close as possible to 90°.
This suggests another form of fuzzy location, based on the sensor's viewing angle. To use fuzzy location, a sensor needs to determine its fuzzy angle. This is done based on the angle granularity (AG), which is specified by a sector angle (given a circle of radius Rs centered at the estimated target location). For example, when AG = 360°, all sensors within the circle are placed in the same class regardless of angle. If AG = 90°, the circle is partitioned into four quadrants and the sensors into four classes. When a sensor hears a task ad message, it determines its actual angle (using the y-axis that passes through the estimated target location as a reference), which then determines its sector. Note that since we only need |sin θ|, where θ is the separating angle, to determine the uncertainty of a sensor pair, sensors in opposite sectors are considered to be in the same class. Fig. 1(b) shows a circle divided into eight 45° sectors, i.e. AG = 45°.
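A sensor's angle class can be computed locally in the same spirit; folding angles modulo 180° merges opposite sectors, as required. This is an illustrative sketch of our own:

```python
def angle_class(theta_deg, AG=45.0):
    """Sector class for a sensor at bearing theta_deg (degrees, measured
    from the y-axis through the estimated target position). Opposite
    sectors are merged, since only |sin| of the separating angle matters,
    giving 180/AG classes for AG < 180."""
    return int((theta_deg % 180.0) // AG)
```

For AG = 45°, sensors at 10° and 190° fall in the same class, and AG = 360° collapses all sensors into a single class.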

Since the target localization uncertainty model depends on both the separating angle and the distance, the fuzzy location comprises both the fuzzy distance and the fuzzy angle. After dividing the circle into sectors, we divide it into rings based on the distance granularity (see Fig. 1(c)). The algorithm used for localization in this case is similar to Algorithm 2 above. The difference is that a sensor's proposal now indicates its sector rather than its precise location. The leader runs the algorithm on all sensor classes (using the expected distance and expected angle for each) to determine the best pair of classes using Eq. 3. From each class a sensor is chosen arbitrarily. Note that at finer granularities some sectors might be empty, and hence some classes can be ignored. The number of sensor classes in this case is a function of both DG and AG. Assuming for simplicity that at each increment of granularity we divide AG by two, the number of classes becomes (DG + 1)(180/AG); the special case AG = 360° gives a single class. Also, note that AG = 180° is not used in our experiments, as sensors from the two sectors would be equivalent. As with fuzzy distance, finer location granularity leads to more sensor classes and hence higher computational overhead. Also, with finer granularity the task leader gains more information about a sensor's location, which decreases privacy.
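The leader's search over class pairs, using the expected distance and expected angle of each class in Eq. 3 (with σ = 1), can be sketched as follows; the class representation and function name are our own:

```python
import math

def best_class_pair(classes):
    """Pick the pair of classes minimizing the Eq. 3 uncertainty
    U = sqrt(d1^2 + d2^2) / |sin(theta1 - theta2)|, with sigma = 1.

    classes: list of (expected_distance, expected_angle_deg) for each
    non-empty sensor class (illustrative representation).
    Returns (index pair, uncertainty).
    """
    best, best_u = None, float("inf")
    for a in range(len(classes)):
        for b in range(a, len(classes)):
            d1, t1 = classes[a]
            d2, t2 = classes[b]
            s = abs(math.sin(math.radians(t1 - t2)))
            if s == 0:
                continue                  # collinear with target: unobservable
            u = math.sqrt(d1**2 + d2**2) / s
            if u < best_u:
                best, best_u = (a, b), u
    return best, best_u
```

For three classes at (10 m, 0°), (10 m, 90°) and (30 m, 45°), the first two win: their separating angle is 90° and both are close, giving U = √200 ≈ 14.1.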

6 Performance Evaluation

In this section we discuss experiments evaluating our algorithms. We implemented a simulator in Java and tested our algorithms on randomly generated problem instances. We compare the results achieved by the exact and fuzzy location algorithms. We also study the effect on detection quality of the maximum number of sensors a detection task is allowed to receive. Finally, we analyze the effect of the location granularity level on the computational overhead and on the sensors' location privacy.

6.1 Simulation Setup

There are two types of deployed sensors, directional acoustic sensors and imaging sensors, and two task types, detection and localization. A localization task can only utilize acoustic sensors, which must be assigned in pairs. Detection tasks can utilize both sensor types, but to varying effect. Detection means that the beamformed output yields evidence of a target at a given bearing direction; the sensors need not be positioned to provide precise triangulation of the target. Localization, on the other hand, requires the sensors to be positioned so that the triangulation error for the target location is within the bounds dictated by the utility function. The target location uncertainty of a pair of sensors assigned to a task is found using Eq. 3. The detection probability with sensor Si assigned to task Tj is defined as follows:

\[ e_{ij} = \exp\left( \log(P_{FA}) \left( 1 + \frac{SNR_1}{D_{ij}^2} \right)^{-1} \right) \tag{4} \]

where Dij is the distance between the sensor and the task location, PFA is the false alarm probability (a user-chosen parameter), and SNR1 is the normalized signal-to-noise ratio at a distance of one meter from the source signal. This expression results

from analyzing a fluctuating source model embedded in AWGN when a square-law detector is employed [4]. For computational and analytic convenience, we simply approximate eij as zero when Dij exceeds an effective sensing range Rs = 40m. SNR1 was set to 60dB for acoustic sensors and to 66dB for imaging sensors. (Imaging sensors are assumed to have higher SNR due to their higher fidelity and zooming capabilities.) For both types, we set PFA = 0.001. These functions are used only for testing in our experiments and are not properties of our schemes; they are not meant to model the exact behavior of these two types of sensors. In our experiments, 30% of the sensors are imaging and 70% acoustic. Our goal is to maximize the achieved profits from all available tasks, i.e. to maximize ∑j pj uj, where uj is the utility received by task Tj and pj is its profit. The utility achieved by a detection task is the cumulative detection probability (CDP), which is naturally in [0,1]. The utility that a pair of acoustic sensors provides to a localization task depends on the uncertainty level (Eq. 3). We normalize this value to [0,1] by treating an acceptable uncertainty value (an application-specific parameter) as full utility. In our experiments we set this value to 16, which represents an error area 4m in width. Any selected pair with uncertainty under 16 has 100% utility; higher uncertainty means less utility (for example, uncertainty of 64 indicates 25% utility). We deploy 1000 nodes in uniformly random locations in a 400m × 400m field. The communication range of sensors is set to 40m. Tasks are created at uniformly random locations in the field. Localization task profits vary uniformly in [0.1,1]; detection task profits vary uniformly in [0,0.1], on average an order of magnitude lower. We assume that these profits are awarded per unit of time for which a task is active. The maximum possible profit in a time step is the sum of the profits of all tasks active at that time step.
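The detection model of Eq. (4), with the parameter values of this setup, can be sketched as follows (a sketch of our own; we assume the stated dB figures convert to linear SNR as power ratios):

```python
import math

def detection_prob(D, snr1_db=60.0, p_fa=0.001, Rs=40.0):
    """Eq. (4): detection probability of a sensor at distance D (meters)
    from the task, approximated as 0 beyond the effective range Rs.
    Defaults follow the acoustic-sensor simulation setup."""
    if D > Rs:
        return 0.0
    snr1 = 10.0 ** (snr1_db / 10.0)       # dB -> linear power ratio (assumed)
    snr = snr1 / D**2                     # normalized SNR at distance D
    return math.exp(math.log(p_fa) * (1.0 + snr) ** -1)
```

The probability decreases monotonically with distance and stays strictly between 0 and 1 inside the sensing range.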
Task lifetimes are uniformly distributed. Detection tasks, by their nature, last much longer than localization tasks, which are discrete computations typically prompted by particular detected events. Localization task lifetimes vary uniformly between 5 and 30 minutes, whereas detection task lifetimes vary uniformly between 1 and 5 hours. Tasks arrive according to a Poisson process with an average arrival rate of 10 tasks/hour. Mirroring the sensor distribution, 30% of tasks are for localization and 70% for detection. To test our algorithms, we compare their performance with an upper bound on the optimal solution quality: for each currently active task separately, we find the optimal achievable profit for it assuming there are no other tasks in the network, i.e. no competition. The sum of these values provides a (loose) upper bound. In our experiments, we show the average performance of the network over a period of 50 hours; we take the measurements at steady state, after running the algorithms for 10 hours. Each point in the graphs represents the average achieved profit per unit of time as a fraction of the maximum possible profit. The results are averaged over 20 runs.

6.2 Simulation Results

Fig. 2 shows the average performance of the detection tasks. We limit the number of sensors that a task can have to 5 (i.e. N = 5). For Algorithm 1 we set the number of rounds R = N. We compare the results achieved by the exact and fuzzy location algorithms. We vary the distance granularity (DG) from 0 to 7 and observe its effect

[Fig. 2. Detection / Fig. 3. Localization / Fig. 4. Overall Performance: profits (fraction of max) vs. distance granularity, comparing the bound on optimal, the exact location scheme, and fuzzy location with AG = 22.5°, 45°, 90°, and 360°]
on the fuzzy location performance. The achieved profits initially increase rapidly as DG increases, but the increase slows once DG reaches 4. This suggests that the benefit gained from increased granularity may not justify the loss in privacy and the increase in computation cost. By the time DG reaches 7, the fuzzy location scheme performs within 1% of the exact location scheme, which is itself near-optimal. Fig. 3 shows the corresponding results for the localization tasks. We vary both DG and the angle granularity (AG). When AG = 360°, i.e. when all sensors within range are placed in the same class regardless of angle, the performance is lowest, as expected. Achieved profits increase as AG becomes finer, but the increase becomes negligible (less than 1%) once AG is finer than 22.5°. We note that the performance of the exact location scheme is within 6% of the optimal bound, which is worse than in the detection case. This is mainly due to contention between tasks for the same sensing resources; localization is more sensitive to the choice of sensors than detection, since both distance and angle matter. Combining the results of both schemes (Fig. 4), we find that the total network profits reflect both of the previous results. To study the communication overhead of our algorithms we assumed perfect communication channels. We found that the average number of messages exchanged for a detection task using exact location is 114. For fuzzy location we see a saving of about 10% in the number of exchanged messages, due to the clustering of sensors, which makes the algorithm converge faster. The algorithm for target localization requires one round (with possible reassignments) and uses only one type of sensor; hence, its communication overhead is lower, with an average of 62 messages exchanged per task. We also studied the effect of varying the task arrival rate.
Cutting the arrival rate to 5 tasks/hour (and using the previous ratio of task types) leads to a 3% increase in profits for localization tasks using both exact and fuzzy location. This is attributed to lower competition. Detection tasks do not face considerable competition when the arrival rate is set to 10 tasks/hour, and hence we do not see any change in performance when the arrival rate is lowered. Doubling the arrival rate to 20 tasks/hour increases the competition, which leads to lower profits for both task types. Profits from detection tasks decrease by a negligible 1%, whereas localization tasks witness a 4% decrease. Localization is affected more by changing the arrival rate as it is more sensitive to sensor choice. In Fig. 5, the performance of the detection algorithms is measured as N increases from 1 to 10. We use an arrival rate of 10 tasks/hour and fix DG = 5 and AG = 2. Note that a higher value of N means that more sensors can be assigned to each detection task, which will increase CDP. As expected, the profits of the detection tasks increase along with N. The behavior is similar for the exact solution and the upper bound on

Detection and Localization Sensor Assignment with Exact and Fuzzy Locations

[Figure: profits (fraction of max) vs. N (1–10) for the bound on optimal, exact location, and fuzzy location]

Fig. 5. Effect of Varying N

[Figure: number of sensor classes vs. distance granularity (0–7) for AG = 360°, 90°, 45°, and 22.5°]

Fig. 6. Computational Cost

[Figure: privacy factor vs. distance granularity (0–7) for AG = 360°, 90°, 45°, and 22.5°]

Fig. 7. Privacy

the optimal solution. The increase is rapid in the beginning but slows down due to the submodular nature of CDP.

6.3 Analysis of Computational Overhead and Privacy

To analyze the computational overhead, we plot (in Fig. 6) the number of sensor classes as the granularity of angle and distance becomes finer. As we increase the granularity, the number of classes increases as well. The tradeoff between performance and efficiency depends on the number of sensors within sensing range of the task. In our experiments there are on average 31 sensors in that range. For a localization task, if we were to use DG = 3 and AG = 22.5°, we would end up with 32 classes, which is greater than the expected number of sensors surrounding the task. For lower granularities, however, fuzzy location can lower the computation cost. Also, in many cases the generated classes will be empty, meaning that fewer classes need be considered. Note that for tasks that depend only on distance, such as detection tasks, the computational savings can be significant. Fig. 7 shows a privacy metric for the different fuzzy location granularities. Let Ns be the number of nodes within sensing range of the task. We use the fraction of Ns lying in a sector to determine the privacy level that a given fuzziness granularity provides. For example, if this fraction is 1, then a proposing sensor could be any of the Ns sensors, which provides the highest anonymity. If this fraction is 1/Ns, the task leader can be almost certain of the identity of the proposing sensor, since it will often be alone in its sector. We see that although the privacy level stays relatively high when only distance granularity is increased, it decreases rapidly as we divide the circle surrounding the task location more finely. Note that the privacy level is also affected by network density: the more sensors deployed, the higher the value of Ns and hence the better the privacy.
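As an illustration, the sector-anonymity metric described above can be sketched as follows. The function name and the angle-only view of the fuzzy classes are our own simplification for illustration, not code from the paper.

```python
def privacy_factor(sensor_angles_deg, proposer_idx, ag_deg):
    """Fraction of in-range sensors falling in the same angular sector
    as the proposing sensor: 1.0 means full anonymity, 1/Ns means the
    proposer is effectively identified (alone in its sector)."""
    ns = len(sensor_angles_deg)
    sector = int(sensor_angles_deg[proposer_idx] // ag_deg)
    same = sum(1 for a in sensor_angles_deg if int(a // ag_deg) == sector)
    return same / ns
```

With AG = 360° every sensor lands in the single sector, so the factor is 1.0; finer angular granularity shrinks the anonymity set, matching the trend in Fig. 7.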

7 Concluding Remarks

Although in this paper we limited sensors to performing one task at a time, this limitation is not applicable to all domains. For some sensing data types, e.g., ambient temperature, a sensor may be able to serve many tasks at once. In a sensor network, there may, in fact, be sensors of both types. In this paper, however, we focused on the restricted type of sensor, such as directional sensors, since it poses the more difficult problem. In future work, we will consider settings in which sensors of both types are present.

In terms of location privacy, we note that with repeated requests by tasks in the surrounding area of a sensor, an entity can gain more precise information about the sensor's location. This can be learned by considering the intersections of the circles of radius Rs around each task's location. We intend to study such issues in the future. Acknowledgment. This research was sponsored by the U.S. Army Research Laboratory and the U.K. Ministry of Defence and was accomplished under Agreement Number W911NF-06-3-0001. The views and conclusions contained in this document are those of the author(s) and should not be interpreted as representing the official policies, either expressed or implied, of the U.S. Army Research Laboratory, the U.S. Government, the U.K. Ministry of Defence or the U.K. Government. The U.S. and U.K. Governments are authorized to reproduce and distribute reprints for Government purposes notwithstanding any copyright notation hereon.


Minimum Variance Energy Allocation for a Solar-Powered Sensor System

Dong Kun Noh, Lili Wang, Yong Yang, Hieu Khac Le, and Tarek Abdelzaher
Department of Computer Science, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA
{dnoh,liliwang,yang25,hieule2,zaher}@illinois.edu

Abstract. Using solar power in wireless sensor networks (WSNs) requires adaptation to a highly varying energy supply. From an application's perspective, however, it is often preferable to operate at a constant quality level rather than change application behavior frequently. Reconciling the varying supply with a fixed demand requires good tools for predicting the supply, so that its average can be computed and the demand fixed accordingly. In this paper, we describe a probabilistic observation-based model for harvested solar energy, which accounts for both long-term tendencies and temporary environmental conditions. Based on this model, we develop a time-slot-based energy allocation scheme that uses the periodically harvested solar energy optimally while minimizing the variance in energy allocation. Our algorithm is tested on both outdoor and indoor testbeds, demonstrating the efficacy of the approach.

1 Introduction

Nodes in wireless sensor networks (WSNs) usually run on batteries. For applications in which a system is expected to operate for long periods, energy becomes a severe constraint. Much effort has been spent on developing techniques to make more efficient use of limited energy. Recently, environmental energy has emerged as a feasible supplement to battery power for wireless sensor systems where manual recharging or replacement of batteries is not practical. Solar energy is becoming widely used due to its high power density compared to other sources of renewable energy. Solar energy has the following two special properties:

– It is periodic: The sun rises and sets once a day. This is therefore the duration of a charging cycle, which we call a harvesting period. A new supply of solar energy can be expected during every harvesting period.
– It is dynamic: Solar energy varies throughout the day. Commonly, it increases in the morning, decreases in the afternoon, and stays nearly zero during the night. Additionally, it changes from day to day depending on the weather or season.

An optimal energy allocation scheme for a solar-powered sensor node should satisfy the following requirements:

B. Krishnamachari et al. (Eds.): DCOSS 2009, LNCS 5516, pp. 44–57, 2009.
© Springer-Verlag Berlin Heidelberg 2009


– Energy-neutral operation (ENO): The energy input during a harvesting period should not be less than the amount consumed during the same period. Since the energy harvested in one day can vary with environmental conditions, a node should adapt its power consumption rate to the harvested energy.
– Minimizing the waste of harvested energy: Solar energy can be harvested periodically. However, unused residual energy in the battery may prevent newly harvested energy from being stored, due to the limits of battery capacity. Therefore, it is more important to make the best use of harvested energy than to minimize the energy consumed.
– Minimizing variations in the allocated energy: There is no energy to be harvested when the sun is down. However, in many applications, the user wants to collect data at the same rate at all times. Therefore, a node should reserve an adequate amount of energy to operate at a constant level at all times. This requirement is often overlooked in previous adaptive energy allocation schemes and is the main contribution of our new energy allocation algorithm.

In this paper, we present an energy allocation algorithm that achieves optimal use of harvested energy while meeting all of the above requirements. The rest of this paper is organized as follows. In the next section we analyze existing schemes for optimizing energy allocation in solar-powered WSNs. Section 3 describes our expectation model for the solar energy harvest, and also explains a new energy allocation algorithm for optimally using the harvested solar energy. We then give an overview of our experimental testbed in Section 4, and evaluate the performance of our schemes in Section 5. Finally, conclusions are drawn in Section 6.

2 Related Work

In recent years, researchers have become interested in applying solar energy to WSNs. Corke et al. [1] presented hardware design principles for long-term solar-powered wireless sensor networks. Minami et al. [2] designed a battery-less wireless sensor system for environmental monitoring, called SolarBiscuit. Simjee and Chou [3] presented Everlast, a supercapacitor-operated, solar-powered wireless sensor node. Jay et al. [4] described a systematic approach to building micro-solar power subsystems for wireless sensor nodes. However, most research, including the work mentioned above, has focused only on node-level design such as hardware architecture or system analysis. From the viewpoint of energy optimization, the major concern of traditional wireless sensor networks is to minimize energy consumption. Some approaches [5,6,7] focus on routing schemes that consider residual energy when determining the optimal path. Zhao et al. [8] proposed a scheme to balance the residual battery energy across the nodes of a distributed system, and Mini et al. [9] suggested a way of predicting the future energy consumption of each node. These approaches, however, assume a limited energy budget in the battery, and do not consider recharging the battery from another energy source.


An early approach to the utilization of environmental energy for energy-aware routing [10] demonstrated that environmentally aware decisions improve performance compared to decisions based only on battery status, although the application scenario was limited. A solar-aware version of directed diffusion has also been proposed [11]. In this scheme, gradients are used to provide additional information about the state of neighbors, which can help solar-energy-aware path selection. More recently, some researchers have taken harvested energy into consideration in controlling the duty cycle to sustain performance levels. Kansal et al. [12] derived a mathematical result expressing the conditions under which a sensor node can operate perpetually, through an analysis of the relationship between harvested and consumed energy. Similarly, Vigorito et al. [13] proposed controlling an adaptive duty cycle with an algorithm based on adaptive control theory [14]. Their main contribution is a model-free approach to this problem, which can therefore be applied to other forms of environmental energy. Our previous work [15,16] uses solar energy to maximize retrievable data by adaptively controlling data reliability. SolarStore [15] adapts the degree of data replication dynamically, considering both the battery status (capacity and residual energy) and the storage status (capacity and available space). AdaptSens [16] provides a layered architecture to support adaptive reliability depending on the battery status (residual energy): it provides a set of layers that are incrementally activated according to the residual energy level. Both focus on how to achieve the best reliability and the most retrievable data by adapting the performance level to the battery (storage) status. In contrast, this paper focuses on how to use the harvested solar energy maximally and how to allocate it fairly over time, based on an estimation scheme for solar energy harvesting.

3 Optimal Energy Allocation

In this section, we present an energy allocation algorithm that is novel in that it reconciles maximization of harvested energy with minimization of variations in allocated energy. The subsections below (i) formulate the optimal energy allocation problem, (ii) describe our algorithm for solar energy estimation, (iii) present the optimal energy allocation, and (iv) summarize additional considerations regarding the initial battery status.

3.1 Problem Formulation

In our energy system model, we assume that there is an energy buffer, such as a battery, between the solar cell and the sensor node. This buffer helps the system use the harvested energy more efficiently by storing energy temporarily. Assume that the period at which energy is harvested is T (which is 24 hours for solar energy), and that T is divided into sub-blocks (which we will call slots from now on) of equal duration L. The size of L will depend on the system resources and the application requirements. Energy can then be allocated to each slot.


Let E_hrv^j be the amount of energy harvested during slot j, and let E_alloc^j be the amount of energy allocated to slot j. The amount of energy remaining in the battery k slots after t0 can be calculated as follows:

    E_btr(t0 + k·L) = E_btr(t0) + Σ_{j=0}^{k−1} (E_hrv^j − E_alloc^j),    (1)

where E_btr(t) is the amount of residual energy in the battery at time t, and t0 is the start time of slot 0. Since the current battery status depends entirely on the amount of energy harvested and consumed during previous slots, as shown in Equation (1), the energy allocation problem can be formulated as a linear program. The objective function of this program should reflect the requirements of optimal energy allocation: the energy harvested during T should be fully utilized while meeting the ENO condition, and the variation of the allocated energy between slots should be minimized. An optimal allocation of energy can thus be obtained by solving the following linear programming problem:

Find an N-dimensional vector E_alloc = <E_alloc^0, ..., E_alloc^{N−1}> that minimizes

    λ = Var{E_alloc^k}, 0 ≤ k < N    (2)
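The battery evolution of Equation (1), together with an ENO check, can be expressed directly. This is an illustrative reading of the model, not the authors' implementation; in particular, the clipping to [0, capacity] is our own addition, reflecting the battery limits the paper discusses later.

```python
def battery_trace(e0, e_hrv, e_alloc, capacity):
    """Residual battery energy after each slot per Eq. (1), clipped to
    [0, capacity] to model a real (finite) battery."""
    levels = [e0]
    for h, a in zip(e_hrv, e_alloc):
        nxt = levels[-1] + h - a
        levels.append(min(max(nxt, 0), capacity))
    return levels

def is_energy_neutral(e_hrv, e_alloc):
    """ENO condition: energy consumed over the harvesting period must
    not exceed the energy harvested in that period."""
    return sum(e_alloc) <= sum(e_hrv)
```

For example, starting from 10 Ah with harvests [5, 0] and allocations [2, 3], the trace is [10, 13, 10] and the period is energy-neutral.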

Fig. 3. The best energy allocation for slot i at the current time t0 + i·L

by using Equation (5). Based on this result, we want to estimate the optimal allocated energy (E_alloc) for the future N−i slots, from the next slot (slot i) to the final slot of the harvesting period (slot N−1). To do so, the following linear programming problem needs to be solved:

Find an (N−i)-dimensional vector E_alloc = <E_alloc^i, ..., E_alloc^{N−1}> that minimizes

    λ = Var{E_alloc^k}, i ≤ k < N    (6)

We choose E_alloc^i of this vector as the energy budget for the next slot i and allocate this amount of energy to slot i. Note that this result is only the best solution at the current time t. Conditions change from time to time, requiring recalculation at the start of every slot with an updated expectation of harvested energy. Therefore, the optimal energy budget for each slot is determined just before the start of the slot. To sum up, at the start of every slot i (0 ≤ i < N), the system invokes an energy allocation algorithm. This algorithm solves the problem of Equation (6), based on the expectation of harvested energy explained in Equation (5), and determines E_alloc^i for the next slot i. The complexity of this algorithm decreases as we approach the end of the harvesting period T, since the dimension of the vector we must find by solving Equation (6) declines.
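The per-slot re-planning loop described above can be sketched as follows. Note that plan_slot is a deliberately simplified stand-in for the linear program of Equation (6): it merely spreads the expected remaining energy evenly over the remaining slots, capped by the per-slot maximum; the real scheme would solve the variance-minimizing LP with battery constraints.

```python
def plan_slot(residual, expected_hrv, e_max):
    """Simplified stand-in for Eq. (6): spread the expected total
    energy (battery + future harvest) evenly over the remaining slots,
    capped by the per-slot maximum allocation."""
    n = len(expected_hrv)
    budget = (residual + sum(expected_hrv)) / n
    return min(budget, e_max)

def run_period(e0, actual_hrv, forecast, e_max, capacity):
    """Receding horizon: re-plan at the start of every slot with the
    updated battery level, then apply the slot's actual harvest."""
    residual, allocs = e0, []
    for i in range(len(actual_hrv)):
        # never allocate more than what will actually be available
        a = min(plan_slot(residual, forecast[i:], e_max),
                residual + actual_hrv[i])
        allocs.append(a)
        residual = min(residual + actual_hrv[i] - a, capacity)
    return allocs
```

Because the horizon shrinks by one slot per invocation, the planning cost declines toward the end of the harvesting period, mirroring the complexity remark above.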

3.4 Consideration of the Initial Battery Status

Since the allocation of energy by our algorithm takes account of the expected battery status in future slots, it is affected by the initial battery status. In order for our algorithm to achieve the best performance, the initial battery status should satisfy some constraints. There are two possible cases:


– Case 1 – Shortage of battery capacity: the battery is initially almost full and our algorithm starts in the daytime, when a lot of harvested energy is expected during future slots. Since the battery does not have enough space to store this harvested energy, it is lost.
– Case 2 – Shortage of residual energy: the battery is initially almost empty and our algorithm starts at night, when no energy will be harvested during upcoming slots. Since there will be no energy in the battery during these future slots, the energy allocated to them must be zero, and the energy harvested during the daytime is distributed over the remaining slots. This leads to a highly variable allocation of energy across slots.

Let i_s be the set of slots during which the energy harvest is larger than the energy allocation. In order to avoid the shortage of battery capacity that characterizes Case 1, the battery's free capacity should be larger than the following energy surplus:

    E_surplus = Σ_{i ∈ i_s} (E_hrv^i − E_alloc^i).    (7)

Similarly, if this amount of energy is initially stored in the battery, the shortage of residual energy that we see in Case 2 can also be eliminated. Therefore, if the amount of energy initially stored in the battery and the amount of space initially available in the battery are both greater than E_surplus, our algorithm can obtain the best energy allocation regardless of the starting time. However, the problem is that we cannot know the exact amounts of energy harvested (E_hrv^i) and allocated (E_alloc^i) during each of the slots. Therefore, historical information is used again to predict these two quantities. As explained in Section 3.2, the base expectation of energy harvested during each slot (Ē_hrv^i) is maintained by the moving-average algorithm of Equation (3), and the expected energy allocation (Ē_alloc^i) is maintained in the same way. However, when we use expected quantities such as Ē_hrv^i and Ē_alloc^i in Equation (7) to determine the initial battery status, there is inevitably a small error (ε). The value of ε should be determined by experimental observations, since it depends on the environmental conditions of the region where the node is deployed. To sum up, we need E_surplus + ε of both residual energy and available capacity in the initial battery to obtain the most satisfactory results.
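The surplus test of Equation (7), and a standard forgetting-factor update for the per-slot expectation, might look like the following. The exact form of Equation (3) is not shown in this excerpt, so the moving-average update here is an assumed (conventional) form; the function names are our own.

```python
def energy_surplus(exp_hrv, exp_alloc):
    """Eq. (7): sum of (harvest - allocation) over the slots whose
    expected harvest exceeds their expected allocation."""
    return sum(h - a for h, a in zip(exp_hrv, exp_alloc) if h > a)

def initial_battery_ok(residual, capacity, exp_hrv, exp_alloc, eps):
    """Section 3.4: both the stored energy and the free space must
    exceed E_surplus + epsilon for the best allocation."""
    s = energy_surplus(exp_hrv, exp_alloc) + eps
    return residual >= s and (capacity - residual) >= s

def update_expectation(e_bar, observed, theta=0.5):
    """Assumed moving-average form of Eq. (3): blend the old per-slot
    expectation with the newly observed harvest using forgetting
    factor theta."""
    return theta * e_bar + (1.0 - theta) * observed
```

With theta = 0.5 (the value used in the experiments of Section 5), the expectation moves halfway toward each new observation.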

4 Experimental Testbed

We have set up both indoor and outdoor testbeds for solar-powered sensor networks to evaluate the performance of our algorithms. The outdoor testbed provides a realistic environment, but it is difficult to conduct a fair comparison of our algorithms with other schemes, because experimental results on the outdoor testbed greatly depend on unrepeatable environmental conditions, such as weather and season. To avoid this problem, an indoor testbed has been designed.


[Figure: block diagrams of (a) an outdoor node — solar panels (2×105W), charge controller, 12V 98Ah rechargeable battery, load controller, current/voltage measuring components, and a sensing system (EeePC, router, sensor) — and (b) an indoor node, in which the harvesting hardware is replaced by a solar-panel emulator, emulated residual battery, and X10 power on/off controller]

Fig. 4. Architecture of a node in the indoor and outdoor testbeds

It has the same systems as the outdoor testbed, except for the solar harvesting components. This indoor testbed allows us to evaluate the performance of different algorithms with the same solar energy input, which can mimic a wide range of environmental conditions. A description of both testbeds appears in [15,16]; it is briefly summarized below for convenience.

4.1 Outdoor Testbed

We have developed a solar-powered sensor system and deployed nine nodes on a south farm of the University of Illinois at Urbana-Champaign. Fig. 4(a) shows the overall architecture of an outdoor testbed node. As shown in the figure, it consists of two components: an energy harvesting system and a sensor system. The energy harvesting system uses one 12V 98Ah deep-cycle battery. Two 105W solar panels are used to harvest solar energy. However, the 105W output of a solar panel is the theoretical maximum under ideal conditions; the practical output can be much smaller. A charge controller is deployed between the panels and the battery to prevent overcharge and other electrical damage to the battery, by regulating the output current and voltage of the solar panels to meet the requirements of the battery. Similarly, a load controller is deployed to prevent over-discharge of the battery. Additionally, to give energy-related information to the system, voltage and current sensors are attached to the battery. The sensor system uses a low-end laptop (Asus EEE PC) as its computing component, since the testbed is intended to support a wide spectrum of applications, including multimedia sensing applications that require high-performance sensing and processing. The laptop consumes only about 10∼15W, depending on the load. Meanwhile, we chose the Linksys WRT54GL router as the wireless interface; it has low power consumption (6W) and a detachable antenna.

4.2 Indoor Testbed

The sensor system of each indoor node is a clone of that in the outdoor node. However, the energy harvesting system of an outdoor node should be replaced


with a software program. Therefore, we designed a solar energy harvesting emulator, as shown in Fig. 4(b). The charging current from the solar panels is determined by the solar-panel emulator. In our indoor testbed, the solar-panel emulator operates on charging-current traces collected from the outdoor experiment. Meanwhile, the discharge rate of the sensor system can be read from the current sensor on each node. Using these charging and discharging rates, the battery emulator periodically updates its quantity of residual energy. We also mimic the load controller by using X10 [17] modules, which are typically used to control 'smart home' appliances. The X10 module disconnects and reconnects the AC power to a node, depending on the level of residual energy in the emulated battery.
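A minimal sketch of the battery emulator's update loop described above, under our own assumptions about units (amp-hours, steps measured in hours). The class, its cutoff parameter, and the boolean powered/off return value are illustrative, not the testbed's actual code.

```python
class BatteryEmulator:
    """Integrates the trace-driven charge current and the measured
    discharge current, and reports a power-off condition when the
    residual energy reaches the cutoff (the role the X10 module
    plays in the indoor testbed)."""

    def __init__(self, capacity_ah, residual_ah, cutoff_ah=0.0):
        self.capacity = capacity_ah
        self.residual = residual_ah
        self.cutoff = cutoff_ah

    def step(self, charge_a, discharge_a, hours):
        """Advance the emulated battery by `hours` at the given charge
        and discharge currents; returns True while the node should
        stay powered."""
        self.residual += (charge_a - discharge_a) * hours
        self.residual = min(max(self.residual, 0.0), self.capacity)
        return self.residual > self.cutoff
```

Raising the cutoff is also how a larger battery can be made to behave like a smaller one, as done in the experimental setup below.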

5 Performance Evaluation

5.1 Experimental Setup

We ran each indoor experiment for a period of 15 days, using traces collected outdoors from Aug. 1st to Aug. 15th, during which the average harvested energy was 38.8Ah. To show that our algorithm works well even under a harsh battery-capacity constraint, we configured the emulated battery at each indoor node to have a capacity of 30Ah. The capacity of the real battery is 98Ah, but the load controller disconnects the circuit when the residual energy in the battery goes below a certain point, to protect the battery from over-discharging. By controlling this disconnection point, we make the battery operate like a 30Ah battery. Additionally, we assign a different initial battery energy level to each node, evenly distributed between 30% and 70%. The duration of a slot is one hour and the first slot starts at 7:00 a.m. Lastly, we set the maximum amount of energy that can be allocated to each slot (E_alloc^max in Equation (6)) to 2Ah. During the experiments, the application on a node continuously records acoustic data, unless that node is in sleep mode. The forgetting factor θ in Equation (3), which is used to maintain Ē_hrv at each slot, is set to 0.5.

5.2 Experimental Results

To verify the accuracy of our model for energy harvesting, we measured the difference between the expected and actual amounts of harvested energy. Fig. 5(a) shows the average values of this difference over a harvesting period. As shown in this figure, the amount of harvested energy expected by our refined scheme is closer to the actual harvested energy than the base expectation. Fig. 5(b) compares the actual energy harvest with the expectations on both Aug. 5th and Aug. 7th. Based on the results of Fig. 5, we can infer that the error of the base expectation increases when a day's conditions are unusual, such as those of the two days in Fig. 5(b) (Aug. 5th and Aug. 7th were an extremely hot

[Figure: (a) a table of the average expectation error — refined expectation for energy harvest: 1.192 Ah; base expectation for energy harvest: 8.852 Ah — and (b) bar plots of the actual energy harvest, refined expectation, and base expectation (in Ah) on Aug. 5th (sunny day) and Aug. 7th (rainy day)]

Fig. 5. Difference between the expectation and the actual harvested energy for a day

[Figure: (a) the amount of energy (Ah) allocated to each of slots 1–24 by the various allocation schemes and (b) the corresponding trace of residual battery energy (Ah) over slots 1–24]

Fig. 6. Energy allocation and battery status on Aug. 7th (rainy day) in node 9

day and a heavy-rain day, respectively), while this never happened with our refined expectation scheme. This result shows that our expectation scheme works well even in a place where the weather changes frequently. We next conducted experiments to test our energy allocation scheme. We selected the following energy allocation schemes for comparative analysis: (a) the ideal scheme, which determines the amount of energy allocated to all slots at the beginning of the harvesting period by solving Equation (2), under the assumption that the amount of harvested energy is known a priori; (b) the naive scheme, which is the same as the ideal scheme except that it solves Equation (2) with the base expectation (Kansal's duty-cycle control algorithm in [12] is based on this scheme); (c) the greedy scheme, which allocates as much energy as possible to the next slot and is invoked at the beginning of every slot (our previous work [15,16] is based on this scheme); and (d) our scheme with the base expectation for energy harvest, which uses our energy allocation algorithm but with the base expectation (Ē_hrv) instead of our refined expectation of the amount of harvested energy. Fig. 6(a) shows the amount of energy allocated to each slot by the various energy allocation schemes on Aug. 7th at node 9. The naive scheme distributes the allocated energy evenly, but the total amount of energy allocated across slots is much more than the actual energy harvest, due to the error of the expected

[Figure: actual energy harvest, expectation for energy harvest, and allocated energy (Ah, left axis), plus residual battery (Ah, right axis), per slot (1–24) across Aug. 7th (rainy day) and Aug. 8th (sunny day)]

Fig. 7. Adaptive energy allocation of our algorithm in node 9 during two days

Fig. 8. Comparison of the performances related to the duty cycle: the average duty cycle on each slot (with average load), the variation of the duty cycle on each slot (with average load), and the percentage of non-working slots, for the naive scheme with the base expectation, the greedy scheme, our scheme with the base expectation, and our scheme with the refined expectation for energy harvest.

energy harvest. Thus, the residual energy at the final slot is much smaller than that at the starting slot, as shown in Fig. 6(b). This means the node violates the ENO condition, so it cannot be expected to operate stably and continuously. The energy allocated by the greedy scheme has a relatively high deviation, since it tries to allocate as much energy as possible to the next slot without taking the balance into consideration. Moreover, the amount of energy remaining in the battery is zero after slot 18, as shown in Fig. 6(b), so the node has to stay in sleep mode from that time until energy harvesting resumes in the next harvest period. Our scheme with the base expectation (E_hrv) produces an energy allocation with a higher deviation than our scheme with the refined expectation for energy harvest (Ê_hrv), due to the error in the expectation for energy harvest. However, this scheme accounts for the error in a previous slot when determining the energy allocated to the next slot, so the allocated energy gets closer to the ideal allocation as time goes on. Lastly, our scheme with the refined expectation of energy harvest (Ê_hrv) shows the best result, closest to the ideal allocation, while satisfying the ENO condition. Fig. 7 shows the trace of several important values measured during two days at node 9. As shown in this figure, our expectation for energy harvest nearly catches up with the actual energy harvest, taking account of temporal conditions such as


D.K. Noh et al.

weather. In addition, energy is allocated to each slot adaptively according to the energy harvest and the residual battery, while keeping the variance of the energy allocated to each slot low and the ENO condition satisfied. We can see that our algorithm allocates the maximum amount of energy to many slots from the starting slot on Aug. 8th (sunny day), in order to make maximal use of the harvested energy. In other words, since the battery does not have enough space to store all of the surplus energy on that day, the node can minimize the wasted energy by allocating as much energy as possible to each slot before the battery gets full. Finally, we measured the duty cycle of all 9 nodes for 15 days. The performances related to the duty cycle are shown in Fig. 8. Our energy allocation algorithm with our refined expectation of energy harvest shows the best performance in all aspects of the duty cycle, such as the average and the variance. In particular, the number of slots in which the node stays entirely in sleep mode is zero with our energy allocation scheme.
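The greedy scheme ((c) in the comparison above) can be sketched in a few lines. This is a hedged illustration only: the battery capacity, per-slot cap and harvest profile below are hypothetical, not the values used in the testbed.

```python
# Hedged sketch of the "greedy" scheme (c): at the start of each slot, allocate
# as much energy as possible to that slot, limited only by the battery level and
# a per-slot cap. All numeric values below are hypothetical.
CAP, MAX_PER_SLOT = 30.0, 6.0                          # battery capacity, slot cap (Ah)
expected_harvest = [0.0] * 6 + [4.0] * 12 + [0.0] * 6  # 24 slots, daylight in the middle

def greedy_allocation(battery=10.0):
    alloc = []
    for h in expected_harvest:
        battery = min(battery + h, CAP)   # harvest first, clipped at capacity
        a = min(battery, MAX_PER_SLOT)    # then spend as much as possible now
        battery -= a
        alloc.append(a)
    return alloc
```

Because the scheme never looks ahead, it drains the battery early and can leave later slots with nothing, which is the high-variance behavior (and the post-slot-18 sleep) observed in Fig. 6.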

6 Conclusion

In this paper, we have put forward an appropriate model of the solar energy harvested in solar-powered sensor networks, and also developed a practical algorithm to allocate the harvested energy optimally at each time slot, based on this model. The main advantage of the new algorithm is that it minimizes the variation in allocated energy over time, leading to more stable application performance, while at the same time maximizing the amount of energy harvested.

Acknowledgement. This work was funded in part by NSF grants CNS 06-26342, CNS 05-53420, and CNS 05-54759.

References

1. Corke, P., Valencia, P., Sikka, P., Wark, T., Overs, L.: Long-duration solar-powered wireless sensor networks. In: EmNets (2007)
2. Minami, M., Morito, T., Morikawa, H., Aoyama, T.: Solar Biscuit: a battery-less wireless sensor network system for environmental monitoring applications. In: INSS (2005)
3. Simjee, F., Chou, P.H.: Everlast: Long-life, supercapacitor-operated wireless sensor node. In: ISLPED (2006)
4. Taneja, J., Jeong, J., Culler, D.: Design, modeling and capacity planning for micro-solar power sensor networks. In: IPSN (2008)
5. Maleki, M., Dantu, K., Pedram, M.: Lifetime prediction routing in mobile ad hoc networks. In: WCNC (2003)
6. Shah, R.C., Rabaey, J.M.: Energy aware routing for low energy ad hoc sensor networks. In: WCNC (2002)
7. Younis, M., Youssef, M., Arisha, K.: Energy-aware routing in cluster-based sensor networks. In: MASCOT (2002)


8. Zhao, J., Govindan, R., Estrin, D.: Residual energy scans for monitoring wireless sensor networks. In: WCNC (2002)
9. Mini, R.A.F., Nath, B., Loureiro, A.A.F.: A probabilistic approach to predict the energy consumption in wireless sensor networks. In: IV Workshop de Comunicação sem Fio e Computação Móvel, São Paulo (2002)
10. Kansal, A., Srivastava, M.B.: An environmental energy harvesting framework for sensor networks. In: ISLPED (2003)
11. Voigt, T., Ritter, H., Schiller, J.: Utilizing solar power in wireless sensor networks. In: LCN (2003)
12. Kansal, A., Hsu, J., Zahedi, S., Srivastava, M.B.: Power management in energy harvesting sensor networks. ACM Transactions on Embedded Computing Systems 6(4), 1–38 (2007)
13. Vigorito, C.M., Ganesan, D., Barto, A.G.: Adaptive control of duty cycling in energy-harvesting wireless sensor networks. In: SECON (2007)
14. Kumar, P., Varaiya, P.: Stochastic Systems: Estimation, Identification and Adaptive Control. Prentice-Hall, Inc., Englewood Cliffs (1986)
15. Yang, Y., Wang, L., Noh, D.K., Le, H.K., Abdelzaher, T.: SolarStore: Enhancing data reliability in solar-powered storage-centric sensor networks. In: MobiSys (2009)
16. Wang, L., Noh, D.K., Yang, Y., Le, H.K., Abdelzaher, T.: AdaptSens: An adaptive data collection and storage service for solar-powered sensor networks. In: SECON 2009 (in submission) (2009), http://www.cs.uiuc.edu/homes/dnoh/Lili09.pdf
17. X10: Smart Home Controller, http://www.x10.com/

Optimal Rate Allocation of Compressed Data Streams in Multihop Sensor Networks

Chun Lung Lin and Jia Shung Wang

Department of Computer Science, National Tsing Hua University, Hsinchu, Taiwan
[email protected]

Abstract. Consider that a transform coder is installed at each sensor node. In this paper, an analytical framework for optimal rate allocation of compressed data streams in wireless multihop sensor networks is presented. A compact, necessary and sufficient closed-form solution to the optimal rate allocation problem is derived based on the low-rate approximation of the rate distortion function. Extensive simulations were conducted using real-world data traces, the LEM and Intel Lab datasets. The simulation results show that the compression ratios of sensor data can be used to optimally distribute the limited transmission rate among sensors, greatly reducing the amount of transmitted data. Moreover, the performance gain of the optimal rate allocation increases exponentially at lower rates, lower compression ratios of sensor data and larger variation between compression ratios.

Keywords: Wireless multihop sensor networks, optimal rate allocation, rate-distortion function, range query.

1 Introduction

A typical wireless sensor network (WSN) is formed by a large number of small sensors together with several internal gateways and a data collector, referred to as the sink node [1]. These sensors can be deployed to collect physical information and monitor events in regions where traditional wired sensor networks are impractical. WSN technology has many current and envisioned applications, such as surveillance, weather and earthquake monitoring [12], environmental monitoring [11], military vehicle surveillance, health care and so on. Much current research in WSNs has been devoted to reducing the amount of data transferred from sensors to the sink, because transmitting full-resolution data sets is prohibitive due to (i) greatly increased power consumption in wireless communication and (ii) limited bandwidth. In some applications, aggregation has been applied to lessen the total volume of the transmitted data [6], [8], [16]. The aggregation techniques work by summarizing the data coming from multiple source nodes in the form of simple statistics, such as the average, maximum, minimum or median. However, the aggregation techniques only provide rather crude data resolution, and fail to serve applications that require more

B. Krishnamachari et al. (Eds.): DCOSS 2009, LNCS 5516, pp. 58–71, 2009. © Springer-Verlag Berlin Heidelberg 2009

ORA of Compressed Data Streams in Multihop Sensor Networks


Fig. 1. The rate optimization framework for a wireless sensor network

flexible data quality. As an alternative, compression is a less intrusive form of data reduction which can provide higher flexibility of data fidelity [2], [7], [9]. Compression techniques represent the source data more concisely by exploiting statistical redundancy (temporal and spatial), and a trade-off thus arises between the size of the compressed data and its precision. In this paper, we address the rate allocation (distribution) problem in a wireless multihop sensor network. Consider that a transform coder is installed at each sensor node. The Optimal Rate Allocation (ORA) problem is: given a target distortion Dt, find an optimal rate allocation (distribution) among sensors such that the total transmission rate is minimized and the overall distortion (error) of the collected data meets Dt. The objective of the rate allocation problem is to construct an optimized rate distortion function (referred to as the O-RDF) of the whole network system, as shown in Fig. 1. Consider a hierarchical multihop sensor network in which source data is compressed layer by layer, i.e., data compression is allowed at internal nodes of the network (see Fig. 1). The basic assumption of the ORA problem is that the compression ratios of these nodes are distinct from each other, depending on the degree of temporal and spatial redundancy. As such, if some loss of data quality is allowed at each node, then solving the ORA problem involves two nontrivial issues. Firstly, it is necessary to construct the rate distortion functions of these nodes (including external and internal nodes). Usually, the rate distortion functions of source sensor nodes (referred to as T-RDFs) can be empirically estimated using the input source data. On the other hand, estimating the rate distortion functions of internal nodes (referred to as S-RDFs) is more complicated and usually involves higher computation and communication cost.
In this paper, we will present a linear S-RDF construction method and show via extensive simulations that the S-RDF of an internal node can be efficiently estimated by using the T-RDF’s of its child nodes plus a spatial compression coefficient. Secondly, since some loss of quality is


C.L. Lin and J.S. Wang

allowed at every node on a data routing path, an accurate distortion propagation model needs to be devised in order to estimate the overall distortion of the reconstructed data for each feasible rate allocation. The contributions of this paper are summarized as follows:

• This paper presents an architecturally appealing optimization framework, as shown in Fig. 1, that addresses the rate allocation problem at the application layer. The objective of the framework is to provide a fundamental theoretical study of the rate allocation problem as well as of the performance gain of the optimal rate allocation over the uniform allocation.

• An accurate rate distortion model is employed to construct the T-RDFs of sensors (Part A). Under the assumption that the S-RDF of an internal node can be well represented by the T-RDFs of its child nodes, a linear regression-based method is proposed to construct S-RDFs layer by layer (Part B). A distortion propagation model is presented based on the assumption of optimality of the scalar quantizer of transform coding techniques. With the help of the T-RDFs, S-RDFs and the distortion propagation model, a compact, necessary and sufficient closed-form formula for the optimal rate allocation in wireless multihop sensor networks is derived (Part C).

• Extensive simulations were conducted to evaluate the performance gain of the optimal allocation method. The accuracy of the proposed S-RDF construction technique is also evaluated. The simulation results show that the optimal allocation significantly reduces the amount of transmitted data as compared with the uniform allocation. The performance gain of the optimal allocation increases exponentially at lower rates, lower compression ratios of source data, and larger variation between the compression ratios of source data.

There are several applications that can benefit from our techniques.
A prominent example involves habitat, climate, environmental [11], [12] and other scientific monitoring applications [15]. These applications are intended to gather data records for detecting changing data trends, detecting changing behavioral patterns or building models. Our techniques can be useful for gathering the minimum amount of sensor data according to a target distortion (rate) criterion. Our techniques are also useful for providing multi-resolution data collection schemes, like DIMENSIONS [2]. Lower-quality but energy-efficient queries are often sufficient to provide approximate answers to spatial-temporal queries when users have no a priori knowledge about what to look for or where the event of interest is. Using these approximate answers, users (or experts in the field) can efficiently and quickly go through a large number of sensor nodes, thereby choosing small subsets of nodes of interest. Subsequently, higher-overhead queries can be issued, if necessary, to obtain the chosen data at a much better quality level. The rest of the paper is organized as follows. Section 2 formulates the optimal rate allocation problem. In Section 3, an analytical optimization framework for multihop sensor networks is presented. Extensive simulations are given in Section 4. Finally, Section 5 presents concluding remarks.


Fig. 2. An illustration of the rate allocation problem: the rate-distortion curves of two sensors, plotting distortion (MSE) against rate (bits/sample)

2 Problem Statement and Assumptions

Consider a dense wireless multihop sensor network that includes many fixed-location sensors deployed over a field of interest. Each sensor makes a local observation of some underlying physical phenomenon, such as temperature or humidity. We assume a data-logging application, in which sensors are required to send their sensed data constantly at a certain rate. For the sake of simplicity, we assume that each sensor node generates one sample per unit time and, to save power, temporarily stores the sensed samples in a local buffer of size M. When the local buffer becomes full, the buffered samples are compressed using transform coding techniques and the resulting bitstream is transferred to the sink through the wireless network. For convenience, this paper considers a discrete-time model, in which the physical phenomenon is modeled as an i.i.d. discrete-time random vector (with spatial correlation), denoted by Y_i = [y_i1, y_i2, ..., y_iM]^T. We assume that the total number of sensor nodes is N. Let Y'_i = [y'_i1, y'_i2, ..., y'_iM]^T be the reconstructed data vector of Y_i. The two most important factors of transform coding techniques are the source coding rate and the data fidelity (quality). In this paper, source rates are expressed in average bits per source sample, denoted by R̄, which usually determines the total amount of transferred data. The data fidelity, denoted by D, decides how good the collected data is, and is measured by the mean square distortion defined as follows:

$$D(\bar{R}) = \frac{1}{N}\sum_{i=1}^{N} E[\|Y_i - Y_i'\|^2] \qquad (1)$$
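Eq. (1) can be illustrated in a few lines of code; the 0.5-step rounding quantizer below is a stand-in assumption, not the paper's transform coder.

```python
import random

# Illustration of Eq. (1): D is the average, over the N sensors, of the expected
# squared reconstruction error of each M-sample buffer.
def overall_distortion(buffers, reconstructed):
    N = len(buffers)
    return sum(
        sum((a - b) ** 2 for a, b in zip(Y, Y_rec))
        for Y, Y_rec in zip(buffers, reconstructed)
    ) / N

random.seed(0)
N, M = 4, 256
buffers = [[random.gauss(0.0, 1.0) for _ in range(M)] for _ in range(N)]
# stand-in "coder": quantize each sample to the nearest multiple of 0.5
reconstructed = [[round(2 * y) / 2 for y in Y] for Y in buffers]
D = overall_distortion(buffers, reconstructed)
```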

Let R_t denote our target rate. Fig. 2 shows an illustration of the rate allocation problem, in which s_2 has a higher compression ratio than s_1. Assume that our target rate is 0.8 bits/sample, i.e., R_t = 0.8. Clearly, D_1 = 12.5 and D_2 = 2.5 if the uniform allocation is employed, i.e., R̄_1 = R̄_2 = 0.4 bits/sample; therefore, the overall distortion is D = D_1 + D_2 = 15. However, if we give more bits to s_1 and cut down the rate of s_2, such as R̄_1 = 0.6 and R̄_2 = 0.2, then we get D_1 = 5.5 and D_2 = 5, so D = D_1 + D_2 = 10.5, which yields a distortion reduction of 4.5. In this paper, rate allocation means the process by which bit rates are assigned to the sensors. Given a target distortion D_t, the problem of optimal rate allocation can be formulated as follows:

$$\text{Minimize } \sum_{i=1}^{N} \bar{R}_i \quad \text{subject to} \quad D(\bar{R}) \le D_t \qquad (2)$$

where D(R̄) is the distortion function whose argument is the vector of rate allocation coefficients R̄ = [R̄_1, R̄_2, ..., R̄_N]^T. For the non-optimized system (uniform rate allocation), R̄_1 = R̄_2 = ... = R̄_N = R_t/N.

The purpose of this paper is to present an optimization framework for multihop sensor networks with the objective of minimizing the overall distortion (total transfer rate). It provides a fundamental theoretical study of rate (distortion) distribution for applications constrained by limited target rates (distortions). We assume that the energy hole problem (i.e., unbalanced energy utilization) can be mitigated by using existing strategies, such as nonuniform node distribution (adding more nodes to traffic-intensive areas) [18], [19], sink movement [20], [21], hierarchical deployment [17] and dormant-active scheduling [14].
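The arithmetic of the Fig. 2 example above can be restated directly (all values read off the figure):

```python
# Rate-distortion values of the two sensors, read off Fig. 2.
D1 = {0.4: 12.5, 0.6: 5.5}   # distortion of s1 at a given rate (MSE)
D2 = {0.4: 2.5, 0.2: 5.0}    # distortion of s2 (higher compression ratio)

uniform = D1[0.4] + D2[0.4]  # R1 = R2 = 0.4 bits/sample
skewed  = D1[0.6] + D2[0.2]  # R1 = 0.6, R2 = 0.2: same total rate R_t = 0.8
saving  = uniform - skewed   # distortion reduction at equal total rate
```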

3 Optimal Rate Allocation

3.1 Network Architecture and Notations

We address the rate optimization problem in the cluster-based multihop sensor network shown in Fig. 3, in which the sensor nodes (sources) are clustered into groups and each group is coordinated by a group leader. In the following, we use a three-tiered sensor network for ease of presentation (it is easy to extend the presented approach to networks with multi-layer groups). It is assumed that the compressed data streams of sensors are sent to and congregated at their group leaders instead of being transmitted directly to the sink. Once the group leaders have collected all the data streams of their child nodes (group members), they further compress the data by exploiting spatial redundancy. To clearly formulate the rate allocation problem in a cluster-based sensor network, some necessary notation is introduced as follows.

N_g: the number of groups.
g_i: the leader of group i.
n_i: the number of nodes in group i.
s_ij: the j-th child node of g_i, j = 1, ..., n_i.
C_ij, γ_ij: the parameters of the rate distortion function of sensor s_ij.
C_gi, γ_gi: the parameters of the rate distortion function of the group leader g_i.
Y_ij: the random data vector gathered by sensor s_ij.
Y'_ij: the reconstructed data vector of Y_ij with rate R̄_ij.
Y_gi: the collection of the Y'_ij, i.e., Y_gi = {Y'_ij | j = 1, ..., n_i}.

Fig. 3. The cluster-based multihop sensor network: each source s_ij in group i sends its stream (rate R̄_ij, distortion D_ij) to its group leader g_i, which recompresses it (rate R̄_gi, distortion D_gi) and forwards the group stream (rate R̄_i, distortion D_i) to the sink

Y''_ij: the reconstruction of Y'_ij after the group leader's recompression.
Y'_gi: the reconstructed data vector of Y_gi with rate R̄_gi, i.e., Y'_gi = {Y''_ij | j = 1, ..., n_i}.
D_ij: (first-layer distortion) the output distortion of sensor s_ij with rate R̄_ij, i.e., D_ij = E[||Y_ij − Y'_ij||²].
D_gi: (second-layer distortion) the output distortion of group leader i with rate R̄_gi, i.e., D_gi = Σ_{j=1}^{n_i} E[||Y'_ij − Y''_ij||²].
D_i: the output distortion of group i with rate R̄_i, i.e., D_i = Σ_{j=1}^{n_i} E[||Y_ij − Y''_ij||²].

3.2 Construction of the Spatial Rate-Distortion Function (S-RDF)

At lower rates, Mallat and Falzon [3] showed that the rate distortion functions (T-RDFs) of sensors can be approximated by

$$\bar{R}_{ij}(D_{ij}) \approx C_{ij}^{1/(2\gamma_{ij}-1)}\,D_{ij}^{1/(1-2\gamma_{ij})} \qquad (3)$$

where C_ij > 0 and γ_ij > 1/2. The parameters C_ij and γ_ij typically characterize the compression ratio of the source data generated by sensor s_ij (referred to as the compression ratio of the sensor hereafter). At lower rates, the higher the compression ratio is, the smaller the parameters C_ij and γ_ij are. In the following, we present a linear regression-based method for constructing the rate distortion function of group leader g_i (the S-RDF) from the T-RDFs of its child nodes s_ij plus a spatial compression coefficient. Using the low-rate approximation of Eq. (3), we assume that an S-RDF can be modeled as

$$\bar{R}_{gi} = \eta_i \cdot C_{gi}^{1/(2\gamma_{gi}-1)}\,D_{gi}^{1/(1-2\gamma_{gi})} \qquad (4)$$

where γ_gi and C_gi are functions of the γ_ij's and C_ij's, j = 1, ..., n_i, respectively, and η_i > 0 represents the spatial compression coefficient (we discuss this parameter in the next section). Eq. (4) can be rewritten as

$$\bar{R}_{gi} = \eta_i \cdot f_i(C_{i1}, ..., \gamma_{in_i}) \qquad (5)$$

where

$$f_i(C_{i1}, ..., \gamma_{in_i}) = C_{gi}^{1/(2\gamma_{gi}-1)}\,D_{gi}^{1/(1-2\gamma_{gi})} \qquad (6)$$


By taking logarithms on both sides of Eq. (6) and after some manipulation, it follows that

$$\ln D_{gi} = (1 - 2\gamma_{gi}) \ln f_i + \ln C_{gi} \qquad (7)$$

Eq. (7) is a line equation with slope (1 − 2γ_gi) and y-intercept ln C_gi. Likewise, according to Eq. (3), the parameters γ_ij and C_ij define the lines L_j: y_j = (1 − 2γ_ij)x_j + ln C_ij, for j = 1, ..., n_i. Let y = âx + b̂ be the best-fit line for the lines L_j, j = 1, ..., n_i. From Eq. (7), it follows that

$$\gamma_{gi} = \frac{1-\hat{a}}{2} \quad \text{and} \quad C_{gi} = e^{\hat{b}} \qquad (8)$$

Fig. 4 gives a simple illustration of the linear regression-based construction. Theorem 1 states that the linear regression-based construction can be simplified to computing an arithmetic mean and a geometric mean, respectively.

Theorem 1 (S-RDF Construction). By using the least-squares estimator (LSE),

$$\gamma_{gi} = \frac{1}{n_i}\sum_{j=1}^{n_i}\gamma_{ij} \quad \text{and} \quad C_{gi} = \Big(\prod_{j=1}^{n_i} C_{ij}\Big)^{1/n_i} \qquad (9)$$

Proof. Consider the sampling points x_1, ..., x_q, where x_k ≥ 0 for all k ∈ {1, ..., q}. Let y_k = (1/n_i)Σ_{j=1}^{n_i} y_j(x_k), ȳ_j = (1/q)Σ_{k=1}^{q} y_j(x_k) and ȳ = (1/n_i)Σ_{j=1}^{n_i} ȳ_j. By using the least-squares estimator, â and b̂ are computed as follows:

$$\hat{a} = \frac{\sum_{k=1}^{q}(x_k-\bar{x})\big(\tfrac{1}{n_i}\sum_{j=1}^{n_i}y_j(x_k)-\tfrac{1}{n_i}\sum_{j=1}^{n_i}\bar{y}_j\big)}{\sum_{k=1}^{q}(x_k-\bar{x})^2} = \frac{1}{n_i}\sum_{j=1}^{n_i}\frac{\sum_{k=1}^{q}(x_k-\bar{x})(y_j(x_k)-\bar{y}_j)}{\sum_{k=1}^{q}(x_k-\bar{x})^2} = \frac{1}{n_i}\sum_{j=1}^{n_i}(1-2\gamma_{ij}) \qquad (10)$$

$$\hat{b} = \bar{y}-\hat{a}\bar{x} = \frac{1}{n_i}\sum_{j=1}^{n_i}\big(\bar{y}_j-(1-2\gamma_{ij})\bar{x}\big) = \frac{1}{n_i}\sum_{j=1}^{n_i}\ln C_{ij}$$

from which it follows that

$$\gamma_{gi} = \frac{1-\hat{a}}{2} = \frac{1-\frac{1}{n_i}\sum_{j=1}^{n_i}(1-2\gamma_{ij})}{2} = \frac{1}{n_i}\sum_{j=1}^{n_i}\gamma_{ij}$$

$$C_{gi} = e^{\hat{b}} = e^{\ln\left(\prod_{j=1}^{n_i}C_{ij}\right)^{1/n_i}} = \Big(\prod_{j=1}^{n_i}C_{ij}\Big)^{1/n_i} \qquad (11)$$
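Theorem 1 can be checked numerically: averaging the lines L_j pointwise and fitting a least-squares line recovers the arithmetic mean of the γ_ij and the geometric mean of the C_ij. All parameter values below are hypothetical.

```python
import math, random

random.seed(1)
gammas = [random.uniform(0.9, 1.6) for _ in range(8)]   # gamma_j > 1/2
Cs = [random.uniform(0.0005, 0.02) for _ in range(8)]   # C_j > 0
xs = [0.1 * k for k in range(1, 21)]                    # sampling points x_k >= 0

# average of the lines L_j: y_j(x) = (1 - 2*gamma_j)*x + ln(C_j) at each x_k
ys = [sum((1 - 2 * g) * x + math.log(c) for g, c in zip(gammas, Cs)) / len(gammas)
      for x in xs]

# least-squares slope a_hat and intercept b_hat of the averaged points
n = len(xs)
xbar, ybar = sum(xs) / n, sum(ys) / n
a_hat = (sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
         / sum((x - xbar) ** 2 for x in xs))
b_hat = ybar - a_hat * xbar

gamma_g = (1 - a_hat) / 2   # should equal the arithmetic mean of gamma_j
C_g = math.exp(b_hat)       # should equal the geometric mean of C_j
```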

In order to get an analytical solution to the optimal rate allocation problem, it is also necessary to model the distortion propagation when some loss of quality is allowed at the internal group leaders. In Lemma 2, we introduce a linear


Fig. 4. An illustration of the linear regression-based construction of an S-RDF: the lines L_j are averaged at the sampling points x_k, and the best-fit line y = (1 − 2γ_gi)x + ln C_gi is obtained by least squares

distortion propagation model under the assumption that the scalar quantization is optimal. Lemma 1 states that for an optimal scalar quantizer, the quantization error is zero-mean and uncorrelated with the quantizer output [13].

Lemma 1. Let Q denote a scalar quantizer and Y' = Q^{-1}(Q(Y)). If Q is optimal, then E[(Y − Y')] = 0 and E[(Y − Y')Y'] = 0.

Proof. See reference [13].

Lemma 2 (Distortion Propagation Model). By the assumption of the optimality of the scalar quantization,

$$D_i \approx D_{gi} + \sum_{j=1}^{n_i} D_{ij} \qquad (12)$$

Proof. Writing Y'_ij for the sensor-stage reconstruction of Y_ij and Y''_ij for the reconstruction after the group leader's recompression,

$$D_i = \sum_{j=1}^{n_i} E[\|Y_{ij}-Y''_{ij}\|^2] = \sum_{j=1}^{n_i} E[\|Y_{ij}-Y'_{ij}\|^2] + \sum_{j=1}^{n_i} E[\|Y'_{ij}-Y''_{ij}\|^2] + 2\sum_{j=1}^{n_i} E[(Y_{ij}-Y'_{ij})^t(Y'_{ij}-Y''_{ij})] \qquad (13)$$

By Lemma 1, it follows that

$$D_i \approx \sum_{j=1}^{n_i} E[\|Y_{ij}-Y'_{ij}\|^2] + \sum_{j=1}^{n_i} E[\|Y'_{ij}-Y''_{ij}\|^2] = D_{gi} + \sum_{j=1}^{n_i} D_{ij} \qquad (14)$$
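Lemma 2 can be sanity-checked with a toy two-stage quantizer (a stand-in for the paper's transform coders; the step sizes and Gaussian source below are hypothetical): the end-to-end distortion should be close to the sum of the per-stage distortions, the cross term in Eq. (13) being near zero.

```python
import random

random.seed(7)
def quant(v, step):
    """Scalar quantizer: round to the nearest multiple of `step`."""
    return round(v / step) * step

samples = [random.gauss(0.0, 1.0) for _ in range(50000)]
y1 = [quant(y, 0.1) for y in samples]   # first-layer reconstruction (sensor)
y2 = [quant(v, 0.5) for v in y1]        # second-layer reconstruction (group leader)

n = len(samples)
d_total = sum((a - c) ** 2 for a, c in zip(samples, y2)) / n   # end-to-end distortion
d_first = sum((a - b) ** 2 for a, b in zip(samples, y1)) / n   # first-layer distortion
d_second = sum((b - c) ** 2 for b, c in zip(y1, y2)) / n       # second-layer distortion
```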

3.3 Optimal Rate Allocation

The average transmission rate between g_i and the sink can be expressed as

$$\bar{R}_i = n_i \cdot \bar{R}_{gi} \qquad (15)$$


The average transmission rate of group i is

$$\sum_{j=1}^{n_i} \bar{R}_{ij} + n_i \bar{R}_{gi} \qquad (16)$$

Therefore, the average transmission rate of the whole network is

$$R_{total} = \sum_{i=1}^{N_g}\Big(\sum_{j=1}^{n_i} \bar{R}_{ij} + n_i \cdot \bar{R}_{gi}\Big) \qquad (17)$$

Using Eq. (12) as the distortion approximation for each feasible rate allocation, the rate allocation problem in a cluster-based multihop sensor network can be formulated as

$$\text{Minimize } R_{total} \quad \text{subject to} \quad \sum_{i=1}^{N_g}\Big(\sum_{j=1}^{n_i} D_{ij} + D_{gi}\Big) \le D_t \qquad (18)$$

Using the Lagrange multiplier technique, the necessary conditions for optimality are

$$\partial \bar{R}_{ij}/\partial D_{ij} = -\lambda, \quad \text{for } i = 1, ..., N_g \text{ and } j = 1, ..., n_i$$
$$n_i \cdot \partial \bar{R}_{gi}/\partial D_{gi} = -\lambda, \quad \text{for } i = 1, ..., N_g \qquad (19)$$

where λ ≥ 0 is the Lagrange multiplier. According to Eq. (4),

$$\frac{\partial \bar{R}_{ij}}{\partial D_{ij}} = \frac{1}{1-2\gamma_{ij}} \cdot C_{ij}^{1/(2\gamma_{ij}-1)}\,D_{ij}^{2\gamma_{ij}/(1-2\gamma_{ij})} \quad \text{and} \quad \frac{\partial \bar{R}_{gi}}{\partial D_{gi}} = \frac{\eta_i}{1-2\gamma_{gi}} \cdot C_{gi}^{1/(2\gamma_{gi}-1)}\,D_{gi}^{2\gamma_{gi}/(1-2\gamma_{gi})} \qquad (20)$$

Substituting Eq. (20) into Eq. (19) gives the distortions as functions of λ:

$$D_{ij}(\lambda) = \frac{\alpha_{ij}}{2\gamma_{ij}-1} \cdot \lambda^{(1-2\gamma_{ij})/(2\gamma_{ij})} \quad \text{and} \quad D_{gi}(\lambda) = \frac{n_i \eta_i \beta_{gi}}{2\gamma_{gi}-1} \cdot \lambda^{(1-2\gamma_{gi})/(2\gamma_{gi})} \qquad (21)$$

where α_ij = (C_ij(2γ_ij − 1))^{1/(2γ_ij)} and β_gi = (C_gi(2γ_gi − 1)/(n_i η_i))^{1/(2γ_gi)}. By using the total distortion constraint, one obtains

$$f(\lambda) = \sum_{i=1}^{N_g}\Bigg(\frac{n_i \eta_i \beta_{gi}}{2\gamma_{gi}-1} \cdot \lambda^{(1-2\gamma_{gi})/(2\gamma_{gi})} + \sum_{j=1}^{n_i}\frac{\alpha_{ij}}{2\gamma_{ij}-1} \cdot \lambda^{(1-2\gamma_{ij})/(2\gamma_{ij})}\Bigg) - D_t = 0 \qquad (22)$$

Lemma 3 (Uniqueness). The condition f(λ) = 0 is both necessary and sufficient for the optimality of the rate allocation problem in Eq. (18).

Proof. Since γ_ij > 1/2 and C_ij > 0 for all i, j, from Eq. (9) it follows that

$$\gamma_{gi} = \frac{1}{n_i}\sum_{j=1}^{n_i}\gamma_{ij} > \frac{1}{2} \quad \text{and} \quad C_{gi} = \Big(\prod_{j=1}^{n_i} C_{ij}\Big)^{1/n_i} > 0 \qquad (23)$$


It follows that β_gi > 0. The first and second derivatives of f(λ) with respect to λ are

$$\frac{\partial f(\lambda)}{\partial \lambda} = \sum_{i=1}^{N_g}\Bigg(\frac{-n_i \eta_i \beta_{gi}}{2\gamma_{gi}} \cdot \lambda^{(1-4\gamma_{gi})/(2\gamma_{gi})} + \sum_{j=1}^{n_i}\frac{-\alpha_{ij}}{2\gamma_{ij}} \cdot \lambda^{(1-4\gamma_{ij})/(2\gamma_{ij})}\Bigg) < 0$$

$$\frac{\partial^2 f(\lambda)}{\partial \lambda^2} = \sum_{i=1}^{N_g}\Bigg(\frac{n_i \eta_i \beta_{gi}(4\gamma_{gi}-1)}{4\gamma_{gi}^2} \cdot \lambda^{(1-6\gamma_{gi})/(2\gamma_{gi})} + \sum_{j=1}^{n_i}\frac{\alpha_{ij}(4\gamma_{ij}-1)}{4\gamma_{ij}^2} \cdot \lambda^{(1-6\gamma_{ij})/(2\gamma_{ij})}\Bigg) > 0 \qquad (24)$$

Hence, f(λ) is a convex and strictly decreasing function. This concludes the proof.

Theorem 2 (Optimal Rate Allocation). Let C_ij, γ_ij, C_gi and γ_gi be the rate distortion parameters as defined previously. Given the target distortion D_t, the optimal rate allocation of Eq. (18) is

$$\bar{R}_{ij}^{opt} = \alpha_{ij} \cdot (\lambda^*)^{1/(2\gamma_{ij})} \quad \text{and} \quad \bar{R}_{gi}^{opt} = \eta_i \cdot \beta_{gi} \cdot (\lambda^*)^{1/(2\gamma_{gi})} \qquad (25)$$

where λ* is the zero root of f(λ). The minimized average rate is

$$\bar{R}^{opt} = \sum_{i=1}^{N_g}\sum_{j=1}^{n_i}\alpha_{ij} \cdot (\lambda^*)^{1/(2\gamma_{ij})} + \sum_{i=1}^{N_g} n_i \eta_i \cdot \beta_{gi} \cdot (\lambda^*)^{1/(2\gamma_{gi})} \qquad (26)$$

Proof. From Eqs. (4) and (21), the optimal rates are

$$\bar{R}_{ij}^{opt} = C_{ij}^{1/(2\gamma_{ij}-1)}\bigg(\frac{\alpha_{ij}}{2\gamma_{ij}-1}\bigg)^{1/(1-2\gamma_{ij})}(\lambda^*)^{1/(2\gamma_{ij})} = (C_{ij}(2\gamma_{ij}-1))^{1/(2\gamma_{ij}-1)}\,\alpha_{ij}^{1/(1-2\gamma_{ij})}\,(\lambda^*)^{1/(2\gamma_{ij})} = \alpha_{ij}\cdot(\lambda^*)^{1/(2\gamma_{ij})}$$

$$\bar{R}_{gi}^{opt} = \eta_i\,C_{gi}^{1/(2\gamma_{gi}-1)}\bigg(\frac{n_i\eta_i\beta_{gi}}{2\gamma_{gi}-1}\bigg)^{1/(1-2\gamma_{gi})}(\lambda^*)^{1/(2\gamma_{gi})} = \eta_i\bigg(\frac{C_{gi}(2\gamma_{gi}-1)}{n_i\eta_i}\bigg)^{1/(2\gamma_{gi}-1)}\beta_{gi}^{1/(1-2\gamma_{gi})}\,(\lambda^*)^{1/(2\gamma_{gi})} = \eta_i\cdot\beta_{gi}\cdot(\lambda^*)^{1/(2\gamma_{gi})} \qquad (27)$$

From Eq. (17), the minimum average rate is

$$\bar{R}^{opt} = \sum_{i=1}^{N_g}\sum_{j=1}^{n_i}\alpha_{ij} \cdot (\lambda^*)^{1/(2\gamma_{ij})} + \sum_{i=1}^{N_g} n_i \eta_i \cdot \beta_{gi} \cdot (\lambda^*)^{1/(2\gamma_{gi})} \qquad (28)$$

4 Simulation Results

In this section, we present our simulation results.
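The allocation procedure used in the experiments can be sketched directly from the analysis: since f(λ) is convex and strictly decreasing (Lemma 3), its unique zero root λ* can be found by bisection, after which Theorem 2 gives the rates in closed form. The single-group sketch below uses hypothetical parameter values, not those of the simulated network.

```python
import math

# Hypothetical single group: n sensors with T-RDF parameters (C_j, gamma_j).
C = [0.010, 0.004, 0.008]
G = [1.3, 1.5, 1.2]              # gamma_j > 1/2
eta, n, D_t = 1.0, 3, 6.0        # spatial coefficient, group size, target distortion

gamma_g = sum(G) / n                             # Theorem 1 (arithmetic mean)
C_g = math.exp(sum(math.log(c) for c in C) / n)  # Theorem 1 (geometric mean)

alpha = [(c * (2 * g - 1)) ** (1 / (2 * g)) for c, g in zip(C, G)]
beta_g = (C_g * (2 * gamma_g - 1) / (n * eta)) ** (1 / (2 * gamma_g))

def f(lam):  # Eq. (22), specialized to a single group
    s = n * eta * beta_g / (2 * gamma_g - 1) * lam ** ((1 - 2 * gamma_g) / (2 * gamma_g))
    s += sum(a / (2 * g - 1) * lam ** ((1 - 2 * g) / (2 * g)) for a, g in zip(alpha, G))
    return s - D_t

# f is strictly decreasing, so bisect (geometrically) for its unique zero root.
lo, hi = 1e-9, 1e9
for _ in range(200):
    mid = math.sqrt(lo * hi)
    lo, hi = (mid, hi) if f(mid) > 0 else (lo, mid)
lam_star = lo

R_opt = [a * lam_star ** (1 / (2 * g)) for a, g in zip(alpha, G)]  # Eq. (25)
R_g_opt = eta * beta_g * lam_star ** (1 / (2 * gamma_g))
```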

Table 1. The properties of the LEM and Intel Lab data sets

LEM:
  Max(γi) = 1.647, Min(γi) = 1.234, Max(Var(γi)) = 0.025, Min(Var(γi)) = 5.81E-05, Avg(Var(γi)) = 0.006
  Max(Ci) = 0.0195, Min(Ci) = 0.0005, Max(Var(Ci)) = 5.38E-05, Min(Var(Ci)) = 2.05E-07, Avg(Var(Ci)) = 4.79E-06

Intel Lab:
  Max(γi) = 1.476, Min(γi) = 0.977, Max(Var(γi)) = 0.013, Min(Var(γi)) = 0.005, Avg(Var(γi)) = 0.008
  Max(Ci) = 0.0069, Min(Ci) = 0.0002, Max(Var(Ci)) = 1.93E-06, Min(Var(Ci)) = 4.04E-07, Avg(Var(Ci)) = 1.52E-06

4.1 Dataset and Simulation Environment

We simulated a cluster-based multihop sensor network that includes 64 sensor nodes, four group leaders (each connected to 16 sensors) and one sink (see Fig. 3). The sensor readings were simulated using the real temperature traces provided by the Live from Earth and Mars (LEM) project [4] at the University of Washington and by the Intel Berkeley Research lab [5]. We used the temperature traces in the LEM dataset logged from January 2006 to January 2007, extracting many subtraces starting at different dates. Subtraces starting at successive dates are similar, so to simulate the spatial correlation of sensor readings, subtraces starting at successive dates were assigned to neighboring nodes in the simulated network. The summarized properties of the LEM and Intel datasets are shown in Table 1. The buffer size of the sensors was set to 256 samples, i.e., M = 256. Embedded zerotree wavelet (EZW) coding was implemented at each sensor (and gateway) to encode the buffered data [10].

4.2 Performance Gain Evaluation

Firstly, we study the data rate reduction (data-rate saving) of the optimal allocation as compared to the uniform allocation. We define the performance measure, referred to as the rate gain, as follows:

$$R_{gain} = \frac{\sum_{i=1}^{N_g}\sum_{j=1}^{n_i}\bar{R}_{ij}^{u} + \sum_{i=1}^{N_g} n_i \cdot \bar{R}_{gi}^{u}}{\sum_{i=1}^{N_g}\sum_{j=1}^{n_i}\bar{R}_{ij}^{opt} + \sum_{i=1}^{N_g} n_i \cdot \bar{R}_{gi}^{opt}} \qquad (29)$$

where R̄^u_ij and R̄^opt_ij are the resulting rates of node s_ij under the uniform allocation and the optimal allocation respectively, constrained to the same target distortion. Similarly, R̄^u_gi and R̄^opt_gi are the resulting rates of g_i. The rate gain R_gain indicates the ratio of the average transmission rate of the uniform allocation to that of the optimal allocation. The rate gain of the optimal allocation is depicted in Fig. 5(a). It can be seen that the gain is a monotonically decreasing function of the target distortion. In


Fig. 5. The performance gain of the optimal allocation over the uniform allocation. (a) the performance gain. (b) the spatial correlation coefficient.

other words, the gain of the optimal allocation is more significant at lower target distortions. This is reasonable because the resulting rate is typically an exponentially decreasing function of the distortion: at small distortions, a small change in distortion brings a greater fluctuation in rate. This result can also be explained by Fig. 2, from which we can observe that when the distortion is small, the resulting rate of each sensor varies greatly under the same distortion constraint. On the other hand, when the distortion is large enough, there is little difference between the resulting rates of the sensors. Fig. 5(a) also shows that the optimal allocation has a larger gain on the LEM dataset than on the Intel dataset. The reason is that the compression ratio of the Intel dataset is typically better than that of the LEM dataset, as illustrated in Table 1. In the above simulations, the coefficients ηi were set to be equal, i.e., η = ηi for 1 ≤ i ≤ 4. The optimum coefficient η was empirically tuned and the collected samples are depicted in Fig. 5(b). Fig. 5(a) and Fig. 5(b) show that the larger the coefficient η is, the better the rate gain is. Moreover, the LEM dataset generally has larger η values than the Intel dataset. This is because the Intel dataset generally has higher spatial correlation than the LEM data, i.e., the compression ratio of the Intel dataset is better than that of the LEM dataset at the internal nodes. This gives another explanation of why the optimal allocation has a better rate gain on the LEM dataset than on the Intel dataset. Another interesting insight is that the η values of the Intel and LEM datasets decrease as the target distortion increases. This result hints that cutting down the total rate raises the spatial correlation of the sensor data: when the total rate is small, the temperature values are quantized to almost the same value, which increases the spatial correlation of the sensor data.

4.3 Accuracy of Distortion Propagation Model

The distortion propagation model is also evaluated by comparing the real distortion of the collected data streams with the target distortion. The simulation results are shown in Fig. 6. On average, the real distortion is very close to the target distortion, and the standard error (S.E.) is smaller at lower target distortion. This result shows that the proposed distortion propagation model is more


C.L. Lin and J.S. Wang


Fig. 6. The accuracy of the proposed distortion propagation model. (a) Intel Lab data. (b) LEM data.

accurate at lower target distortion; even at higher target distortion, however, the difference between the real distortion generated by the optimal allocation and the target distortion is still small. For example, the overall distortion produced by the optimal allocation differs from the target distortion by at most 1.5 MSE when the target distortion is set to 10 MSE.

5 Conclusion

This paper presents an analytical and architecturally appealing optimization framework that provides the optimal system benefit, in terms of the overall rate (quality) gain over the uniform allocation, for a wireless multihop sensor network. The simulation results show that the optimal allocation achieves a better total rate gain at smaller target distortion (and a better quality gain at lower target rate). Moreover, the gain of the optimal rate allocation is more significant when the sources of the target applications have higher compression ratios or higher variation in their compression ratios. The simulation results also indicate that the S-RDFs of the network can be effectively estimated by using the T-RDFs.


ORA of Compressed Data Streams in Multihop Sensor Networks



Mote-Based Online Anomaly Detection Using Echo State Networks

Marcus Chang¹, Andreas Terzis², and Philippe Bonnet¹

¹ Dept. of Computer Science, University of Copenhagen, Copenhagen, Denmark
² Dept. of Computer Science, Johns Hopkins University, Baltimore, MD, USA

Abstract. Sensor networks deployed for scientific data acquisition must inspect measurements for faults and events of interest. Doing so is crucial to ensure the relevance and correctness of the collected data. In this work we unify fault and event detection under a general anomaly detection framework. We use machine learning techniques to classify measurements that resemble a training set as normal and measurements that significantly deviate from that set as anomalies. Furthermore, we aim at an anomaly detection framework that can be implemented on motes, thereby allowing them to continue collecting scientifically-relevant data even in the absence of network connectivity. The general consensus thus far has been that learning-based techniques are too resource intensive to be implemented on mote-class devices. In this paper, we challenge this belief. We implement an anomaly detection algorithm using Echo State Networks (ESN), a family of sparse neural networks, on a mote-class device and show that its accuracy is comparable to a PC-based implementation. Furthermore, we show that ESNs detect more faults and have fewer false positives than rule-based fault detection mechanisms. More importantly, while rule-based fault detection algorithms generate false negatives and misclassify events as faults, ESNs are general, correctly identifying a wide variety of anomalies. Keywords: Anomaly detection, Real-time, Wireless Sensor Networks.

1 Introduction

Sensor networks deployed to collect scientific data (e.g., [1,2,3]) have shown that field measurements are plagued with measurement faults. These faults must be detected to prevent pollution of the experiment and waste of network resources. At the same time, networks should autonomously adapt to sensed events, for example by increasing their sampling rate or raising alarms. Events in this context are measurements that deviate from "normal" data patterns, yet represent features of the underlying phenomenon; one such example would be rain events in the case of soil moisture. The problem is that algorithms which classify measurements that deviate from the recent past as faulty tend to misclassify events as faults [4]. This behavior is undesirable because, unlike faults, which must be discarded, events are the most important data that a mote collects, as they inform scientists about the characteristics of the observed environment. Furthermore, detection algorithms tailored to specific types of faults lead to false positives when exposed to multiple types of faults [4].

In this work we unify fault and event detection under a more general anomaly detection framework, in which online algorithms classify measurements that significantly deviate from a learned model of the data as anomalies. By including punctuated yet infrequent events in the training set we avoid the misclassification problem mentioned above, thus allowing the system to distinguish faults from events of interest. More importantly, this learning-based technique can effectively detect measurement sequences that contain multiple categories of anomalies that do not exist in the training data.

Obviously, anomaly detection can and should also be done on a gateway that correlates data from multiple sensors. Nonetheless, we claim that online detection on motes is also very much relevant. We motivate this need through an example derived from one of our ongoing projects [5]. Consider a set of motes deployed under the surface of a lake with limited physical access. These motes are connected to a floating buoy via acoustic modems which can be non-functional over long periods of time, either because the buoy is out of communication range or due to background noise in the lake. The motes should be able to autonomously alter their sensing behavior depending on whether the collected measurements are seemingly faulty or correspond to interesting events. For example, faulty measurements should be replaced in a timely manner by new measurements, while interesting events should trigger sampling rate increases.

In summary, the contributions of this paper are as follows: (1) we develop an anomaly detection framework based on the Echo State Network (ESN) [6]; (2) we implement this framework on a mote-class device; (3) we quantitatively compare the ESN with two rule-based fault detection techniques. Specifically, we show that an ESN small enough to function alongside a fully-functional environmental monitoring mote application is still more sensitive to subtler faults and generates fewer false positives than the two rule-based fault detection techniques.

B. Krishnamachari et al. (Eds.): DCOSS 2009, LNCS 5516, pp. 72–86, 2009.
© Springer-Verlag Berlin Heidelberg 2009

2 Related Work

Anomaly characterization and detection has received significant attention in the sensor network community, yielding a broad range of algorithmic approaches. Probabilistic Principal Component Analysis [7], geometric algorithms [8], and Support Vector Machines [9] detect anomalies by partitioning data into subsets and subsequently identifying outliers. However, the temporal relations among data points are lost in such partitioning. We seek a solution that not only considers each data point in isolation, but also the context in which it appears. Rashidi et al. recast the problem above as a pattern recognition problem and built a framework for pattern mining and detection [10], while Römer used conditional rules to define anomalies [11]. However, neither of these solutions operates directly on raw measurements. Rather, they rely on simple rules and


thresholds to annotate the data with descriptive labels. The accuracy of both methods thereby depends on those labeling algorithms.

Sensor networks have extensively used rule- and threshold-based anomaly detection schemes due to their simplicity. For example, Werner-Allen et al. used threshold rules over Exponentially Weighted Moving Averages (EWMA) to detect seismological events [12], while analogous threshold techniques have been used to detect cane toads [13] and vehicles [14]. In the context of environmental monitoring, Sharma et al. proposed two rules to detect faults commonly observed by such applications: Short faults, defined as drastic differences between two sequential measurements, and Noise faults, defined as periods during which measurements exhibit larger than normal variations [15]. To detect the former, the Short rule compares two adjacent data points and classifies the more recent as faulty when the difference is above a certain threshold. To detect the latter, the Noise rule considers a sliding window of measurements and flags all measurements in the window as faulty if the standard deviation is above a certain threshold. While these detection schemes are very resource efficient, their effectiveness is limited. For example, Werner-Allen et al. estimated the accuracy of their detection technique to be as low as 5%-29% [12]. Moreover, these schemes also suffer from inherent misclassification problems [4]. We thus seek a solution based on machine learning.

The use of machine learning as an anomaly detection tool has been proposed in the context of WSNs. For example, Echo State (neural) Networks [16] and Bayesian Networks [17] have been proposed for offline gas monitoring, while Kalman filters have been used for offline sow monitoring [18]. Bokareva and Bulusu used a Competitive Learning Neural Network (CLNN) for online classification [19]. However, the neural network was implemented on a Stargate gateway rather than a mote-class device.
We bridge the gap between online detection and machine learning by implementing a learning-based technique on a mote.
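As an illustration, the two rules from [15] can be sketched as follows; the thresholds and window sizes here are arbitrary placeholders, not the tuned values from that work:

```python
import statistics

def short_rule(samples, threshold):
    """Short rule: flag sample i as faulty when it differs from its
    predecessor by more than `threshold`."""
    return [i for i in range(1, len(samples))
            if abs(samples[i] - samples[i - 1]) > threshold]

def noise_rule(samples, window, threshold):
    """Noise rule: flag every sample of a sliding window whose standard
    deviation exceeds `threshold`."""
    flagged = set()
    for start in range(len(samples) - window + 1):
        if statistics.stdev(samples[start:start + window]) > threshold:
            flagged.update(range(start, start + window))
    return sorted(flagged)
```

Note that a genuine step change (e.g., the onset of a rain event) triggers the Short rule just like a fault does, which is exactly the misclassification problem discussed above.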

3 Machine Learning

We propose a classification mechanism that accepts measurements matching a model as valid and rejects everything else as anomalies, where we define anomalies as the measurements that significantly deviate from the learned data. We rely on machine learning techniques to define our classification model and focus on supervised learning because our scientific partners are able to provide training sets that correspond to the data they expect. Such data sets include previously collected measurements and synthetic data generated by analytical models. What learning technique should we choose to achieve both accurate anomaly detection and an efficient implementation on a mote-class device? We rule out Kalman filters, because they base each prediction on a sliding window of observed values instead of a compact model of learned data. Likewise, a Bayesian network's graph reduction operation (which is NP-complete) and the modeling of probability functions (typically Gaussians) discourage its use on resource-constrained devices. Consequently, we decided to use an ESN to meet our requirements in terms of classification efficiency (i.e., minimize false classifications) and resource use (i.e., minimize CPU, RAM, ROM, and energy usage).

3.1 Neural Networks

A neural network can be informally considered an approximation function. Specifically, when presented with a subset of the original function's value pairs during the training stage, the neural network generalizes over these data and approximates the outcome of the original function in the prediction stage. Formally, a neural network is a weighted directed graph where each vertex represents a neuron. We consider discrete-time networks consisting of K input neurons, N hidden neurons, and L output neurons. The input neurons act as sources and the output neurons as sinks. The value of neuron j is given by vj = A(Σi wij vi), where vi is the output of neuron i, wij is the weight of the edge connecting neuron i to j, and A() is the activation function. This function is typically tanh() or a similar function. The training stage consists of adjusting the network's weights to approximate its output signal to the training signal.

Echo State Networks. In an ESN, all neurons are interconnected (but can have zero-weighted edges), meaning cycles involving one or more neurons are allowed. This gives each neuron the capability to remember, adding memory to the network as a whole. All the neurons' connections, directions, and weights are generated randomly and do not change, except for the output weights, which are adjusted during training. The neurons thus act as a black box referred to as the Dynamic Reservoir (DR). This property reduces the learning algorithm to a simple linear regression. According to the Echo State property [6], the DR contains a set of basis states, and by adjusting the output weights it is possible to capture the 'echoes' of real states as linear combinations of these basis states. Although the DR is randomly generated, Jaeger proved that it is possible to ensure that the DR indeed has the Echo State property by enforcing certain conditions [6].
One such condition is that the DR must be sparsely connected, i.e., only 10% of all possible connections are actually active.

Anomaly Detection. We use ESNs to determine whether sensor readings are anomalous by comparing the ESN predictions to the actual measurements. In order to quantify the prediction error we look at the absolute difference between the measurements (M) and the predictions (P), i.e., δ = |M − P|. This difference should ideally be close to zero for normal data, while anomalous data should result in large differences (peaks). In other words, the ESN transforms the original time series into one whose values are ≈ 0 most of the time, corresponding to the expected data. Anomaly detection thus reduces to recognizing the peaks in the transformed signal. We can then use pattern matching algorithms based on simple thresholds, which have proven both efficient and effective for such simple signals.
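The scheme just described can be sketched with NumPy as follows. The reservoir size, input scaling, spectral-radius factor, stand-in training signal, and detection threshold are all illustrative assumptions; the authors' mote implementation uses fixed C arrays and single-precision floats instead.

```python
import numpy as np

rng = np.random.default_rng(0)

N = 50                                     # reservoir neurons
# Sparse reservoir: roughly 10% of the connections are active; scaling
# the spectral radius below 1 helps enforce the Echo State property.
W = rng.uniform(-1.0, 1.0, (N, N)) * (rng.random((N, N)) < 0.10)
W *= 0.9 / np.max(np.abs(np.linalg.eigvals(W)))
W_in = rng.uniform(-1.0, 1.0, (N, 2))      # inputs: [constant bias, u(t)]

def run_reservoir(u):
    """Drive the reservoir with the input sequence u and collect states."""
    x = np.zeros(N)
    states = np.empty((len(u), N))
    for t, v in enumerate(u):
        x = np.tanh(W_in @ np.array([1.0, v]) + W @ x)
        states[t] = x
    return states

# Training reduces to linear regression: fit output weights so the
# readout of state(t) predicts the next measurement u(t+1).
train = np.sin(np.arange(400) * 0.2)       # stand-in training signal
w_out = np.linalg.lstsq(run_reservoir(train[:-1]), train[1:], rcond=None)[0]

def detect(measurements, threshold=0.5):
    """Flag indices where delta = |measurement - prediction| has a peak."""
    pred = run_reservoir(measurements[:-1]) @ w_out
    delta = np.abs(measurements[1:] - pred)
    return np.where(delta > threshold)[0] + 1
```

A spike added to an otherwise familiar signal then shows up as a peak in δ and is flagged, while the unmodified signal is not.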


NRMSD of the four prediction errors: δ lab/tanh = 0.15448%, δ mote/tanh = 0.15470%, δ lab/tl = 0.5855014%, δ mote/tl = 0.5855008%
Fig. 1. (a) Q-Q plot of δ lab/tanh and δ mote/tanh . (b) Q-Q plot of δ lab/tl and δ mote/tl .

3.2 Discussion

The decoupling of the DR from the output weights enables several optimizations that fit WSNs particularly well. For example, the same DR can be used for multiple tasks by storing task-specific output weights, and post-deployment updating can be done without transmitting the entire DR. Also, the requirement that the DR be sparsely connected, combined with the use of sparse matrix algebra, allows the implementation of ESNs that are larger than regular Feed Forward networks.

A limitation that ESNs share with all learning algorithms is their dependence on training data. In particular, if these data do not represent what domain scientists deem as "normal", the predictions will be useless. Therefore, the choice of training sets, and more interestingly the choice of classification technique based on the available training sets, is a very interesting open problem, which is beyond the scope of this paper. We just note that a key issue in successfully deploying an ESN lies in the choice and availability of training data. For example, adjusting the sampling rate in an adaptive sampling environment can change the properties of the measurement time series and thus possibly invalidate the training set. This issue can, however, be remedied by storing different output weights for each sampling rate, or by disregarding higher sampling rates when applying the ESN detection.

On the positive side, ESNs have the ability to generalize over the training data. In other words, ESNs base their predictions on the trends of the presented data rather than exact values. This feature allows motes deployed in similar regions to share the same training data instead of requiring mote-specific training sets.

4 ESN on a Mote

4.1 Implementation

While we create and train the ESNs offline, a complete ESN (including the network’s activation function, output weights, and the DR) is included in the application that runs on the mote. We use TinyOS 2.x to ensure portability

Fig. 2. (a) ROM footprints for the tanh() and tanhlike functions. (b) Total ROM footprint for an ESN using the custom tanhlike activation function.

to a broad range of mote-class devices. Our implementation, publicly available for download at [20], focuses on feasibility and efficiency: the ESN must be able to fit in memory and the algorithm must be fast enough to maintain the desired sampling rate. We present the following three optimizations to improve performance along these two axes.

Sparse Matrix Algebra. The size of the DR's weight matrix grows quadratically with the number of neurons in the reservoir n. However, only 10% of these elements are non-zero because the DR must possess the Echo State property. We leverage this feature by storing the matrix using Compressed Row Storage [21], which only stores the non-zero elements and the layout of the matrix. This reduces the necessary storage from O(n^2) to O(2nz + n + 1), where nz is the number of non-zero elements. This technique also reduces the number of operations needed to perform matrix multiplications by a similar factor, since only non-zero elements are considered.

Single Floating Point Precision. Most mote-class devices rely on software-emulated floating point operations due to the lack of dedicated hardware. This contributes to both the storage and runtime overheads. At the cost of reduced floating point precision, we choose to store and compute all values using single instead of double floating point precision. Doing so halves the size of all the weight matrices and reduces the number of emulated floating point operations needed. As we later show, the resulting loss of precision is tolerable.

Tanhlike Activation Function. Because the activation function has to be applied to all the neurons in every iteration, it is important to choose an efficient function. At the same time, choosing a suboptimal activation function can significantly degrade the ESN's output quality. The algorithm for the often-used hyperbolic tangent, tanh(), has high complexity, requiring both a large amount of storage and significant processing time.
Because of these shortcomings, [22] proposed the approximate function

    TL(x) = sign(x) · [1 − (1 + (⌊2^n|x|⌋ − 2^n|x|)/2) · 2^(−⌊2^n|x|⌋)]

Fig. 3. Total execution cost of one ESN iteration divided into three components. (a) Using the GCC built-in tanh() activation function. (b) Using the custom tanhlike activation function.

where n ∈ ℤ determines the steepness of the function. This tanhlike function has properties similar to tanh() (when n = 1) but with far lower complexity. However, it is also a non-differentiable, piecewise-linear function because of the rounding operations (⌊·⌋). Therefore, we expect the quality of the ESN's output to be lower than when using tanh(), because small changes in input can result in large changes in output if these changes happen across a linear junction.
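A sketch of such a piecewise-linear tanh approximation follows; the exact constants are our reconstruction of the formula above, not verbatim from [22]:

```python
import math

def tanhlike(x, n=1):
    """Piecewise-linear tanh approximation; the constants follow our
    reconstruction of the formula above, not verbatim [22]."""
    s = 1.0 if x >= 0 else -1.0
    a = (2 ** n) * abs(x)          # scaled magnitude
    f = math.floor(a)              # the rounding that causes the junctions
    return s * (1.0 - (1.0 - (a - f) / 2.0) * 2.0 ** (-f))
```

With n = 1 this version stays within a few percent of tanh() while needing only a floor and power-of-two scalings, which is why such an approximation is far cheaper on a mote without a floating point unit.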

4.2 Evaluation

We verify that our ESN implementation indeed performs well on a mote-class device by comparing its output to a reference ESN running on a PC. We consider ESNs which consist of two input signals (with one of the input signals held at a constant bias value in order to improve performance [23]), a 10-400 neuron reservoir, and one output signal (i.e., K = 2, N = 10-400, and L = 1). All mote experiments are carried out on a TelosB mote [24], running TinyOS 2.x with the clock frequency set to the default speed of 4 MHz [25]. Data sets are stored in ROM, with measurements read at a fixed frequency to simulate sensor sampling. We use Matlab R2007a with the Matlab toolbox for ESNs [26] as our reference implementation. We use the Mackey-Glass (MG) time series with a delay τ = 17 [27] to evaluate our ESN implementation. This system is commonly used to benchmark time series prediction methods because of its chaotic nature.

Sanity Check. We created a MG time series with 4,000 samples and used the first 2,000 samples to train a 50-neuron ESN and the next 1,000 samples for initialization, while the last 1,000 samples were used as the prediction vector MG. Both the tanh() and tanhlike activation functions were used, resulting in four different predictions: P lab/tanh, P mote/tanh, P lab/tl, and P mote/tl. We compute the four prediction errors and normalized root-mean-squared deviations (NRMSD). Figure 1 presents the Q-Q plots [28] of the prediction errors grouped by activation function. Since the NRMSDs from the same activation function are almost identical and the points in the Q-Q plots lie on a straight line with slope one, we conclude that the TelosB ESN implementation has the same accuracy as the one


Fig. 4. NRMSD(δ) for different reservoir sizes and activation functions


Fig. 5. Relation between measurements (middle plot), prediction errors (bottom plot), and injected/detected anomalies (top X/O markers)

in Matlab. Also, with an NRMSD less than 1% we see that the 50-neuron ESN is indeed capable of tracking the MG time series. However, the choice of activation function has a significant impact on the accuracy of the predictions, with tanh() being four times more accurate than the tanhlike function. This supports our claim that the piecewise-linearity of the tanhlike function degrades performance.

In order to compare the double precision floating point in Matlab with the single precision floating point on the TelosB, we look at the differences between predictions from the former and the latter when using the same activation function, i.e., δ tanh = P lab/tanh − P mote/tanh and δ tl = P lab/tl − P mote/tl. We compute the NRMSDs for both error distributions: NRMSD(δ tanh) = 6.6·10^−3 % and NRMSD(δ tl) = 1.3·10^−4 %. Because NRMSD(δ tanh) < NRMSD(δ lab/tanh) and NRMSD(δ tl) < NRMSD(δ lab/tl), the errors caused by using single precision floating point are smaller than the errors caused by the ESN predictions. Thus, using single precision floating point on the TelosB is sufficient.

Performance. In order to explore the implementation's characteristics, such as ROM footprint, runtime speed, and accuracy, we vary the number of neurons in the DR. The ROM usage can be divided into two components: (1) Framework, the ESN algorithm used for prediction; (2) Weight Matrices, the DR and output weights. Whereas (1) is constant, (2) depends on the number of neurons in the reservoir. Figure 2a presents the ROM size difference for the two activation functions and Figure 2b shows the ROM footprint of the aforementioned components (using tanhlike). We observe that the memory contribution from the reservoir grows linearly, confirming the storage requirement of the Compressed Row Storage (O(2nz + n + 1)). Also, the ROM footprint is 1,806 bytes for tanh() and 368 bytes for tanhlike, making the former five times larger than the latter.
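The Compressed Row Storage layout whose O(2nz + n + 1) footprint is confirmed above can be sketched as follows (a generic illustration in Python, not the authors' TinyOS code):

```python
def to_crs(dense):
    """Compressed Row Storage: keep only the non-zero values, their column
    indices, and per-row offsets, i.e., O(2*nz + n + 1) storage."""
    val, col_ind, row_ptr = [], [], [0]
    for row in dense:
        for j, a in enumerate(row):
            if a != 0.0:
                val.append(a)
                col_ind.append(j)
        row_ptr.append(len(val))
    return val, col_ind, row_ptr

def crs_matvec(val, col_ind, row_ptr, x):
    """y = A @ x, touching only the stored non-zero elements."""
    y = [0.0] * (len(row_ptr) - 1)
    for i in range(len(y)):
        for k in range(row_ptr[i], row_ptr[i + 1]):
            y[i] += val[k] * x[col_ind[k]]
    return y
```

Because the inner loop iterates only over stored non-zeros, the matrix-vector product in each ESN step costs O(nz) instead of O(n^2), the "similar factor" savings mentioned above.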
Next we measure the runtime cost of the ESN implementation. For each iteration, the ESN prediction algorithm performs the following set of operations: (1) Matrix, matrix-vector multiplication. (2) Activation Function, application of the activation function. (3) Output, vector-vector multiplication. Figure 3 summarizes the execution time of one prediction step and the contributions from each of the three operations. Surprisingly, the tanh() activation function is the


most expensive operation and not the matrix-vector multiplication. It takes 28% longer to run than the matrix-vector multiplication and 453% longer than the tanhlike activation function.

Finally, we look at the prediction error as a function of reservoir size and activation function. We compare against the MG time series and find the NRMSD(δ) for the six reservoirs and two activation functions used above. Figure 4 presents the results of this comparison. As expected, the prediction error decreases as the reservoir size increases, and the tanh() activation function leads to more accurate predictions in general. Upon closer inspection, there appear to be three distinct regions relative to the reservoir size: small (10 neurons), medium (50-300 neurons), and large (300-400 neurons). In the small region, the prediction error is dominated by the small size of the reservoir and the choice of activation function becomes less important. In the medium region there is a diminishing, yet clear reduction of the prediction error as the reservoir size increases. Finally, in the large region the prediction error does not decrease by adding neurons to the reservoir. Interestingly, the largest contribution to the prediction error comes from the activation function, with no overlap of prediction errors for the 50-400 neuron reservoirs. In fact, even the 50-neuron tanh() reservoir outperforms the 400-neuron tanhlike reservoir.

5 Evaluation

5.1 Experimental Design

The results from the previous section suggest that an ESN can be accurate, small, and fast enough to be incorporated into an existing data collection application that has been actively deployed for the past three years [1]. Motes in these sensor networks collect soil temperature and soil moisture readings every 20 minutes and store them to their onboard flash memory. All measurements are periodically offloaded over the network and persistently stored in a database. This environmental sensing application uses 40,824 bytes of ROM and 3,928 bytes of RAM, leaving 8,328 bytes of available ROM and 6,312 bytes of free RAM on the TelosB. From the previous section we know that a 50-neuron ESN using the tanhlike activation function has a ROM footprint of 6,788 bytes and a prediction time of 572 ms for each measurement. Thereby such an ESN complies

[Figure 6 panels: (a) Short, β=1; (b) Noise, w=100 and β=1]
Fig. 6. Two types of injected anomalies: (a) Short faults and (b) Noise faults.

[Figure 7 axes: (a) rel. humidity vs. sample no.; (b) °C vs. sample no.]
Fig. 7. Environmental sensing data sets. (a) Soil moisture and (b) Soil temperature.

with both the storage and computation constraints of the application and will be used for the remainder of this section.

Anomaly Types. We focus on two types of random anomalies: Short and Noise. These were defined in [15] and presented in Section 2. Samples of these faults can be seen in Figure 6. For brevity we only present these two random anomalies; for the detection of a systematic anomaly and further results we refer to our technical report [29]. For Short anomalies, we use two parameters to control their injection: the sample error rate and the amplification factor, β. For each anomalous measurement, m̃i, we multiply the standard deviation of the original signal, σ, by β to obtain m̃i = mi + βσ, where mi is the true measurement. For Noise anomalies, we use three parameters: the sample error rate, the period length, w, and the amplification factor, β. For each noisy period, we calculate the standard deviation of the underlying signal and multiply it by β to create a random normal distribution with zero mean and βσ standard deviation (i.e., N(0, βσ)). We then add samples from this distribution to each of the true measurements within that period.

Detection Algorithms. We use the two rule-based anomaly detection algorithms defined by [15] and summarized in Section 2 to detect the two anomalies mentioned above. We use these algorithms as a reference since they are directly related to the anomalies we inject and their complexity is comparable to that of currently deployed fault detection algorithms. Our strategy for setting the thresholds is to minimize the number of false positives when the detection algorithms are applied to data sets with no anomalies.

Data Sets. For each of the soil moisture and soil temperature modalities that we use, we obtain a training and a test data set from the experiment's database [1]. Each of the four data sets consists of 1,000 data points. Figure 7 illustrates two such data sets.
The data has been automatically sanitized by the database as a standard procedure for removing anomalies, following the methods proposed by [15]. By using this preprocessed data (instead of raw data) our results will not be biased by any anomalies already present in the data stream. Instead, we can assume that the only anomalies in the data are the ones we explicitly inject, thereby establishing the ground truth for evaluation purposes.

M. Chang, A. Terzis, and P. Bonnet

Fig. 8. Short rule, Noise rule, and ESN detection applied to the moisture data set. Panels: (a) Short rule/Short fault, (b) Noise rule/Noise fault, (c) Both rules/Both faults, (d) ESN/Short fault, (e) ESN/Noise fault, (f) ESN/Both faults. Each panel plots a percentage (0-100%) against the amplification factor β (1-5).

5.2 Results

Figure 5 illustrates the operation of the ESN anomaly detection algorithm by presenting the relation between the injected anomalies (Short β = 1; Noise β = 1 and w = 20), the temperature measurements (including the artificially added anomalies), the prediction error δ, and the detected anomalies. Notice that the prediction error is indeed an almost constant signal overlaid with large peaks coinciding with the injected faults. When no anomalies are injected, we find that NRMSD($\delta^{Temp}$) = 2.4% and NRMSD($\delta^{Moist}$) = 4.4% for the temperature and moisture data sets, respectively. This accuracy is of the same order of magnitude as the one [16] found when tracking gas measurements, meaning that our online implementation is indeed comparable to its offline counterpart.

We use a 5% sample error rate (i.e., 5% of the measurements are polluted with errors) for each fault type and a period w = 10 for Noise faults. The amplifications used for the evaluation are 1 ≤ β ≤ 5. Figure 8 compares the three algorithms, in the case of moisture data, when applied to Short faults, Noise faults, and a combination of both faults (5% Short and 5% Noise faults). We only apply each rule to its own domain fault, since this is the optimal scenario. The challenge of this data set is the similarity between the onset of rain events and Short faults. In order to avoid false positives, the thresholds must be set high enough to avoid triggering the Short rule during rain events.

In the left column, Figure 8(a,d), we compare the Short rule with the ESN detection when applied to Short faults. Not surprisingly, the Short rule performs well on this type of fault when β ≥ 3. However, for lower β values the Short rule cannot distinguish between rain events and faults, and detects none of the latter. The ESN is effective for β ≥ 2, but at the cost of more false positives at higher βs. In the middle column, Figure 8(b,e), we compare the Noise rule

Mote-Based Online Anomaly Detection Using Echo State Networks

Fig. 9. Short rule, Noise rule, and ESN detection applied to the temperature data set. Panels: (a) Short rule/Short fault, (b) Noise rule/Noise fault, (c) Both rules/Both faults, (d) ESN/Short fault, (e) ESN/Noise fault, (f) ESN/Both faults. Each panel plots a percentage (0-100%) against the amplification factor β (1-5).

with the ESN detection when applied to Noise faults. Interestingly, the Noise rule does not perform well on its corresponding faults. At β ≥ 3 we see the same trend as before with no false negatives; however, we also see a significant number of false positives. This behavior is caused by the aggressiveness of the Noise rule, which marks the entire window as faulty rather than individual points. For low β values we still see the ambiguity between events and faults, leading to no positive detections. The ESN detector, however, has no false positives, and a significantly lower number of false negatives for β ≤ 2. Finally, for higher β values the number of false negatives is also significantly smaller than the number of false positives of the rule-based algorithm.

Judging by these results, we conclude that the ESN can match the rule-based detectors. There is, however, a trade-off between false positives and false negatives, since decreasing one often leads to an increase in the other. Moreover, in a real deployment it is not possible to choose which algorithm to use on which faults; we must assume that all faults can appear at any time. In the right column, Figure 8(c,f), we thus compare a hybrid detector, using both the Short rule and the Noise rule at the same time, on a data set injected with both types of faults. We see that the hybrid detector exhibits the same behavior as the Noise rule, with either a high number of false negatives or a high number of false positives. The ESN detector, on the other hand, performs significantly better across all β values, illustrating the strength of the learning algorithm's ability to detect what is not normal.

Next, we perform the same analysis on the temperature data set, using the same parameters to inject errors. The challenge of this data set, from the perspective of a detection algorithm, is the high temperature variance, caused by the diurnal pattern, which resembles Noise faults.
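For reference, the NRMSD of the prediction error quoted in Section 5.2 can be computed in a few lines. This is our own sketch (the paper gives no code); we normalize the RMS prediction error by the range of the observed signal, one common convention, and flag anomalies by thresholding δ:

```python
import numpy as np

def nrmsd(predicted, observed):
    """Normalized root-mean-square deviation of the prediction error,
    expressed as a fraction of the observed signal's range."""
    predicted = np.asarray(predicted, dtype=float)
    observed = np.asarray(observed, dtype=float)
    rmsd = np.sqrt(np.mean((predicted - observed) ** 2))
    return rmsd / (np.max(observed) - np.min(observed))

def detect(predicted, observed, threshold):
    """Flag samples whose prediction error delta exceeds the threshold."""
    delta = np.abs(np.asarray(predicted) - np.asarray(observed))
    return delta > threshold
```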
As before, the Short rule and faults are in the left column, Figure 9(a,d), the Noise rule and faults in the middle column, Figure 9(b,e), and the hybrid detector on both types of faults in the right column, Figure 9(c,f). One can see that the overall accuracy improves significantly, with more faults being detected. Also note that the Noise rule generates a large number of false positives, supporting the claim that the diurnal temperature patterns in the data set can be misclassified as Noise faults. Again, when both rules are used simultaneously, false positives are the biggest drawback of the hybrid detector. The ESN detector, however, does not misclassify to the same extent, again clearly demonstrating its ability to distinguish between normal and anomalous data.

5.3 Discussion

We have shown that, for the modalities we tested, the ESN is capable of detecting low-amplitude anomalies better than specific rule-based anomaly detectors. At the same time, it is equally effective over multiple anomaly types, as it has the ability to detect a wide range of features deviating from the training set. There are, however, several factors that limit the applicability of ESNs. We identify three key issues:

(1) As we saw in Section 4.2, the prediction time for each iteration is on the order of seconds. For environmental monitoring, where changes happen on the scale of minutes, this prediction speed is acceptable. However, this technique might not be feasible for high data rate applications.

(2) For deployments in which no historical data are available, the training data will have to be constructed (e.g., from models, experience, etc.) or learned during the deployment. Neither option is desirable, because an artificial training set will lack the details encountered in the field.

(3) Because the ESN is an approximation function, its quality is highly dependent on the size of the dynamic reservoir (DR). In the case of soil moisture and temperature, a DR of 50 neurons suffices for anomaly detection. However, given a different set of constraints, the DR might not be large enough to encode the dynamics of the underlying modality.
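To make the dependence on reservoir size concrete, the per-sample work of an ESN predictor is dominated by the N×N reservoir update. The sketch below uses random, untrained weights purely for illustration (a deployed detector would use offline-trained weights); only the shapes and the O(N²) update step mirror the 50-neuron setup discussed above:

```python
import numpy as np

N = 50  # dynamic reservoir size; 50 neurons suffice for these modalities
rng = np.random.default_rng(7)
W_in = rng.uniform(-0.1, 0.1, size=(N, 1))       # input weights (illustrative)
W = rng.uniform(-1.0, 1.0, size=(N, N))          # reservoir weights
W *= 0.9 / np.max(np.abs(np.linalg.eigvals(W)))  # spectral radius < 1 (echo state property)
W_out = rng.uniform(-1.0, 1.0, size=(1, N))      # readout; normally trained by linear regression

def esn_step(state, u):
    """One prediction iteration: O(N^2) reservoir update, then linear readout."""
    state = np.tanh(W @ state + W_in * u)
    return state, (W_out @ state).item()

state = np.zeros((N, 1))
state, prediction = esn_step(state, 0.5)
```

Both the N×N weight matrix (memory) and the matrix-vector product (time) grow quadratically with the DR size, which is why reservoir size is the main knob trading accuracy against mote resources.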

6 Conclusion

This paper unifies fault and event detection in sensor networks under the general framework of anomaly detection. We show that online anomaly detection is feasible on mote-class devices by implementing an Echo State Network (ESN) on a TelosB mote. This network performs as well as a PC-based ESN of the same size, proving that it is feasible to implement sophisticated pattern recognition algorithms on motes. Indeed, the ESN is small and fast enough to function alongside an environmental monitoring application, detecting measurement anomalies in real time. Depending on the amplitude of the injected anomalies, the ESN provides equivalent or higher detection accuracy compared to rule-based detectors customized to specific faults. However, the most significant feature of the ESN detector is its generality, since it is capable of detecting all features not present in the training set.

In our future work we will explore the feasibility of implementing other machine learning techniques, such as Bayesian Networks, on mote-class devices and compare their performance to ESNs. With different methods available, the challenge becomes how to choose the best supervised learning method for mote-based online classification when given a particular training set from the domain scientists.

References

1. Musăloiu-E., R., Terzis, A., Szlavecz, K., Szalay, A., Cogan, J., Gray, J.: Life Under your Feet: A WSN for Soil Ecology. In: EmNets Workshop (May 2006)
2. Selavo, L., Wood, A., Cao, Q., Sookoor, T., Liu, H., Srinivasan, A., Wu, Y., Kang, W., Stankovic, J., Young, D., Porter, J.: LUSTER: Wireless Sensor Network for Environmental Research. In: ACM SenSys (November 2007)
3. Tolle, G., Polastre, J., Szewczyk, R., Turner, N., Tu, K., Buonadonna, P., Burgess, S., Gay, D., Hong, W., Dawson, T., Culler, D.: A Macroscope in the Redwoods. In: ACM SenSys (November 2005)
4. Gupchup, J., Sharma, A., Terzis, A., Burns, R., Szalay, A.: The Perils of Detecting Measurement Faults in Environmental Monitoring Networks. In: DCOSS (2008)
5. MANA: Monitoring remote environments with Autonomous sensor Network-based data Acquisition systems, http://mana.escience.dk/
6. Jaeger, H.: The echo state approach to analysing and training recurrent neural networks. Technical Report GMD Report 148, German National Research Center for Information Technology (2001)
7. Omitaomu, O.A., Fang, Y., Ganguly, A.R.: Anomaly detection from sensor data for real-time decisions. In: Sensor-KDD, Las Vegas, Nevada, USA (August 2008)
8. Wu, E., Liu, W., Chawla, S.: Spatio-temporal outlier detection in precipitation data. In: Sensor-KDD, Las Vegas, Nevada, USA (August 2008)
9. Kaplantzis, S., Shilton, A., Mani, N., Sekercioglu, A.: Detecting selective forwarding attacks in WSN using support vector machines. In: ISSNIP (2007)
10. Rashidi, P., Cook, D.J.: An adaptive sensor mining framework for pervasive computing applications. In: Sensor-KDD, Las Vegas, Nevada, USA (August 2008)
11. Römer, K.: Distributed mining of spatio-temporal event patterns in sensor networks. In: EAWMS at DCOSS (June 2006)
12. Werner-Allen, G., Lorincz, K., Johnson, J., Lees, J., Welsh, M.: Fidelity and yield in a volcano monitoring sensor network. In: OSDI (2006)
13. Pister, K.: Tracking vehicles with a UAV-delivered sensor network (March 2001), http://robotics.eecs.berkeley.edu/~pister/29Palms103/
14. Hu, W., Tran, V.N., Bulusu, N., Chou, C.T., Jha, S., Taylor, A.: The design and evaluation of a hybrid sensor network for cane-toad monitoring. In: IPSN (2005)
15. Sharma, A., Golubchik, L., Govindan, R.: On the Prevalence of Sensor Faults in Real-World Deployments. In: IEEE SECON (2007)
16. Obst, O., Wang, X.R., Prokopenko, M.: Using echo state networks for anomaly detection in underground coal mines. In: IPSN (April 2008)
17. Wang, X.R., Lizier, J.T., Obst, O., Prokopenko, M., Wang, P.: Spatiotemporal anomaly detection in gas monitoring sensor networks. In: Verdone, R. (ed.) EWSN 2008. LNCS, vol. 4913, pp. 90–105. Springer, Heidelberg (2008)
18. Cornou, C., Lundbye-Christensen, S.: Classifying sows' activity types from acceleration patterns. Applied Animal Behaviour Science 111(3-4), 262–273 (2008)
19. Bokareva, T., Bulusu, N., Jha, S.: Learning sensor data characteristics in unknown environments. In: IWASN (2006)
20. Chang, M., Terzis, A., Bonnet, P., http://www.diku.dk/~marcus/esn/
21. Bai, Z., Demmel, J., Dongarra, J., Ruhe, A., van der Vorst, H.: Templates for the Solution of Algebraic Eigenvalue Problems: A Practical Guide. SIAM, Philadelphia (2000)
22. Marra, S., Iachino, M., Morabito, F.: Tanh-like activation function implementation for high-performance digital neural systems. Research in Microelectronics and Electronics 2006, Ph.D., pp. 237–240 (June 2006)
23. Jaeger, H.: Tutorial on training recurrent neural networks, covering BPTT, RTRL, EKF and the echo state network approach. Technical Report GMD Report 159, German National Research Center for Information Technology (October 2002)
24. Polastre, J., Szewczyk, R., Culler, D.: Telos: Enabling Ultra-Low Power Wireless Research. In: IPSN/SPOTS (April 2005)
25. Moteiv Corporation: Tmote Sky, http://www.moteiv.com/
26. Jaeger, H.: Matlab toolbox for ESNs, http://www.faculty.jacobs-university.de/hjaeger/pubs/ESNtools.zip (Last checked: 2008-08-31)
27. Mackey, M.C., Glass, L.: Oscillation and Chaos in Physiological Control Systems. Science 197(287) (1977)
28. Wolfram Research, Inc.: Quantile-Quantile Plot, http://mathworld.wolfram.com/Quantile-QuantilePlot.html
29. Chang, M., Terzis, A., Bonnet, P.: Mote-based online anomaly detection using echo state networks. Technical report, U. Copenhagen (2009), http://www.diku.dk/OLD/publikationer/tekniske.rapporter/rapporter/09-01.pdf

Adaptive In-Network Processing for Bandwidth and Energy Constrained Mission-Oriented Multi-hop Wireless Networks

Sharanya Eswaran (1), Matthew Johnson (2), Archan Misra (3), and Thomas La Porta (1)

(1) Networking and Security Research Center, Pennsylvania State University
(2) The Graduate Center, City University of New York
(3) Advanced Technology Solutions, Telcordia Technologies

Abstract. In-network processing, involving operations such as filtering, compression and fusion, is widely used in sensor networks to reduce the communication overhead. In many tactical and stream-oriented wireless network applications, both link bandwidth and node energy are critically constrained resources, and in-network processing itself imposes non-negligible computing costs. In this work, we have developed a unified and distributed closed-loop control framework that computes both a) the optimal level of sensor stream compression performed by a forwarding node, and b) the best set of nodes where the stream processing operators should be deployed. Our framework extends the Network Utility Maximization (NUM) paradigm, where resource sharing among competing applications is modeled as a form of distributed utility maximization. We also show how our model can be adapted to more realistic cases, where in-network compression may be varied only discretely, and where a fusion operation cannot be fractionally distributed across multiple nodes.

1 Introduction

Many wireless sensor network (WSN) scenarios involve a set of long-running applications, operating over relatively low rates of discrete-event data, and are thus principally energy-constrained. Given that communication costs dominate computing costs [13] for relatively simple event-processing operations (such as averaging or finding the maximum of periodic temperature readings), in-network processing has been proposed as a means to increase the network operational lifetime by reducing the volume of data transmitted to the sink (e.g., [9]). In this approach, an application is modeled as a graph of stream operators, overlaid on the physical wireless network topology.

This research was sponsored by US Army Research Laboratory and the UK Ministry of Defence and was accomplished under Agreement Number W911NF-06-3-0001. The views and conclusions contained in this document are those of the authors and should not be interpreted as representing the official policies, either expressed or implied, of the US Army Research Laboratory, the U.S. Government, the UK Ministry of Defense, or the UK Government. The US and UK Governments are authorized to reproduce and distribute reprints for Government purposes notwithstanding any copyright notation hereon.

B. Krishnamachari et al. (Eds.): DCOSS 2009, LNCS 5516, pp. 87–102, 2009. © Springer-Verlag Berlin Heidelberg 2009

Our focus is on a slightly different stream-oriented wireless networking scenario, where several of these implicit assumptions do not hold. In particular, many military applications involve the use of a multi-hop wireless network (comprising non-sensor nodes) for transporting relatively high data-rate streams from a set of sophisticated sensor sources (e.g., video cameras, acoustic arrays and short-range radar feeds) for use by relatively shorter-duration tactical applications (often called missions). For such environments, bandwidth is a critical shared resource, and congestion control algorithms (e.g., [17]) must be employed to effectively share the wireless link bandwidth among the competing missions. Moreover, the in-network operators for such stream-oriented data typically comprise more sophisticated DSP-based operations (e.g., MPEG compression or wavelet coefficient computation), for which the computational cost cannot be ignored [5]. Accordingly, the application of in-network processing to such sensor-based streaming applications must consider both bandwidth and energy constraints, and recognize that the energy cost consists of both communication and computing overheads.

In the generalized model that we consider here, in-network processing may be viewed as a tuning knob, with higher levels of in-network processing (e.g., higher compression or coarser quantization) resulting in higher information loss for (or lower utility to) the application, but providing the benefit of reduced network bandwidth consumption.
This introduces a non-linear tradeoff in the energy costs: in general, higher levels of processing (e.g., more sophisticated compression techniques) lead to reduced transmission energy overheads, but a not-necessarily-proportional increase in the computational energy [14].

In this paper, we first introduce and develop a distributed, closed-loop control framework that computes the optimal level of compression performed by a forwarding node on sensor streams, taking into account both energy and bandwidth constraints. In particular, we extend the Network Utility Maximization (NUM) paradigm, pioneered in [1,2], to model resource sharing among competing sensor-based applications as a form of distributed utility maximization. Initially, the physical location of the stream operators is assumed to be pre-specified. Subsequently, the physical location of the operator graph components is treated as another decision variable, i.e., we enhance our optimization model to additionally determine the nodes where various in-network operations are performed. We shall show how our technique can capture more realistic scenarios where the quality of in-network processing may be varied only in discrete steps, and where an operator may be instantiated only on a single node. Simulation-based studies, using a packet-level protocol implementation of our algorithms, are then used to demonstrate how "adaptive operator placement" and "variable-quality in-network compression" can together result in a significant improvement (as much as 39%) in overall mission utilities.
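This non-linear tradeoff can be illustrated numerically. The coefficients below are hypothetical (chosen only to show the shape of the curve, not taken from the paper): transmission energy shrinks linearly with the compression factor l = output rate / input rate, while computation energy grows like (1/l - 1):

```python
# Hypothetical illustration: total energy per input bit as a function of the
# compression factor l. Coefficients E_TX and E_CPU are assumed values.
E_TX = 1.0    # assumed transmission energy per output bit
E_CPU = 0.05  # assumed computation energy coefficient

def total_energy(l):
    """Transmission cost falls linearly with l; computation cost grows like
    (1/l - 1), so aggressive compression (small l) is not always cheaper."""
    assert 0 < l <= 1
    return E_TX * l + E_CPU * (1.0 / l - 1.0)

energies = {l: total_energy(l) for l in (1.0, 0.5, 0.25, 0.1, 0.05)}
```

With these assumed coefficients, moderate compression lowers the total energy, but pushing l very low makes the computation term dominate, which is exactly the tradeoff the control framework must navigate.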


The rest of this paper is organized as follows. In Section 2, we explain the unique aspects of our problem; Section 3 briefly summarizes related work. Section 4 presents the mathematical model and protocol for the case where the location of the operators is specified a priori. Subsequently, Section 5 extends the solution to consider the problem of optimal placement of operators; Section 6 describes extensions to the base algorithms to incorporate the real-life integral constraints. In Section 7, we present our simulation results. Finally, Section 8 concludes the paper.

2 The General Framework of Variable-Quality In-Network Processing and Dynamic Operator Placement

We consider two logically distinct in-network processing operations, compression and fusion:

Compression: The downstream transmission rate of most stream-oriented data can be reduced by the application of appropriate compression algorithms, both lossless and lossy. For example, an MPEG-4 (or higher standards, such as MPEG-21) video stream can be compressed to varying data rates. Compression may be performed independently at every forwarding node; conceptually, compression changes the quality (rate) of the output data, but not the data type.

Fusion: In contrast to compression, fusion may be viewed as a process of either combining or correlating data from multiple separate streams (e.g., superimposition of audio and video feeds) and/or altering the 'type' of a single data stream. An example of 'type' alteration involves the processing of an audio stream to extract only the 'talk spurts' from the signal.

We thus define an operator graph as a set of fusion operators. An operator placement algorithm maps each of the nodes of the operator graph to a subset of the forwarding nodes in the network; compression may then be viewed as an implicit data reduction operator permitted at any of the physical nodes lying between two consecutive components of the 'logical' operator graph.

The problem of resource-aware in-network processing was studied in [12], where each individual operator was assumed to be immutable (each operator being characterized by a fixed ratio between its output and input data rates) and mapped to a pre-defined forwarding node. Separately, [5] considered the communication+computing cost constraint in the absence of in-network processing. An obvious extension of these frameworks is to allow the placement of the stream processing operators to be a decision variable as well.
Prior work on fusion operator placement (such as [6,7,8,9,10,11,12]) treats it as a stand-alone problem, where the objective is to place more selective operators closer to the data sources, without considering the interaction with variable data compression performed at intermediate nodes. Prior work (such as [5]) also assumes a relatively simple scalar relationship between both computational and communication energy overheads and the incoming stream data rate. Many compression algorithms are, however, characterized by a non-linear energy-vs-compressibility curve, with the energy required for compression increasing dramatically when the ratio of output to input data rates falls below a certain threshold [18].

Based on the above discussion, the key new aspects of our problem formulation can be summarized as follows:

1. We consider the impact of variable quality compression of sensor streams, potentially performed by all forwarding nodes, on the capacity constraints, and factor in the non-linear relationship between computational and communication energy overheads.
2. We also explicitly factor in the effect of such variable quality compression on the operator placement problem, and develop a solution that jointly selects both the location of fusion operators and the degree of compression that maximize cumulative system utility.

To solve this problem, we shall develop a NUM-based optimization framework and a fully-distributed protocol that seeks to jointly optimize the following free variables: i) Source Rate, x: the rate at which each sensor source transmits data; ii) Compression Factor, l: the level of compression, i.e., the ratio of output rate to incoming rate, taking place at each forwarding node; and iii) Operator Placement: the optimal node locations at which fusion operations take place.

3 Other Related Work

The classical NUM framework [1,2] was recently extended in [17] to a more general WSN environment, where individual missions derive their utility from a composite set of sensors, and intermediate nodes use link-layer multicast to forward sensor data downstream to multiple subscribing missions. In this WSN-centric model (referred to as WSN-NUM), the optimization problem is formulated as:

$$\text{maximize} \sum_{m \in M} U_m(X_m) \quad \text{subject to} \quad \sum_{\forall (k,s) \in q} \frac{x_s}{c_{k,s}} \leq 1, \ \forall q \in Q,$$

where q is one of the set (Q) of all maximal cliques in the conflict graph; $U_m(X_m)$ represents the utility function of mission m (M being the set of all missions) as a function of the vector of rates associated with the set of sensors S, and $c_{k,s}$ is the transmission rate used by node k during the link-layer broadcast of the data from sensor s. Based on this new model, a sensor (source) s adapts its rate as:

$$\frac{d}{dt} x_s(t) = \kappa \left( \sum_{m \in Miss(s)} w_{ms}(t) - x_s(t) \sum_{\forall q \in Path(s)} \sum_{\forall (k,s) \in q} \frac{\mu_q(t)}{c_{k,s}} \right) \qquad (1)$$

where $\mu_q(t)$ (the 'cost' per bit charged by each forwarding clique) is given as $\mu_q(t) = \left( \sum_{\forall (k,s) \in q} \frac{x_s(t)}{c_{k,s}} - 1 + \varepsilon \right)^+ / \varepsilon^2$. Each mission (we assume that all the streams for a single mission are destined to a single 'sink' node) adapts its 'willingness to pay' term $w_{ms}$ for sensor s based on the source rates and its


Table 1. Most Common Mathematical Symbols

M: Set of all missions
S: Total number of sources
set(m): Set of sources used by mission m
Miss(s): Set of missions using flow s (directly or fused)
Path(s): Multicast route for flow s (raw or fused) from its source to Miss(s)
$c_{k,s}$: Transmission rate at node k for flow s
$x_s^{rec}$: Received rate for flow s at a mission
(k, s): The transmission of flow s at node k
$x_{in}(s,k)$: Incoming rate at node k for flow s
$x_{out}(s,k)$: Outgoing rate for flow s from node k
$P_{max}^k$: Maximum power budget at node k
$P_{recv}^k$: Power consumed at node k by data reception
$P_{trans}^k$: Power consumed at node k by data transmission
$P_{comp}^k$: Power consumed at node k by data processing
$P_{tot}^k$: $P_{recv}^k + P_{trans}^k + P_{comp}^k$
$\alpha_{recv}^k, \alpha_{trans}^k, \alpha_{comp}^k$: Power consumed per bit of received, transmitted, and compressed data at node k
$l_{k,s}$: Compression factor at node k for flow s

own utility function $U_m(\cdot)$, according to $w_{ms}(t) = x_s(t) \frac{\partial U_m}{\partial x_s}$. The cost at each clique is cumulatively added along the forwarding nodes and piggy-backed with the data. The missions send this cost and willingness-to-pay as feedback to their sources. Each source uses this information to determine its rate for the next iteration, according to Eq. (1).

This notion of utility-based adaptation under in-network stream processing was first explored in [15] for wired networks, where each sensor flow is assumed to pass through an arbitrary processing graph, with each operator on the graph performing a fixed fractional reduction (or increase) in the output rate. In the absence of any constraints on the total power consumption at a node, the problem of optimal in-network processing and rate adaptation decomposes into the multi-rate multicasting problem. This problem was studied for multi-hop wireless networks in [16], where a back-pressure based solution was developed. A data gathering algorithm with tunable compression for communication and computation efficiency was developed in [18], but it did not consider the aspects of joint utilities, congestion control and operator placement.
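As an illustration of the WSN-NUM rate update in Eq. (1), here is a toy, centralized iteration for a single source, a single clique, and a log utility. The topology, gain, and constants are our own assumptions, not from [17]:

```python
C = 10.0      # assumed link transmission rate c_{k,s}
KAPPA = 0.1   # assumed adaptation gain
EPS = 0.1     # assumed tolerance margin

def mu(x):
    """Shadow cost of the single clique: ((x/C - 1 + eps)^+) / eps^2."""
    return max(x / C - 1.0 + EPS, 0.0) / EPS ** 2

def step(x):
    """One iteration of Eq. (1). For U(x) = log(x), the willingness-to-pay
    w = x * U'(x) is identically 1; the second term is the congestion cost."""
    w = 1.0
    return x + KAPPA * (w - x * mu(x) / C)

x = 1.0
for _ in range(1000):
    x = step(x)
# x settles just below the clique capacity C, where willingness-to-pay
# exactly balances the congestion cost
```

The fixed point satisfies $w = x\,\mu(x)/C$, so the rate converges to a point where the clique's air-time constraint is respected with a small tolerance margin.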

4 The Network Model and the Optimization Problem

We first explain the process by which nodes select the optimal level of stream compression, assuming that the positions of the components of the operator graphs are pre-specified.

4.1 Assumptions

Our formulation and solution make the following assumptions: (i) Each sensor's data flows over a pre-defined multicast tree to its set of subscribing sink nodes (each sink representing a mission). (ii) A fused stream cannot be subsequently disaggregated; accordingly, fusion of two streams at a node is possible only if all downstream subscribers (for each of the two sensors) require the same fused information. (iii) Each sensor's flow is completely elastic, i.e., each node can adjust its transmission rate $x_s$ by any arbitrary amount, as long as $x_s > 0$. (iv) The computational power required for compression increases as the compression factor (i.e., the ratio of transmitted rate to incoming rate) decreases. (v) A fusion or compression operation performed by an intermediate node is applied identically to the flow on each of the outgoing links.

4.2 The Model

Each mission's utility is modeled as a joint function of the rates that it receives from multiple sensors. The utility of a mission m is a function of the rate at which it receives data, denoted as $U_m(\{x_s^{rec}\}_{s \in set(m)})$, where $x_s^{rec}$ is the received rate of flow s and set(m) is the set of sensors that are sources for m. $U(\cdot)$ is assumed to be a jointly-concave function of the rates of all incoming flows. Table 1 lists the common mathematical symbols used in this paper.

The key feature of our model is to permit each intermediate node to perform a 'variable level of compression', denoted as $l_{k,s}$ (where $0 < l_{k,s} \leq 1$), that effectively alters the rate of a flow that is transmitted at node k and originated at source s. $l_{k,s}$ determines the ratio of the outgoing flow rate to the incoming flow rate for sensor s at node k, i.e., $l_{k,s} = \frac{x_{out}(s,k)}{x_{in}(s,k)}$. The variable compression level l effectively acts as a 'tuning knob', allowing a forwarding node to modify the outgoing data rate in a manner that balances its competing computational and communication energy costs, and satisfies the capacity constraints. Intuitively, a congested network benefits from more aggressive compression. Conversely, a network operating at low link utilization should have little need for compression, unless its transmission energy cost is too high.

The centralized model for this problem of utility maximization with adaptive in-network processing can be written as NUM-INP(U, C, P):

$$\text{maximize} \sum_{m \in M} U_m(\{x_s^{rec}\}_{s \in set(m)}) - \delta \sum_{\forall nodes\ k} P_{tot}^k, \quad \text{subject to} \qquad (2)$$

i) Capacity Constraint:

$$\sum_{\forall (k,i) \in q} \frac{x_{out}(i,k)}{c_{ki}} \leq 1, \quad \forall q \in \text{set of cliques } Q \qquad (3)$$

ii) Energy Constraint:

$$P_{tot}^k \leq P_{max}^k, \quad \forall \text{ nodes } k, \quad \text{where } P_{tot}^k = P_{recv}^k + P_{trans}^k + P_{comp}^k \qquad (4)$$

with $0 \leq \delta \leq 1$ and $x_s \geq 0 \ \forall s$.
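A small sketch of checking constraints (3) and (4) for a candidate operating point. The topology, rates, and coefficients below are assumed for illustration, not taken from the paper:

```python
# Toy instance: one maximal clique with two transmissions, one non-fusion node.
# Capacity (Eq. 3): sum of air-time fractions x_out/c must not exceed 1.
# Energy (Eq. 4): P_tot^k = P_recv + P_trans + P_comp must not exceed P_max^k.

def capacity_ok(flows):
    """flows: list of (x_out, c) pairs transmitted within one maximal clique."""
    return sum(x_out / c for x_out, c in flows) <= 1.0

def energy_ok(x_in, l, alpha_recv, alpha_trans, alpha_comp, p_max):
    """Single node, single flow, linear energy model of Section 4.2."""
    x_out = l * x_in
    p_tot = (alpha_recv * x_in
             + alpha_trans * x_out
             + alpha_comp * x_in * (1.0 / l - 1.0))   # compression cost term
    return p_tot <= p_max
```

With assumed coefficients, full-rate forwarding (l = 1) incurs no compression cost, while aggressive compression can blow the node's power budget even as it relieves the clique's capacity constraint, which is why the two constraints must be optimized jointly.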

The objective is to maximize the total utility of all missions, subject to an "energy" penalty function $\delta \sum_{\forall nodes\ k} P_{tot}^k$, which ensures a unique solution by creating a convex optimization objective. δ (between 0 and 1) determines the weight given to power consumption (vs. utility); in general, the penalty function can be the sum of any convex functions of $P_{tot}^k$. The capacity and energy constraints are explained as follows:

Capacity Constraint: The capacity constraint in Eq. (3) states that the total air-time fractions of all interfering transmissions (i.e., all transmissions in a maximal clique of the conflict graph) must not exceed unity. Please see [17] for further details.

Energy Constraint: The energy constraint in Eq. (4) states that the total power consumed at a node k due to data reception ($P_{recv}^k$), transmission ($P_{trans}^k$) and computation, including both compression and fusion if k is a fusion node ($P_{comp}^k$), must not exceed the maximum power budget at node k ($P_{max}^k$). As is common in the literature [3,4,5], we assume a linear energy model as follows:

$$P_{recv}^k = \alpha_{recv}^k \sum_{\forall flows\ s\ at\ k} x_{in}(s,k); \qquad P_{trans}^k = \alpha_{trans}^k \sum_{\forall flows\ s\ at\ k} x_{out}(s,k);$$

If k is not a fusion point:

$$P_{comp}^k = \alpha_{comp}^k \sum_{\forall flows\ s\ at\ k} x_{in}(s,k) \left( \frac{1}{l_{k,s}} - 1 \right),$$

where $0 < l_{k,s} \leq 1$. If k is a fusion point, there is an additional computational cost of $\alpha_{comp}^k \sum_{\forall flows\ f\ fused\ at\ k} \frac{x_{out}(f,k)}{l_{k,f}}$ incurred by the fusion process. Without loss of generality, we assume that this cost is proportional to the rate of the fused flow, and that the cost per bit is the same for compression and fusion.

4.3 Distributed Solution to the Optimization Problem

In order to solve this optimization problem in a distributed manner, we derive an iterative, gradient-based solution for the model shown in Eq. (2)-(4). We first make the problem unconstrained by taking the Lagrangian, as shown below:

maximize  Σ_{m∈M} U_m({x_s^rec}_{s∈set(m)}) − δ Σ_{∀nodes k} P_tot^k − Σ_{∀cliques q} μ_q (Σ_{∀(k,s)∈q} x_out(s, k)/c_ks − 1) − Σ_{∀nodes k} η_k (P_tot^k − P_max^k)

where μ_q and η_k are Lagrangian multipliers. Using the first-order necessary conditions for the gradients with respect to x_s and l_{k,s}, we get the following equations:

d/dt x_s(t) = κ_{x_s} ( Σ_{m∈Miss(s)} ∂U_m/∂x_s − Σ_{∀q∈Path(s)} μ_q Σ_{∀(k,s)∈q} (1/C_ks) ∂x_out(s, k)/∂x_s − Σ_{∀k∈Path(s)} (η_k + δ) ∂P_tot^k/∂x_s )   (5)

d/dt l_{k,i}(t) = κ_{l_{k,i}} ( Σ_{m∈Miss(i)} ∂U_m/∂l_{k,i} − Σ_{∀q∈Path(i)} μ_q Σ_{∀(v,i)∈q} (1/C_vi) ∂x_out(i, v)/∂l_{k,i} − Σ_{∀v∈Path(i)} (η_v + δ) ∂P_tot^v/∂l_{k,i} )   (6)

where μ_q is the shadow cost of congestion charged at each clique q and is given by μ_q(t) = (Σ_{∀(k,s)∈q} x_{k,s}(t)/c_ks − 1 + ε)^+ / δ_1. Similarly, η_k is the shadow cost of energy charged at each node k and is given by η_k(t) = (P_tot^k(t)/P_max^k − 1 + ε′)^+ / δ_2, where δ_1 and δ_2 are constants greater than 0, and ε and ε′ (0 ≤ ε, ε′ ≤ 1) determine the tolerance margin [17].

94    S. Eswaran et al.

Eq. (5) provides the algorithm by which the source sensors adjust their rates at each iteration; Eq. (6) shows how, at each node, the degree of compression for each flow that the node forwards is varied in each iteration. We observe the following: (i) Source rate x_s depends on the rates at which the downstream nodes forward either this source's flow directly (when there is no fusion), or any flow derived from this source's flow (when there is fusion). Similarly, it also depends on the power consumed at all downstream nodes that forward either the source's direct flow or a flow derived (via fusion) from this source. (ii) The compression levels at the forwarding nodes depend on the forwarding rates and power consumption at all downstream nodes that receive this flow (either raw or fused). When the source and forwarding rates are independently adjusted according to Eq. (5) and (6), the network converges to the optimal global utility, with penalties paid for congestion and power consumption. Please see [19] for proof.
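To make the update dynamics concrete, here is a minimal Python sketch of a primal-dual iteration in the spirit of Eq. (5) and the shadow-price definitions, for a toy instance with one source, one congested clique, and one energy-constrained relay. All constants, the utility U(x) = γ ln(1 + x), and the single-link topology are illustrative assumptions, not the paper's experimental setup:

```python
# Toy primal-dual iteration in the spirit of Eq. (5): one source rate x,
# one clique with capacity C, one relay with power budget P_MAX (all hypothetical).
GAMMA, C, P_MAX = 100.0, 50.0, 1.0e-3
ALPHA = 1.5e-6            # J/bit handled by the relay (assumed)
DELTA, EPS = 0.1, 0.01    # energy-penalty weight and tolerance margin
KAPPA, D1, D2 = 0.1, 1e-4, 1e-4

def step(x):
    mu = max(x / C - 1 + EPS, 0.0) / D1             # congestion shadow price
    p_tot = ALPHA * x
    eta = max(p_tot / P_MAX - 1 + EPS, 0.0) / D2    # energy shadow price
    marginal_u = GAMMA / (1.0 + x)                  # dU/dx for U = gamma*ln(1+x)
    # Eq. (5)-style update: utility gain minus congestion and energy costs.
    grad = marginal_u - mu / C - (eta + DELTA) * ALPHA
    return max(x + KAPPA * grad, 0.0)

x = 1.0
for _ in range(5000):
    x = step(x)
# x settles just below the clique capacity C, where the marginal utility
# balances the congestion shadow price.
```

The same pattern extends to Eq. (6): each forwarding node would run an analogous update on its compression factor, driven by the costs fed back from downstream.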

4.4  Protocol-Level Implementation of the NUM Algorithm

The biggest challenge in building a fully-distributed and localized protocol for this model arises from the presence of fusion operators at specific intermediate WSN nodes. The stream that a mission receives is now obtained by fusing one or more flows from set(m) according to a series of operators, as defined by the operator graph. An individual operator f can be viewed as a function that takes as input the rates of the flows to be fused, and gives as output the rate of the resulting fused flow. Hence, the utility of a mission m is a joint function of the rates x_i^rec, ∀i ∈ set of flows received at m, with some of these flows being 'raw' flows (potentially compressed) from the corresponding sensor, and other flows being 'derived', through the application of a fusion operator at intermediate nodes (which act as the 'source' for the derived flow). While Eq. (5) refers only to rate adjustment at the 'raw' sources (i.e., sensors), the flow i in Eq. (6) may refer to either a raw or a derived flow. Hence the distributed formulation in Eq. (5) and (6) is sufficient for deriving the optimal rates for both 'raw' and 'derived' flows. From a protocol perspective, however, the end-to-end feedback mechanism used in [17], whereby the sinks simply convey their willingness to pay directly to the source sensors, needs to be modified to reflect the inability of a sink to directly compute its 'willingness to pay' for a source flow that has passed through intermediate fusion points. For example, if a stream from source s is transformed

Fig. 1. Node A fuses flows r, s; transmits fused flow f
Fig. 2. Feedback messages received and propagated by A
Fig. 3. Computation of l_{A,f} according to Eq. (6)


twice by operators f and g before reaching a mission m, the mission is unable to compute its marginal utility ∂U_m/∂x_s, because all it knows is the rate of the stream of type "g • f", which contributes to its utility; it is unaware of both the source rate of s and the details of the fusion operations f and g. Here g • f refers to the composition function of the form g(f(x_s, ...)). The solution in this case is to use the "chain rule" for partial derivatives and compute ∂U_m/∂x_s as

∂U_m/∂x^rec_{g(f(x_s,...))} · ∂g(f(x_s, ...))/∂f(x_s, ...) · ∂f(x_s, ...)/∂x_s,

where the fusion points for g and f provide the second and third terms, respectively. Accordingly, in our NUM-INP protocol, the forward path carries only the data, but no meta-data. Nodes propagate the marginal utility, congestion cost and energy cost as meta-data in signaling messages carried on the reverse forwarding path; nodes use these feedback messages to compute the compression levels and the source rates for the next iteration, in addition to updating and propagating them upstream.

For each stream r that a mission m receives, it periodically sends a feedback message to the node that forwarded this stream. The feedback message consists of: i) a marginal utility (MU) field, where the mission enters its marginal utility with respect to the received flow rate (∂U/∂x_s^rec); this is used for computing the 'willingness-to-pay' according to the chain rule; ii) a 4-tuple consisting of the fields flow name (the ID of the 'flow'), rate information (RI) (the rate at which the mission receives the flow), power information (PI) (the energy cost attributed to this flow) and congestion information (CI) (the normalized congestion cost at all the cliques that this node belongs to). If an intermediate node is a branching point on the multicast forwarding tree, it collects the feedback from all its child nodes and combines them into a single feedback message. The cost fields are updated at each node in the reverse path, to compute the cumulative cost along the path, and the fusion points make additional modifications to capture the effect of the fusion operation (according to the chain rule).
For example, when a forwarding node A receives a feedback message for flow f from a downstream node, it adds its own energy cost for f to the PI field (i.e., PI = PI + (η_A + δ)P_tot^A(f)) and its own congestion cost for f to the CI field (i.e., CI = CI + Σ_{∀q:(A,f)∈q} μ_q x_out(f, A)/C_{A,f}) before passing the feedback message to its upstream neighbor. If A is also the fusion point where the fused flow f originates, then all the fields in the table are further multiplied by the term (l_{A,f}/x_out(f, A)) · x_in(s, A) · ∂f/∂x_in(s, A) before propagating the feedback upstream. Using the meta-data in the feedback message, the forwarding nodes and source nodes compute the compression levels and source rates for the next iteration, according to Eq. (5) and (6). Fig. (1)-(3) illustrate the propagation of feedback and the computation of the compression level for a simple example. In Fig. (2), v = r in the feedback to r and v = s in the one to s; p1 = (η_A + δ)P_tot^A(f), c1 = Σ_{∀q:(A,f)∈q} μ_q x_out(f, A)/C_{A,f}.
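The per-hop feedback update can be sketched as a simple message transformation. In this Python fragment, the Feedback dataclass and its field names are our own illustrative choices for the MU/RI/PI/CI fields described above; the update mirrors the PI and CI accumulation rules for a forwarding node A:

```python
from dataclasses import dataclass

@dataclass
class Feedback:
    flow: str   # flow name (ID)
    mu: float   # marginal utility (MU) of the mission w.r.t. the received rate
    ri: float   # rate information: rate at which the mission receives the flow
    pi: float   # power information: cumulative energy cost along the path
    ci: float   # congestion information: cumulative normalized congestion cost

def relay_update(fb, eta, delta, p_tot_f, clique_prices, x_out_f, cap_f):
    """Update performed by a forwarding node for flow f (cf. Section 4.4):
    PI += (eta + delta) * P_tot(f);  CI += sum_q mu_q * x_out(f)/C_f."""
    fb.pi += (eta + delta) * p_tot_f
    fb.ci += sum(mu_q * x_out_f / cap_f for mu_q in clique_prices)
    return fb
```

A fusion point would apply the additional chain-rule scaling to all fields before forwarding; the sketch above covers only the plain relay case.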

5  Adaptive Operator Placement

In the previous section, we assumed that the locations of the fusion operators are fixed and given a priori. In this section, we describe how the NUM-INP framework can be enhanced to additionally determine the optimal placement of the fusion operators. Ideally, the communication cost is lowest if a fusion operation takes place as close to the sources as possible. However, due to energy constraints, nodes closer to the source may not be able to perform the fusion operation; in such situations, higher utility may be obtained by pushing the operator to a node downstream. Our approach is to integrate operator placement into the NUM framework (in parallel to source rate adaptation and adaptive compression quality), albeit as an "outer" optimization loop that occurs at a slower time-scale. With the help of an operator graph, the forwarding trees and the mission subscription information, the nodes in a network can determine if they are candidate locations for a fusion operator. For example, for the simplistic network shown in Fig. (4), where mission M requires the fused flow f(x_s1, x_s2), the fusion can take place at node A, B or C. We assume that each node runs a preliminary protocol (the details of which are not relevant to this work) to determine which fusion operations can be performed at that node. We also assume that the fusion operations can be expressed as functions of the rates of their input flows. Our approach is to allow all candidate locations to perform fusion on an arbitrary fraction of the input streams, and transmit the rest as raw streams. This fraction is variable, is adjusted iteratively in a NUM-based control loop, and converges to the optimal value. Let k be a representative candidate node for the fusion operation f(x_s1, x_s2, x_s3, ..., x_sn) that fuses flows F = {s1, s2, s3, ..., sn}. Let θ^k_{f,si} (where si ∈ F) be the fraction (lying between 0 and 1) of the input flow si that is fused at node k.

The rest of the input flow is passed on downstream, where the next candidate node fuses all or a fraction of it, and so on. The mission sink is always a candidate for all fusion operators, and can absorb any residual "unfused" stream data.

Fig. 4. Example network

For the example shown in Fig. (4), node A fuses according to f(θ^A_{f,s1} x_s1, θ^A_{f,s2} x_s2) and forwards the input flows s1, s2 and the fused flow f at rates l_{A,s1}(1 − θ^A_{f,s1})x_s1, l_{A,s2}(1 − θ^A_{f,s2})x_s2 and l_{A,f} f(θ^A_{f,s1} x_s1, θ^A_{f,s2} x_s2), respectively, where l_{k,s} refers to the compression factor for flow s at node k. Subsequently, node B forwards the input flows at rate l_{B,s} l_{A,s} (1 − θ^A_{f,s})(1 − θ^B_{f,s}) x_s, where s ∈ {s1, s2}, along with flow f^A (i.e., the flow fused at A) compressed at l_{B,f}. It also forwards the new 'sub-flow' f^B fused at B at rate l_{B,f} f(l_{A,s1}(1 − θ^A_{f,s1})θ^B_{f,s1} x_s1, l_{A,s2}(1 − θ^A_{f,s2})θ^B_{f,s2} x_s2). If the optimal value of θ after convergence is 1 at a node, then that node is the unique optimal location for fusion. It is also possible that the optimal configuration is for multiple nodes to share the responsibility of fusion (i.e., two or more of the candidate nodes will have 0 < θ < 1). Such 'fractional fusion' can


be interpreted as a process of "time-sharing" the responsibility of fusion across the candidate nodes. The generic model in Eq. (2)-(4) holds for this problem too; the source rates and compression factors continue to be adjusted according to Eq. (5) and Eq. (6), respectively. By taking the Lagrangian of the "θ-enhanced" NUM objective, we derive the θ-adjustment algorithm for a fusion operation op to be:

d/dt θ^k_{op,s} = κ_{θ^k_{op,s}} ( Σ_{m∈Miss(s)} ∂U_m/∂θ^k_{op,s} − Σ_{∀q∈Path(s)} μ_q Σ_{∀(v,s)∈q} (1/C_vs) ∂x_out(s, v)/∂θ^k_{op,s} − Σ_{∀v∈Path(s)} (η_v + δ) ∂P_tot^v/∂θ^k_{op,s} )   (7)

We observe from Eq. (7) that the θ's at candidate fusion points depend on the forwarding rates and power consumption at all downstream nodes that receive the flows, either directly or after fusion, from this node. It must be noted that in this problem, the values of x_in and x_out, as well as the nodes in the sets Path(i), must now be computed depending on the values of the θ's. We prove in [19] that this algorithm converges to the optimal solution.
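The rate bookkeeping implied by fractional fusion (the Fig. 4 example above) can be sketched directly. In this Python fragment, the fusion operator, the input rates, and the θ and compression values are all hypothetical; each candidate node fuses a fraction θ of each input flow and forwards the remainder raw:

```python
def candidate_forward(x_in, theta, l_raw, l_fused, fuse):
    """x_in: dict flow -> incoming raw rate; theta: dict flow -> fraction fused here;
    l_raw: dict flow -> compression factor for the raw remainder; l_fused: compression
    factor for the new sub-flow; fuse: operator mapping fused input rates to the fused
    output rate. Returns (raw rates passed downstream, rate of the new sub-flow)."""
    fused_inputs = {s: theta[s] * x_in[s] for s in x_in}
    raw_out = {s: l_raw[s] * (1 - theta[s]) * x_in[s] for s in x_in}
    subflow_rate = l_fused * fuse(fused_inputs)
    return raw_out, subflow_rate

# Hypothetical operator: the fused rate is half the sum of its input rates.
fuse = lambda rates: 0.5 * sum(rates.values())

# Node A fuses 40% of each input; node B (further downstream) fuses the rest.
raw_a, f_a = candidate_forward({'s1': 100.0, 's2': 80.0},
                               {'s1': 0.4, 's2': 0.4},
                               {'s1': 0.9, 's2': 0.9}, 0.8, fuse)
raw_b, f_b = candidate_forward(raw_a, {'s1': 1.0, 's2': 1.0},
                               {'s1': 1.0, 's2': 1.0}, 0.8, fuse)
```

In the full protocol, the θ values themselves would be adjusted each iteration by Eq. (7); this sketch only shows how a given θ assignment determines the forwarded rates.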

5.1  Protocol-Level Modifications for Operator Placement

The introduction of adaptive operator placement requires modifications to the signaling mechanism along the reverse forwarding path. This is because a mission subscribing to a fused flow now receives multiple 'sub-flows', each fused at a different candidate location, along with the original flows (to be fused directly at the mission). Hence, the feedback message now consists of a table of 4-tuples, called the Feedback Information Table (FIT), instead of a single entry. The fields in the 4-tuple remain the same as described in Section 4.4, and there is an entry (row) in the FIT corresponding to each sub-flow received at the mission. The nodes along the reverse-forwarding path update the cost information for each of the sub-flows, and the fusion point for each sub-flow is responsible for augmenting the meta-data with the chain-rule information. In order to reduce the signaling overhead, we maintain a special row in the FIT, called the cumulative entry, for each original flow (i.e., each input to the fusion operation); at each candidate fusion point, the meta-data in the row corresponding to its sub-flow is added to the cumulative entries and the row is removed. Thus, as the feedback message propagates upwards, the FIT reduces in size, with all its entries eventually collapsing into the cumulative rows.

In the example network of Fig. (4), mission m receives flows fused at A, B, C and also the raw streams s1 and s2 (if the fusion points do not fuse all the data). Hence, m sends feedback to C with marginal utility ∂U_m/∂(x_{f^A} + x_{f^B} + x_{f^C} + f(x_s1, x_s2)) (where x_{f^k} refers to the rate of the flow of type f that is fused at node k), and a FIT with five rows, corresponding to s1, s2, f^A, f^B and f^C. When C receives this message, it does the following: (i) updates the congestion and energy cost for all the sub-flows, (ii) adds the rate and cost information for f^C to the corresponding fields in the cumulative entry and (iii) removes row f^C. Subsequently, nodes B and A update the message in a similar fashion, such that the feedback that


arrives at source s1 consists of only two rows in the FIT: s1 and cumulative_s1 (and similarly for s2). Please see [19] for a more detailed example. The forwarding nodes use the feedback message to compute the θ and compression values for the next iteration, and the source nodes compute the new flow rates. The pseudo-code for this adaptation process is given in [19]. We note that only a minimal amount of information is signaled, and the algorithms have been devised such that Eq. (5, 6, 7) can be computed precisely from just this minimal meta-data and locally available information.
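The FIT collapse described above amounts to a simple table reduction. In this Python sketch (the table layout, row names, and numbers are illustrative assumptions), each candidate fusion point folds the row for its own sub-flow into the cumulative entry and deletes it, so the table shrinks as the feedback moves upstream:

```python
def collapse_at(fit, node, cumulative_key='cum'):
    """fit: dict row-name -> dict of RI/PI/CI fields. A fusion candidate `node`
    folds the row of the sub-flow it fused (named 'f@<node>' here) into the
    cumulative entry, then removes that row."""
    row = fit.pop('f@' + node, None)
    if row is not None:
        for field in ('ri', 'pi', 'ci'):
            fit[cumulative_key][field] += row[field]
    return fit

# A mission's FIT for input s1: rows for the raw flow, the sub-flows fused at
# C, B and A, and the cumulative entry (all numbers hypothetical).
fit = {
    's1':  {'ri': 5.0, 'pi': 0.1, 'ci': 0.01},
    'f@C': {'ri': 2.0, 'pi': 0.2, 'ci': 0.02},
    'f@B': {'ri': 3.0, 'pi': 0.3, 'ci': 0.03},
    'f@A': {'ri': 4.0, 'pi': 0.4, 'ci': 0.04},
    'cum': {'ri': 0.0, 'pi': 0.0, 'ci': 0.0},
}
for hop in ('C', 'B', 'A'):          # reverse forwarding path
    fit = collapse_at(fit, hop)
# Only the raw-flow row and the cumulative row survive at the source.
```

This mirrors why the feedback arriving at source s1 carries just two rows: s1 and cumulative_s1.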

6  NUM Modifications to Address Practical Constraints

For mathematical tractability, the NUM-based technique for "optimal" variable in-network compression and operator placement requires both these processes to be represented as continuous variables. These assumptions are likely to be violated in practice. We now describe how the NUM algorithm can be modified to address both these practical limitations.

Discrete Compression Levels: Most of the commonly used compression techniques provide multiple, but discrete, compression levels. For instance, gzip provides 9 levels of compression, JPEG allows a range of 0 to 100 levels, and MP3 allows compression ratios ranging from 12:1 to 10:1. The discontinuity arising from such integral choices prevents the direct application of NUM's gradient search techniques and, in fact, makes the problem NP-hard [19]. Our NUM-based heuristic is to run the protocols using a continuous compression model, but simply map the computed l_{k,s} value to the nearest valid discrete compression level at each iteration.

Solitary Operator Location: Our theoretical model assumes that a particular fusion operator may be "split" (in different fractions) across multiple nodes. In practice, many operators may not be conducive to such fractional splitting over infinitesimal time-scales. In such cases, our heuristic solution is to assign the responsibility for fusion to the node with the "largest θ". A heuristic-based approach is required because the problem of determining the best single location for a fusion operator is an NP-hard combinatorial problem as well [19]. The selection of this single fusion point may be performed at each iteration of the NUM θ-loop (Eq. (7)). To achieve this, the highest cumulative θ value of downstream nodes is also propagated up the reverse forwarding path; the most upstream node among the fusion candidates can then designate the node with the most fusion responsibility as the sole fusion point. However, to ensure rapid convergence, the other terms (in the Feedback Information Table) carried in the signaling messages are based on the use of the 'virtual' continuous-θ values.
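Both heuristics amount to a projection step applied after each NUM iteration. A minimal Python sketch (the discrete level set and the θ vector below are made-up examples, not the paper's configuration):

```python
def snap_to_level(l, levels=(0.25, 0.5, 0.75, 1.0)):
    """Map a continuous compression factor to the nearest valid discrete level."""
    return min(levels, key=lambda v: abs(v - l))

def solitary_fusion_point(theta):
    """Pick a single fusion location: the candidate node with the largest theta."""
    return max(theta, key=theta.get)
```

For example, `snap_to_level(0.62)` projects onto 0.5, and `solitary_fusion_point({'A': 0.2, 'B': 0.7, 'C': 0.1})` designates B as the sole fusion point.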

7  Evaluation

In this section, we evaluate the performance of the NUM-INP protocol based on a packet-level simulation of an 802.11-based multi-hop wireless network, using the discrete-event simulator Qualnet [20]. The values of α_recv^k, α_trans^k and


α_comp^k are taken as 0.75 μJ/bit, 0.6 μJ/bit and 0.54 μJ/bit, respectively, based on the data from [14].

Utility Gain Due to In-network Processing: Fig. (5) illustrates the rates obtained from adaptive in-network compression on a sample simulated topology, where the flows from sources 1 and 2 are fused at node 3 and the fused flow is forwarded to missions A-H. The compression factor and transmission rate at each node, and the rate at which each mission receives data (x^rec), are shown in the figure. The utility of a mission is of the form γ ln(1 + x^rec). For missions A and B, γ = 100; for missions C and D, γ = 20; for missions E and F, γ = 1; for missions G and H, γ = 0.25. As illustrated, in our model, missions that have higher utility receive the fused flow at a higher data rate. On the contrary, if there is no in-network compression, then all the missions receive at a uniform rate of 11.57 kbps. The values shown within parentheses are the compression factors and rates when only four discrete compression levels (0.25, 0.5, 0.75 and 1.0) are allowed. We observe that the rates with discrete compression are fairly close to the optimal values that can be achieved when the compression is a continuous-valued variable.

Fig. (6) compares the utilities of a network under three cases: a) with only source rate adaptation (according to WSN-NUM) but no in-network compression, b) optimal variable-quality compression with pre-specified fusion locations and c) with joint optimization of compression and operator placement. The simulated network consists of 100 nodes in a random topology in a 1500m x 1500m field. There are 25 missions, 25 sources and 15 fusion operations, whose initial locations are picked randomly from the sets of candidate locations (given by the operator task graph). We can see that with NUM-INP, the global utility of the network is higher (by about 30%); the joint optimization of the operator locations results in a further 18% gain in system utility.

Performance Scalability: Fig. (7) shows the percentage gain in utility achieved by the NUM-INP protocol, compared to simple source rate adaptation (WSN-NUM), when the number of missions and sources in the network is varied. We see that the gain increases with an increase in the number of competing missions and sensor sources. We experimented with different topologies and observed similar results in all cases.

Fig. 5. Illustration of adaptive in-network compression with continuous and discrete levels

The relative gain with in-network processing is higher when the number of missions is larger; adaptive in-network compression and fusion helps to alleviate congestion bottlenecks, while adhering to the energy consumption constraints. We also tested the signaling overhead for different numbers


of candidate nodes and fusion operations, and the overhead was very low, on the order of tens of bytes per second.

NUM-INP under "Realistic Constraints": We study the impact of discrete compression levels by computing the loss in overall utility as a function of the number of discrete compression levels permitted. We map a compression factor value to a particular level, depending on how many levels are available. For example, when 10 levels of compression are allowed, we let level 1 = 0.1, level 2 = 0.2, and so on. Fig. (8) plots the system utility (normalized over the optimal utility with continuous compressibility). We see that the utility is at least 95% of the optimal for 10 or more discrete levels, but drops rapidly if the number of distinct compression levels is very small.

Fig. (9) shows the normalized utility as a function of the number of fusion operators, when partial fusion is prohibited and fusion occurs at a solitary node (as described in Section 6). For each fusion operator, the number of candidate nodes was randomly chosen to be between 2 and 10. We see that the utility remains close to the optimal even as the number of in-network fusion operations is increased, with at most a 5% loss in system utility. By comparing this result to Fig. (6), where adaptive operator placement offers an additional 18% gain in utility, we see that joint optimization of compression and operator placement is beneficial, even if fractional operator placement is not permitted.

Fig. 6. Impact of in-network processing

Fig. 7. Impact of number of missions and sources
Fig. 8. Impact of discrete compression levels
Fig. 9. Impact of single node fusion

8  Conclusion

In this work, we have developed a utility-based protocol for adaptive in-network processing, for wireless networks with streaming sensor sources, which maximizes


the sum of mission utilities by jointly optimizing the source data rate, the degree of stream compression and the location of fusion operators. Our protocol can achieve up to 39% higher utility than pure source-rate adaptation, with only modest signaling overhead. In ongoing work, we are extending this framework to dynamically modify the level of in-network processing, taking network lifetime objectives into account.

References

1. Kelly, F.P., Maulloo, A.K., Tan, D.K.H.: Rate control for communication networks: shadow prices, proportional fairness and stability. JORS 49, 237–252 (1998)
2. Low, S.H., Lapsley, D.E.: Optimization flow control, I: Basic algorithm and convergence. IEEE/ACM ToN 7, 861–874 (1999)
3. Feeney, L.M., Nilsson, M.: Investigating the energy consumption of a wireless network interface in an ad hoc networking environment. In: Proc. of IEEE INFOCOM (April 2001)
4. Hou, Y.T., Shi, Y., Sherali, H.D.: Rate allocation in wireless sensor networks with network lifetime requirement. In: Proc. of ACM MobiHoc (May 2004)
5. Zhang, C., Kurose, J., Liu, Y., Towsley, D., Zink, M.: A distributed algorithm for joint sensing and routing in wireless networks with non-steerable directional antennas. In: Proc. of ICNP 2006 (2006)
6. Madden, S., Franklin, M., Hellerstein, J., Hong, W.: TAG: A tiny aggregation service for ad hoc sensor networks. In: ACM SIGOPS Operating Systems Rev., December 2002, pp. 131–146 (2002)
7. Bonfils, B., Bonnet, P.: Adaptive and decentralized operator placement for in-network query processing. In: Zhao, F., Guibas, L.J. (eds.) IPSN 2003. LNCS, vol. 2634, pp. 47–62. Springer, Heidelberg (2003)
8. Ahmad, Y., Cetintemel, U.: Network-aware query processing for stream-based applications. In: Proc. of VLDB 2004 (2004)
9. Srivastava, U., Munagala, K., Widom, J.: Operator placement for in-network stream query processing. In: Proc. of PODS 2005 (2005)
10. Pietzuch, P., Ledlie, J., Shneidman, J., Roussopoulos, M., Welsh, M., Seltzer, M.: Network-aware operator placement for stream-processing systems. In: Proc. of ICDE (2006)
11. Abrams, Z., Liu, J.: Greedy is good: On service tree placement for in-network stream processing. In: Proc. of ICDCS 2006 (2006)
12. Ying, L., Liu, Z., Towsley, D., Xia, C.: Distributed operator placement and data caching in large-scale sensor networks. In: Proc. of INFOCOM 2008, Phoenix, AZ (2008)
13. Sadler, C.M., Martonosi, M.: Data compression algorithms for energy-constrained devices in delay tolerant networks. In: Proc. of ACM SenSys, pp. 265–278 (2006)
14. Barr, K.C., Asanović, K.: Energy-aware lossless data compression. ACM TOCS 24(3), 250–291 (2006)
15. Xia, C., Towsley, D., Zhang, C.: Distributed resource management and admission control of stream processing systems with max utility. In: Proc. of the ICDCS, June 2007, pp. 68–75 (2007)


16. Bui, L., Srikant, R., Stolyar, A.L.: Optimal resource allocation for multicast flows in multihop wireless networks. In: Proc. of IEEE CDC (December 2007)
17. Eswaran, S., Misra, A., La Porta, T.: Utility-based adaptation in mission-oriented wireless sensor networks. In: Proc. of IEEE SECON (June 2008)
18. Yu, Y., Krishnamachari, B., Prasanna, V.K.: Data gathering with tunable compression in sensor networks. IEEE TPDS 19(2), 276–287 (2008)
19. Eswaran, S., Misra, A., La Porta, T.F.: Adaptive in-network processing for bandwidth and energy constrained mission-oriented wireless sensor networks. Technical Report, Dept. of CSE, Pennsylvania State University (October 2008)
20. Qualnet, http://www.qualnet.com

LazySync: A New Synchronization Scheme for Distributed Simulation of Sensor Networks

Zhong-Yi Jin and Rajesh Gupta
Department of Computer Science and Engineering
University of California, San Diego
{zhjin,rgupta}@cs.ucsd.edu

Abstract. To meet the demands for high simulation fidelity and speed, parallel and distributed simulation techniques are widely used in building wireless sensor network simulators. However, accurate simulations of dynamic interactions of sensor network applications incur large synchronization overheads and severely limit the performance of existing distributed simulators. In this paper, we present LazySync, a novel conservative synchronization scheme that can significantly reduce such overheads by minimizing the number of clock synchronizations during simulations. We implement and evaluate this scheme in a cycle accurate distributed simulation framework that we developed based on Avrora, a popular parallel sensor network simulator. In our experiments, the scheme achieves a speedup of 4% to 53% in simulating single-hop sensor networks with 8 to 256 nodes and 4% to 118% in simulating multi-hop sensor networks with 16 to 256 nodes. The experiments also demonstrate that the speedups can be significantly larger as the scheme scales with both the number of packet transmissions and sensor network size.

1  Introduction

Accurate simulation is critical to the design, implementation and evaluation of wireless sensor networks (WSNs). Numerous WSN simulators have been developed based on event-driven simulation techniques [1], and the fidelities of WSN simulators are rapidly increasing with the use of high-fidelity simulation models [2,3,4,5]. In event-driven simulations, fidelity represents the bit and temporal accuracy of events and actions. Due to the need for processing a large number of events, high simulation fidelity often leads to slow simulation speed [6], which is defined as the ratio of simulation time to wallclock time. Simulation time is the virtual clock time in the simulated models [7], while wallclock time corresponds to the actual physical time used in running the simulation program. A simulation speed of 1 indicates that the simulated sensor nodes advance at the same rate as real sensor nodes, and this type of simulation is called real-time simulation. Typically, real-time speed is required to use simulations for interactive tasks such as debugging and testing. To meet the demands for high simulation fidelity and speed, most of the latest WSN simulators are based on parallel and distributed simulation techniques

B. Krishnamachari et al. (Eds.): DCOSS 2009, LNCS 5516, pp. 103–116, 2009.
© Springer-Verlag Berlin Heidelberg 2009

104    Z. Jin and R. Gupta

Fig. 1. The progress of simulating in parallel a wireless sensor network with two nodes that are in direct communication range of each other on 2 processors. (Axes: wallclock time T_W vs. simulation time T_S; Node B reaches its channel-read point at T_S1.)

[8,9,10,11]. WSN simulators can be broadly divided into two types: sequential simulators and parallel/distributed simulators. Sequential simulators simulate all the sensor nodes of a WSN in sequence on a single processor and therefore cannot benefit from running on a multi-core processor or on multiple processors. Parallel and distributed simulators seek to improve simulation speed by simulating different sensor nodes on different cores or processors in parallel. A problem with existing distributed simulation techniques is the large overhead in synchronizing sensor nodes during simulations [10,6]. When sensor nodes are simulated in parallel, their simulation speeds can vary due to differences in the simulated nodes, such as different sensor node programs, sensor node inputs and values of random variables, as well as differences in the simulation environments, such as different processor speeds and operating system scheduling policies. Because of this, simulated sensor nodes need to synchronize with each other frequently to preserve the causality of events and ensure correct simulation results [9,10,6]. For example, as shown in Fig. 1, two nodes in direct communication range of each other are simulated in parallel on two processors. After T_W0 seconds of simulation, Node B has been simulated faster than Node A, as indicated by the fact that the simulation time of Node B (T_S1) is greater than the simulation time of Node A at T_W0. At T_S1, Node B is supposed to read the wireless channel and see if there is an incoming transmission from Node A. However, after reaching T_S1 at T_W0, Node B cannot advance any further, because at T_W0 Node B does not know whether Node A is going to transmit at T_S1 or not. In other words, Node B cannot be simulated any further than T_S1 until Node A reaches T_S1. There are two general approaches to handle cases like this: conservative [1] or optimistic [12].
The conservative approach works by ensuring no causality relationships among events are violated while the optimistic approach works by detecting and correcting any violations of causal relationships [6]. To the best of our knowledge, almost all distributed WSN simulators are based on the conservative approach as it is simpler to implement


and has a lower memory footprint. With the conservative approach, simulated sensor nodes need to synchronize their clocks and coordinate their simulation orders frequently. These tasks introduce communication overhead and management overhead to distributed simulations respectively [13]. As described in [10,6], the performance gains of existing distributed WSN simulators are often compromised by the rising overheads due to inter-node synchronizations. In this paper, we propose LazySync, a novel conservative synchronization scheme that can significantly reduce the number of clock synchronizations in parallel and distributed simulations of WSNs. We validate our approaches by their implementations in PolarLite, a cycle accurate distributed simulation framework that builds upon Avrora and serves as the underlying simulation engine [6]. We discuss related work in Sect. 2. The LazySync scheme is presented in Sect. 3 and its implementations are described in Sect. 4. In Sect. 5 we present the results of our experiments followed by the conclusion and future work in Sect. 6.
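The conservative rule illustrated by Fig. 1 can be stated as a simple clock test: a node may only advance to a point where it is certain that no neighbor can still send it an earlier message. A minimal Python sketch (the clock bookkeeping and lookahead value are illustrative, not Avrora/PolarLite internals):

```python
def safe_advance_limit(neighbor_clocks, lookahead):
    """A node may safely simulate up to min(neighbor clock) + lookahead:
    no neighbor can affect it before that simulation time."""
    return min(neighbor_clocks) + lookahead

def must_wait(own_clock, neighbor_clocks, lookahead):
    """Like Node B in Fig. 1: once a node reaches its safe limit, it must
    suspend until its slowest neighbor catches up."""
    return own_clock >= safe_advance_limit(neighbor_clocks, lookahead)
```

For instance, a node at clock 100 whose slowest neighbor is at clock 60 with a lookahead of 40 is exactly at its safe limit and must block; with its own clock at 90 it could still advance.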

2  Related Work

Prior work on improving the speed and scalability of WSN simulators can be divided into two categories. The first category of work can be applied to both sequential and distributed simulators. It focuses on reducing the computational demands of individual simulation models without significantly lowering fidelity. For example, since emulating a sensor node processor is computationally expensive [4], TimeTOSSIM [5] automatically instruments applications at the source code level with cycle counts and compiles the instrumented code into the native instructions of the simulation computers for fast execution. This significantly increases simulation speed while achieving a cycle accuracy of up to 99%. However, maintaining cycle counts also slows down TimeTOSSIM to about 1/10 the speed of the non-cycle-accurate TOSSIM [2], on which TimeTOSSIM is based. Our work can make this type of effort scalable on multiple processors/cores. The second category of work focuses on distributed simulators only. There is a large body of research on improving the speed and scalability of distributed discrete event driven simulators in general. Among them, the use of lookahead time [1] is a commonly used conservative approach [14,15]. Our previous work [6,13] follows this direction. In [6], we describe a technique that monitors the duty cycling of sensor nodes and uses the detected sensor node sleep time to reduce the number of synchronizations. As demonstrated in the paper, using the node-sleep-time based technique can significantly increase the speed and scalability of distributed WSN simulators. However, the technique is only able to exploit the time when both the processor and radio of a sensor node are off for speedup, because it is not possible to predict the exact radio wakeup time when the processor is running. In [13], we develop new techniques to address the limitation of [6].
By exploiting the radio wakeup latency and MAC backoff time, the new techniques are effective even when the processor or radio of a node is active.
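TimeTOSSIM's source-level instrumentation can be illustrated with a toy sketch; the per-block cycle costs below are invented for illustration (TimeTOSSIM derives the real costs from the target AVR code):

```python
# Toy illustration of source-level cycle-count instrumentation: each basic
# block updates a global cycle counter by the (assumed) number of target-CPU
# cycles the block would consume, so natively compiled code still tracks
# simulated time.
cycles = 0

def instrumented_toggle(led_on):
    """Toggle an LED; the `cycles` updates are the inserted instrumentation."""
    global cycles
    cycles += 4              # assumed cost of the test-and-branch block
    if led_on:
        cycles += 7          # assumed cost of the turn-off block
        return False
    cycles += 7              # assumed cost of the turn-on block
    return True

state = instrumented_toggle(False)   # state becomes True
state = instrumented_toggle(state)   # state becomes False
```

Because the counter updates are compiled along with the application, the simulator pays only a modest overhead per basic block instead of emulating every target instruction.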

106

Z. Jin and R. Gupta

Our LazySync approach is different from the lookahead approach in that we do not explicitly exploit lookahead time. As an alternative to the lookahead approach, the performance of distributed simulators can also be improved by reducing the overhead of performing synchronizations. DiSenS [10] reduces the overhead of synchronizing nodes across computers by using the sensor network topology information to partition nodes into groups that do not communicate frequently and simulating each group on a separate computer. However, this technique only works well if most of the nodes are not within direct communication range, as described in the paper. LazySync is very different from DiSenS in that it works by reducing the total number of clock synchronizations in a simulation rather than by reducing the overhead of performing an individual clock synchronization. In addition, LazySync is particularly effective at improving the performance of simulating dense WSNs with a large amount of communication traffic.

LazySync is based on ideas similar to lazy evaluation, which has been used in earlier work on very different problems, from architectural design to programming languages [16]. Though conceptually similar, we show how the notion of lazy evaluation can be applied in sensor networks to improve the performance of distributed simulations.

3 Lazy Synchronization Scheme

In distributed simulations, each sensor node is commonly simulated in a separate thread or process. To maximize parallelism, a running node should try to prevent other nodes from waiting by communicating its simulation progress to those nodes as early as possible (AEAP). If a node has to wait for other nodes due to variations in simulation speed, the thread/process simulating the waiting node should be suspended so that the released physical resources can be used to simulate other non-waiting nodes¹. For maximum parallelism, suspended nodes need to be revived AEAP once the conditions that the nodes wait for are met. For example, Node A in Fig. 1 should synchronize with Node B immediately after it advances past TS1 to resume the simulation of Node B.

The AEAP synchronization scheme is adopted by most existing distributed WSN simulators [9,10,6]. It is commonly implemented [9,6] by periodically sending the simulation time of every non-waiting node to all its neighboring nodes, i.e., the nodes within its direct communication range. Ideally, the clock synchronization period should be as short as possible for maximum parallelism. However, due to the overheads of performing clock synchronizations [6], the synchronization period is commonly set to the minimal lookahead time, which is the smallest possible lookahead time in the simulation. As mentioned, lookahead time is the maximum amount of simulation time that a simulated sensor node can advance freely without synchronizing with any other simulated sensor nodes [7]. For example, in the case of simulating Mica2 nodes [17], the minimal lookahead

¹ For synchronization purposes, a non-waiting node refers to a node that is not waiting for any simulation events. It may still be ready, active, or inactive in a given simulator.

LazySync: A New Synchronization Scheme for Distributed Simulation

Fig. 2. The progress of simulating in parallel a wireless sensor network with three nodes that are in direct communication range of each other on 2 processors. (The plot shows wallclock time TW on the x-axis and simulation time TS on the y-axis, with the traces of Node A, Node B and Node C, the channel-read points A.ReadChannel and B.ReadChannel, and the time points TS0, TS1, TS2 and TW0, TW1, TW2.)

time is the lookahead time of nodes with radios in listening mode. It is equal to the time needed to receive one byte over Mica2's CC1000 radio [6], which is equivalent to 3072 clock cycles of the 7.3728MHz AVR microcontroller in the Mica2. Therefore, when simulating a network of Mica2 nodes, every non-waiting node needs to send its simulation time to all its neighboring nodes every 3072 clock cycles.

Depending on the simulator implementation, once the simulation time is received by a neighboring node, some mechanism is triggered to save the received time and compute the earliest input time (EIT) [1]. The EIT represents the safe simulation time up to which the neighboring node can be simulated. If the neighboring node happens to be waiting, it is also revived if its EIT is not less than its wait time. To be revived AEAP, a waiting node commonly sends its waiting time to the nodes that it depends on before entering the suspended state. By doing so, those nodes can send their simulation times to the waiting node immediately after they advance past the waiting time.
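As a concrete sketch of these quantities (the 19.2 kbps CC1000 byte rate is our assumption, chosen to be consistent with the 3072-cycle figure; the helper names are ours, not the PolarLite API):

```python
# Sketch: minimal lookahead time for Mica2/CC1000 and the earliest input
# time (EIT) computation used by AEAP synchronization.
CPU_HZ = 7_372_800      # Mica2 AVR clock frequency
RADIO_BPS = 19_200      # assumed CC1000 effective bit rate

def min_lookahead_cycles():
    # clock cycles needed to receive one byte (8 bits) over the radio
    return CPU_HZ * 8 // RADIO_BPS

def earliest_input_time(neighbor_times, lookahead):
    # a node can safely simulate up to the slowest neighbor's reported
    # time plus the lookahead
    return min(neighbor_times) + lookahead

la = min_lookahead_cycles()                         # 3072 cycles
eit = earliest_input_time([9000, 5000, 7500], la)   # 5000 + 3072 = 8072
```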

3.1 Limitations of AEAP Synchronization Scheme

While the AEAP synchronization scheme is sound in principle, its effectiveness rests on the assumption that there is always a free processor available to simulate every revived node. This is generally not the case in practice, as the number of nodes in a simulation is usually much larger than the number of processors used to run it. As a result, the AEAP synchronization scheme may slow simulations down in many scenarios by introducing unnecessary clock synchronizations. For example, Fig. 2 shows the progress of simulating in parallel three nodes that are in direct communication range of each other on 2 processors. In the simulation, Node A and Node B are simulated first on the two available processors, and Node B reaches TS1 at TW0. Similar to the case in Fig. 1, Node B has to wait at TS1 until the simulation times of both Node A and Node C reach TS1. However, unlike the case in Fig. 1, while Node B is waiting, the simulation of Node C begins and both processors are kept busy. With the AEAP synchronization scheme, Node A should send its simulation time to Node B at TW1 so the simulation of Node B can be resumed. However, since both processors are busy simulating Node A and Node C at TW1, reviving Node B at TW1 does not increase simulation performance at all. In fact, it may actually slow the simulation down due to the overhead of performing this unnecessary clock synchronization. Instead of synchronizing with Node B at TW1, Node A can delay the synchronization until a free processor becomes available at TW2, when Node A needs to read the wireless channel and waits for Node B and Node C. By delaying the clock synchronization to TS2 at TW2, Node A effectively saves one clock synchronization.

Another opportunity that existing AEAP synchronization algorithms [9,10,6] fail to exploit for synchronization reduction is the simulation time gaps among neighboring nodes. Because there are not enough processors to simulate all non-waiting nodes simultaneously, the simulation time gaps between different nodes can be quite large during a simulation. For example, an actively transmitting node cannot hear transmissions from other nodes and can therefore be simulated without waiting until it stops transmitting and reads the wireless channel. Given such time gaps, a node receiving the simulation time of a node ahead of it can compare that time with its own simulation time and deduce the potential dependencies between the two nodes. Consequently, the node falling behind can skip clock synchronizations if there are no dependencies between the two nodes. For instance, as shown in Fig. 2, once Node A sends a clock synchronization message to Node B at TS2, Node B knows implicitly that Node A does not depend on it before TS2 and therefore does not need to synchronize its clock with Node A until then. In other words, Node B no longer needs to send its simulation time to Node A every minimal lookahead time before TS2, as it does with the AEAP synchronization algorithms. By delaying clock synchronizations, we fully extend these time gaps and, as a result, create more opportunities for nodes that fall behind to skip clock synchronizations. We will discuss this in detail in the following section.
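This implicit-dependency rule can be sketched as a simple predicate (DELTA_T stands for the minimal lookahead time; the function name is ours):

```python
# Sketch of the time-gap observation: a node need not send its clock to a
# neighbor that is already more than DELTA_T ahead, because that neighbor
# cannot depend on the node's outputs until the gap closes.
DELTA_T = 3072   # minimal lookahead time, in clock cycles

def sync_needed(my_time, last_known_neighbor_time):
    # a sync is unnecessary while we trail the neighbor by more than DELTA_T
    return my_time > last_known_neighbor_time - DELTA_T

# Node B at time 1000 has heard that Node A is at time 20000:
skip = not sync_needed(1000, 20000)   # True: B can stay silent until 16928
```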

3.2 Lazy Synchronization Algorithm

To address the performance issue of the AEAP synchronization scheme, we propose a novel conservative synchronization scheme: LazySync. The key idea of the LazySync scheme is to delay a synchronization even when it could be performed according to conservative simulation. It is the opposite of opportunistic synchronization in that the simulator seeks to avoid a synchronization until it is either essential or affordable given the simulation resource constraints. Together, we show that the concept of lazy evaluation can be extended to specifically benefit from the operational characteristics of sensor networks. By procrastinating synchronizations, delayed clock synchronizations may be safely discarded or subsumed by newer clock synchronizations when simulating WSNs. As a result, the total number of clock synchronizations in a simulation can be reduced.

Note that if free processors are available, our LazySync scheme must still perform synchronizations AEAP so that waiting nodes can be revived to use the available physical resources. To make this possible, we track the number of non-waiting nodes and only procrastinate synchronizations when that number is below a threshold. Ideally, the threshold should be set to the number of processors used to run the simulation in order to maximize clock synchronization reduction and processor usage. In practice, however, considering the frequency of checking the number of non-waiting nodes and the overheads of reviving waiting nodes and performing scheduling, the threshold should be set slightly higher. Tracking the number of non-waiting nodes on a computer incurs very little overhead, since this is already done by the underlying thread/process library or OS as part of its scheduling functions. For distributed simulations on multiple computers, the number of non-waiting nodes on each computer can be exchanged as part of the clock synchronization messages sent between computers. If a computer does not receive any clock synchronization messages from another computer for a predetermined period of time, the nodes on the first computer can revert to the AEAP scheme.

Our proposed LazySync algorithm is presented in Algorithm 1. As shown in Algorithm 1, we design the LazySync algorithm to work differently on nodes in different states because nodes have different synchronization needs. In a simulation, a sensor node can be in one of two states, the independent state or the dependent state. A node is in the independent state if its radio is not in receiving mode. This happens when the radio is off, in transmission mode, or in any one of the initialization and transition states. Since a node in the independent state (an independent node) does not take inputs from any other nodes, it can be simulated without waiting for any other nodes until its state changes.
However, if free processors are available, an independent node still needs to synchronize with its neighboring nodes so that the nodes depending on its outputs can be simulated. In the LazySync algorithm, an independent node checks the number of non-waiting nodes every minimal lookahead time and only sends a clock synchronization message to its neighboring nodes if the number of non-waiting nodes is below a threshold. A node is in the dependent state if its radio is in receiving mode. Since any node in direct communication range of a dependent node (a node in the dependent state) can potentially transmit, a dependent node needs to meet Condition 1 before actually reading the wireless channel to ensure correct simulation results. In other words, a dependent node needs to evaluate Condition 1 to determine whether it can read the wireless channel and continue the simulation, or has to wait for some neighboring nodes to catch up for their potential outputs. Since a dependent node has its radio in receiving mode, it needs to read the wireless channel at least once every minimal lookahead time (ΔT), which is the lookahead time of a node in the dependent state. Therefore, Condition 1 is evaluated at least once every ΔT. In the LazySync algorithm, a dependent node only performs clock synchronizations under two circumstances. The first circumstance arises when Condition 1 evaluates to false and, as a result, a dependent node has to wait for neighboring nodes. To prevent deadlocks, a synchronization


Algorithm 1. Lazy Synchronization Algorithm
Require: syncThreshold /* sync threshold */
Require: ΔT /* minimal lookahead time, the lookahead time of a node in the dependent state */
 1: set timer to fire at every ΔT
 2: syncTime ⇐ 0 /* the time a sync condition is verified */
 3: while simulation not end do
 4:   simulate the next instruction
 5:   if in independent state then
 6:     if timer.fired then
 7:       syncTime ⇐ current sim time
 8:       if numLiveNode < syncThreshold then
 9:         send current sim time to all neighboring nodes not ΔT ahead
10:   else if in dependent state then
11:     if instruction needs to read the wireless channel then
12:       syncTime ⇐ current sim time
13:       if ((Condition 1) == true) then
14:         if numLiveNode < syncThreshold then
15:           send current sim time to all neighboring nodes not ΔT ahead
16:       else
17:         send current sim time to all neighboring nodes not ΔT ahead
18:         wait until ((Condition 1) == true)
19:       read the wireless channel
20:   if (current sim time) − syncTime > ΔT then
21:     syncTime ⇐ current sim time
22:     if numLiveNode < syncThreshold then
23:       send current sim time to all neighboring nodes not ΔT ahead

Condition 1. If a node Ni reads wireless channel Ck at simulation time TSNi, then for all nodes Ns that are in direct communication range of Ni, (TSNs + ΔT) ≥ TSNi, where TSNs is the simulation time of Ns and ΔT is the lookahead time of Ni, which is in the dependent state.

has to be performed in this case before suspending the node, regardless of the number of available processors. A deadlock occurs when nodes wait for each other at the same simulation time; for instance, when nodes within direct communication range read the wireless channel at the same simulation time. The second circumstance occurs when Condition 1 evaluates to true, so a dependent node can go ahead and read the wireless channel. If the number of non-waiting nodes is below the threshold at this point, a clock synchronization is performed to revive some nodes to use the available processors. Note that the block of code from line 20 to 23 in Algorithm 1 is just a safety mechanism to guard against cases where a node does not stay in either state long enough to check the synchronization conditions. It is important to note that a dependent node may only perform clock synchronizations at the times it reads the wireless channel. This is very different from the case in a typical AEAP synchronization algorithm. A node in an AEAP


synchronization algorithm may perform clock synchronizations at any time, according to the waiting times of other nodes. The decision to limit dependent nodes to performing clock synchronizations at channel read time only is based on the assumption that no free processors are available to simulate any other nodes until an actively running dependent node gives up its processor due to waiting. By procrastinating clock synchronizations to channel read time, we can eliminate all the intermediate synchronizations that would otherwise be performed in AEAP synchronization algorithms, as described in Sect. 3.1.

With the LazySync algorithm described above, a node can be simulated for a long period of time without sending its simulation time to neighboring nodes. As discussed in Sect. 3.1, the extended simulation time gaps of neighboring nodes can be exploited effectively to reduce clock synchronizations. According to Condition 1, a dependent node Ni can read the wireless channel only if the simulation times of all neighboring nodes are equal to or greater than the simulation time of Ni minus ΔT. If the simulation time of a neighboring node Ns is more than ΔT ahead of the simulation time of Ni, there is no need for Ni to send its simulation time to Ns until the simulation time of Ni is greater than TSNs − ΔT. The same also applies if an independent node receives a simulation time that is more than ΔT ahead. Based on these observations, the LazySync algorithm uses a filter to remove unnecessary clock synchronizations.

It is important to see that our LazySync algorithm still follows the principles of conservative synchronization algorithms [1,7,8] and does not violate any causality during simulations. We only delay and discard unnecessary clock synchronizations to improve the performance of distributed simulations of WSNs. Due to space limits, a formal correctness proof of the LazySync algorithm is not given here.
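The core decisions of Algorithm 1 can be sketched as follows; this is a simplified model for illustration, not the PolarLite implementation, and the class and helper names are our assumptions:

```python
# Simplified sketch of the LazySync decision logic of Algorithm 1.
DELTA_T = 3072   # minimal lookahead time, in clock cycles

class LazyNode:
    def __init__(self, sync_threshold, live_nodes):
        self.sim_time = 0
        self.neighbor_times = {}       # last reported neighbor sim times
        self.sync_threshold = sync_threshold
        self.live_nodes = live_nodes   # callable: current non-waiting count

    def condition1(self):
        # every neighbor must be within DELTA_T behind us to read the channel
        return all(t + DELTA_T >= self.sim_time
                   for t in self.neighbor_times.values())

    def sync_targets(self):
        # time-gap filter: skip neighbors already more than DELTA_T ahead
        return [n for n, t in self.neighbor_times.items()
                if self.sim_time > t - DELTA_T]

    def on_channel_read(self):
        if self.condition1():
            # lazy: sync only if processors may be starved of runnable nodes
            if self.live_nodes() < self.sync_threshold:
                return "read", self.sync_targets()
            return "read", []
        # must sync before suspending, regardless of free processors,
        # to avoid deadlock among mutually waiting nodes
        return "wait", self.sync_targets()

node = LazyNode(sync_threshold=9, live_nodes=lambda: 8)
node.sim_time = 5000
node.neighbor_times = {"B": 4000, "C": 9000}
action, sent = node.on_channel_read()   # Condition 1 holds; C is filtered out
```

In the example, Condition 1 holds (4000 + 3072 ≥ 5000 and 9000 + 3072 ≥ 5000), the 8 live nodes are below the threshold of 9, and Node C is filtered out because it is more than ΔT ahead.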

4 Implementation

The proposed LazySync scheme is implemented in PolarLite, a distributed simulation framework that we developed based on Avrora [6]. Our simulation framework provides the same level of cycle-accurate simulation as Avrora but uses a distributed synchronization engine instead of Avrora's centralized one. As with Avrora, PolarLite allocates one thread for each simulated node and relies on the Java virtual machine (JVM) to assign runnable threads to any available processors on an SMP computer. However, we could not identify any Java API that allows checking the number of suspended/blocked threads in a running program; as an alternative, we track that count with an atomic variable. The syncThreshold in Algorithm 1 is configurable via a command line argument. To implement the LazySync algorithm, we need to detect the state that a node is in. In discrete event driven simulations, the changes of radio states are triggered by events and can be tracked. For example, in our framework, we detect the radio on/off time by tracking the IO events that access the registers of simulated radios. We verify the correctness of our implementation by running the same simulations with and without the LazySync algorithm using the same random seeds.
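The non-waiting-node bookkeeping can be sketched as follows (a Python stand-in for the atomic variable; the actual implementation is in Java):

```python
# Sketch of tracking the number of non-waiting (live) node threads with a
# lock-protected counter, a stand-in for the atomic variable used in the
# Java implementation.
import threading

class LiveNodeCounter:
    def __init__(self, total_nodes):
        self._count = total_nodes
        self._lock = threading.Lock()

    def node_waits(self):        # called before a node thread suspends
        with self._lock:
            self._count -= 1

    def node_revived(self):      # called when a waiting node is revived
        with self._lock:
            self._count += 1

    def below(self, threshold):  # the check on lines 8, 14 and 22 of Algorithm 1
        with self._lock:
            return self._count < threshold

counter = LiveNodeCounter(total_nodes=16)
counter.node_waits()
starved = counter.below(16)   # one node is now waiting, so 15 < 16
```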

5 Evaluation

To evaluate the performance of the LazySync scheme, we simulate some typical WSNs with PolarLite using both the AEAP synchronization algorithm from [6] and the LazySync algorithm from Sect. 3.2. The performance results are compared according to three criteria:

– Speedavg: the average simulation speed.
– Syncavg: the average number of clock synchronizations per node.
– Waitavg: the average number of waits per node.

Speedavg is calculated using Equation (1) based on the definition specified in Sect. 1. Note that the numerator of Equation (1) is the total simulation time in units of clock cycles. Syncavg is equal to the total number of clock synchronizations in a simulation divided by the total number of nodes in the simulation. Similarly, Waitavg is equal to the total number of times that nodes are suspended due to waiting divided by the total number of nodes in the simulation.

Speedavg = (total number of clock cycles executed by the sensor nodes) / ((simulation execution time) × (number of sensor nodes))    (1)
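For concreteness, the metrics can be computed as below; the numbers in the example are made up and are not from our experiments:

```python
# Sketch of the evaluation metrics. speed_avg follows Eq. (1): simulated
# clock cycles per second of wallclock execution time, per node.
def speed_avg(total_cycles, exec_seconds, num_nodes):
    return total_cycles / (exec_seconds * num_nodes)

def per_node_avg(total_count, num_nodes):
    # used for both Sync_avg and Wait_avg
    return total_count / num_nodes

# e.g. 16 simulated Mica2 nodes, each covering 120 s of simulation time at
# 7.3728 MHz, with the run finishing in 60 s of wallclock time:
total_cycles = 16 * 120 * 7_372_800
s = speed_avg(total_cycles, 60.0, 16)   # cycles per wallclock second per node
```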

The WSNs we simulate in this section consist of only Mica2 nodes [17] running either the CountSend (sender) or the CountReceive (receiver) program. Both programs are from the TinyOS 1.1 distribution and are similar to the programs used by other WSN simulators in evaluating their performance [2,9,10]. For example, CountSend repeatedly broadcasts a continuously increasing counter at a fixed interval; if the interval is set to 250ms, it behaves exactly the same as CntToRfm, which is used in [2,9,10] for performance evaluations. CountReceive listens for messages sent by CountSend and displays the received values on LEDs.

All simulation experiments are conducted on an SMP server running Linux 2.6.24. The server features a total of 8 cores on 2 Intel Xeon 3.0GHz CPUs and 16GBytes of RAM. Sun's Java 1.6.0 is used to run all experiments. In the simulations, the starting time of each node is randomly selected between 0 and 1 second of simulation time to avoid any artificial time locks. All simulations are run for 120 seconds of simulation time, and for each experiment we take the average of three runs as the result. The synchronization threshold (Algorithm 1) of the LazySync algorithm is set to 9 (the number of processors plus one) for all experiments.

5.1 Performance in One-Hop WSNs

In this section, we evaluate the performance of the LazySync scheme in simulating one-hop WSNs of various sizes. One-hop WSNs are sensor networks with all their nodes in direct communication range. All the one-hop WSNs that we simulate in this section have 50% of the nodes running CountSend and 50% of the nodes running CountReceive.

Fig. 3. Performance improvements of the LazySync scheme over the AEAP scheme in simulating one-hop WSNs. Senders transmit at a 250ms interval. (Percentage increase of Speedavg and percentage decreases of Syncavg and Waitavg versus number of nodes.)

Fig. 4. Performance improvements of the LazySync scheme over the AEAP scheme in simulating one-hop WSNs. Senders transmit as fast as possible. (Same axes and metrics as Fig. 3.)

In the first experiment, we modify CountSend so that all senders transmit at a fixed interval of 250ms. Five WSNs with 8, 16, 32, 128 and 256 nodes are simulated, and Fig. 3 shows the percentage improvements of the LazySync scheme over the AEAP scheme. As shown in Fig. 3, the LazySync scheme reduces Syncavg in all cases, and the percentage reductions grow slowly with network size. It is important to see that the total number of clock synchronizations in a distributed simulation of a one-hop WSN is on the order of N · (N − 1), where N is the network size [6]. So, although the percentage reductions of Syncavg increase slowly with network size in Fig. 3, the absolute reductions in Syncavg grow significantly with network size.

The significant percentage reduction of Syncavg in simulating 8 nodes on 8 processors is due to the time-gap based filter and the fact that the synchronization threshold is only checked every ΔT or at channel read time. Since the threshold is not monitored at a finer time granularity, a processor may be left idle for at most the wallclock time needed to simulate a node for ΔT, according to Algorithm 1. As a result, we can see in Fig. 3 that there are moderate increases of Waitavg when simulating small WSNs with 8 and 16 nodes. However, as the WSN size increases, the percentage reduction of Waitavg increases because processors are more likely to be kept busy by the extra nodes. In fact, the LazySync scheme performs better in terms of percentage reductions of Waitavg when simulating 128 and 256 nodes, as shown in Fig. 3. We believe this is because more CPU cycles become available for real simulation work after significant reductions in the number of clock synchronizations. For the same reason, despite the increases of Waitavg in simulating small WSNs, we see increases of Speedavg in all cases, ranging from 4% to 46%.

Our second experiment is designed to evaluate the LazySync scheme in busy WSNs that have heavy communication traffic.
It is based on the same setup as the first experiment except that all senders transmit as fast as possible. As shown in Fig. 4, the LazySync scheme provides more significant percentage reductions of Waitavg in busier networks. This is because a busier network has more transmissions and consequently more independent states. The increased number of independent states provides more opportunities for the LazySync scheme to exploit: it allows nodes to skip synchronizations and gives the filter larger gaps to work with. As a result, the LazySync scheme brings a 12% to 53% increase of Speedavg in Fig. 4. We can also see in Fig. 4 that, unlike in the first experiment, there are no increases of Waitavg when simulating small WSNs. This is because it takes more CPU cycles to simulate all the communications in a busy network, which keeps the processors busy.

Fig. 5. Performance improvements of the LazySync scheme over the AEAP scheme in simulating multi-hop WSNs. (Percentage increase of Speedavg and percentage decreases of Syncavg and Waitavg versus number of nodes, for senders transmitting at a 250ms interval and as fast as possible.)

5.2 Performance in Multi-hop WSNs

In this section, we evaluate the performance of the LazySync scheme in simulating multi-hop WSNs of various sizes. Nodes are laid out 15 meters apart on square grids of various sizes. Senders and receivers are positioned on the grids in such a way that nodes of the same type are not adjacent to each other. By setting a maximum transmission range of 20 meters, this setup ensures that only adjacent nodes are within direct communication range of each other. This configuration is very similar to the two-dimensional topology in DiSenS [10]. We simulate WSNs with 16, 36, 100 and 256 nodes. For each network size, we simulate both a quiet network, with all the senders transmitting at a fixed 250ms interval, and a busy network, with all the senders transmitting as fast as possible. The results are shown in Fig. 5.

We can see that the percentage decreases of Syncavg are more significant in the multi-hop networks than in the one-hop networks. The reason is that there are fewer dependencies among nodes in our multi-hop networks than in the one-hop networks, since only adjacent nodes are in communication range in the multi-hop setup. Having fewer dependencies brings two opportunities to the LazySync scheme. First, a node can be simulated for a longer period of time without waiting. Second, the increased number of non-waiting nodes keeps processors busy. Together, they enable nodes to skip clock synchronizations in LazySync. In addition, the increased simulation time gaps can also be exploited by LazySync to reduce clock synchronizations.

As shown in Fig. 5, the percentage reductions of Syncavg are significantly higher in the busy multi-hop networks than in the quiet ones. This demonstrates once again that the LazySync scheme can exploit wireless transmissions in a WSN for synchronization reductions. As a result, we see significant percentage increases of Speedavg in simulating busy multi-hop networks, ranging from 25% to 118%.

6 Conclusion and Future Work

We have presented LazySync, a synchronization scheme that significantly improves the speed and scalability of distributed sensor network simulators by reducing the number of clock synchronizations. We implemented LazySync in PolarLite and evaluated it against an AEAP scheme inside the same simulation framework. The significant performance improvements we measured on a multi-processor computer suggest even greater benefits when applying our techniques to distributed simulations over a network of computers, where the overhead of sending synchronization messages between computers is much larger.

As future work, we plan to combine LazySync with other performance-improving techniques that we developed in the past [6,13]. Since these techniques exploit different aspects of WSNs for performance improvements, we believe combining them can further improve the speed and scalability of distributed WSN simulators.

References
1. Chandy, K.M., Misra, J.: Asynchronous distributed simulation via a sequence of parallel computations. Commun. ACM 24(4), 198–206 (1981)
2. Levis, P., Lee, N., Welsh, M., Culler, D.: TOSSIM: accurate and scalable simulation of entire TinyOS applications. In: SenSys 2003: Proceedings of the 1st International Conference on Embedded Networked Sensor Systems, pp. 126–137. ACM Press, New York (2003)
3. Shnayder, V., Hempstead, M., Chen, B.-r., Allen, G.W., Welsh, M.: Simulating the power consumption of large-scale sensor network applications. In: SenSys 2004: Proceedings of the 2nd International Conference on Embedded Networked Sensor Systems, pp. 188–200. ACM, New York (2004)
4. Polley, J., Blazakis, D., McGee, J., Rusk, D., Baras, J.: ATEMU: a fine-grained sensor network simulator. In: IEEE SECON 2004: First Annual IEEE Communications Society Conference on Sensor and Ad Hoc Communications and Networks, October 4-7, 2004, pp. 145–152 (2004)


5. Landsiedel, O., Alizai, H., Wehrle, K.: When timing matters: enabling time accurate and scalable simulation of sensor network applications. In: IPSN 2008: Proceedings of the 2008 International Conference on Information Processing in Sensor Networks, pp. 344–355. IEEE Computer Society Press, Washington, DC (2008)
6. Jin, Z., Gupta, R.: Improved distributed simulation of sensor networks based on sensor node sleep time. In: Nikoletseas, S.E., Chlebus, B.S., Johnson, D.B., Krishnamachari, B. (eds.) DCOSS 2008. LNCS, vol. 5067, pp. 204–218. Springer, Heidelberg (2008)
7. Fujimoto, R.M.: Parallel and distributed simulation. In: WSC 1999: Proceedings of the 31st Conference on Winter Simulation, pp. 122–131. ACM Press, New York (1999)
8. Riley, G.F., Ammar, M.H., Fujimoto, R.M., Park, A., Perumalla, K., Xu, D.: A federated approach to distributed network simulation. ACM Trans. Model. Comput. Simul. 14(2), 116–148 (2004)
9. Titzer, B.L., Lee, D.K., Palsberg, J.: Avrora: scalable sensor network simulation with precise timing. In: IPSN 2005: Proceedings of the 4th International Symposium on Information Processing in Sensor Networks, pp. 477–482. IEEE Press, Piscataway, NJ (2005)
10. Wen, Y., Wolski, R., Moore, G.: DiSenS: scalable distributed sensor network simulation. In: PPoPP 2007: Proceedings of the 12th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, pp. 24–34. ACM Press, New York (2007)
11. Henderson, T.: NS-3 Overview (2008)
12. Jefferson, D.R.: Virtual time. ACM Trans. Program. Lang. Syst. 7(3), 404–425 (1985)
13. Jin, Z., Gupta, R.: Improving the speed and scalability of distributed simulations of sensor networks. Technical Report CS2009-0935, UCSD (2009)
14. Filo, D., Ku, D.C., Micheli, G.D.: Optimizing the control-unit through the resynchronization of operations. Integr. VLSI J. 13(3), 231–258 (1992)
15. Liu, J., Nicol, D.M.: Lookahead revisited in wireless network simulations. In: PADS 2002: Proceedings of the Sixteenth Workshop on Parallel and Distributed Simulation, pp. 79–88. IEEE Computer Society Press, Washington, DC (2002)
16. Hughes, J.: Why functional programming matters. Comput. J. 32(2), 98–107 (1989)
17. Crossbow: MICA2 Datasheet (2008)

Similarity Based Optimization for Multiple Query Processing in Wireless Sensor Networks

Hui Ling¹ and Taieb Znati¹,²

¹ Department of Computer Science
² Telecommunication Program
University of Pittsburgh, Pittsburgh, PA, USA 15260
{hling,znati}@cs.pitt.edu

Abstract. Wireless sensor networks (WSNs) have been proposed for a large variety of applications. As the number of applications of sensor networks continues to grow, the number of users in sensor networks increases as well. Consequently, it is not uncommon that the base station needs to process multiple queries simultaneously. Furthermore, these queries often need to collect data from particular sets of sensors, such as the sensors in a hot spot. To reduce the communication cost of multiple query processing in WSNs, this paper proposes a new optimization technique based on similarities among multiple queries. Given a set of queries, Q, the proposed scheme constructs a set of shared intermediate views (SIVs) from Q. Each SIV identifies a set of shared data among queries in Q. The SIVs are processed only once but reused by at least two queries in Q. The queries in Q are rewritten into a different set of queries, Q′. The collected sensor data from Q′ and the SIVs are aggregated and returned as the processing results for the original set of queries in Q. The simulation results show that the proposed technique can effectively reduce the communication cost of multiple query processing in WSNs.

1 Introduction

Wireless sensor networks have been proposed for a large variety of applications, including environmental monitoring, disaster relief and traffic monitoring. They enable us to observe and interact with the real world, which was difficult, expensive or even impossible to monitor otherwise. To ease the programming and deployment of sensors, researchers have proposed to add querying capability to individual sensors and treat the whole sensor network as a database. Users express their interests as queries and send them to the base station. The base station then delivers these queries to sensors in the network. Relevant sensors generate data, process the queries and transfer the data back to the base station. Since sensors are typically resource constrained, with limited power, processing capability and bandwidth, special techniques are developed to optimize the processing of queries in sensor networks. For example, in-network processing and data aggregation are proposed to reduce data communication cost [1][2].

B. Krishnamachari et al. (Eds.): DCOSS 2009, LNCS 5516, pp. 117–130, 2009. © Springer-Verlag Berlin Heidelberg 2009

118

H. Ling and T. Znati

As the applications of sensor networks continue to grow, the number of users of sensor networks increases as well. As a result, it is not uncommon that the base station needs to process multiple queries simultaneously. The cost of processing a query consists of the communication cost of disseminating the query to sensors in the network, the computational cost of query processing at relevant sensor nodes, and the communication cost of transferring data back to the base station from relevant sensors. Given that data communication consumes much more energy than CPU processing at sensor nodes [3], the computational cost of query processing at sensor nodes can be neglected in comparison with the communication cost. The cost of processing multiple queries, therefore, is the sum of the costs of processing every query if they are processed separately. However, queries may seek data from some common set of sensors, such as the sensors in a hot spot. For example, in a sensor network for fire monitoring, one user issues a query "list the number of regions with temperature above 200F and light level between 20 and 30" and another user issues a similar query "list the number of regions with temperature above 250F and light level between 25 and 35". Apparently, the data of sensors that sense temperature above 250F and light level between 25 and 30 must be used for both queries during processing. Therefore, these data must be transmitted and/or aggregated twice if the two queries are processed separately. However, there is no need to transfer these data twice if the shared data between the two queries can be identified and reused. The redundant data transmission may become significant when a large amount of data is shared among many queries. The problem, therefore, is to optimize the processing cost of multiple queries in sensor networks, in particular when these queries need to collect data from common subsets of sensors during processing, i.e., when they are similar.
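The shared region of the two example queries can be computed mechanically by intersecting their per-attribute ranges. A minimal sketch; the helper name and the sentinel of 1000F standing in for the open-ended "temperature above X" predicates are illustrative assumptions, not part of the paper's scheme:

```python
def intersect(r1, r2):
    """Intersect two closed intervals (low, high); return None if disjoint."""
    low, high = max(r1[0], r2[0]), min(r1[1], r2[1])
    return (low, high) if low <= high else None

# The two fire-monitoring queries from the example above, with the unbounded
# upper temperature capped at an assumed sentinel of 1000F.
q1 = {"temp": (200, 1000), "light": (20, 30)}
q2 = {"temp": (250, 1000), "light": (25, 35)}

shared = {attr: intersect(q1[attr], q2[attr]) for attr in q1}
# Sensors in this shared region would be queried twice if q1 and q2
# were processed independently.
```

If the queries were disjoint on any attribute, the corresponding entry would be `None` and no data would be shared.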
This problem is especially challenging since data in a sensor network is generated at the sensors, and aggregated results may be returned to the base station to reduce communication cost during query processing. Sensor data, once aggregated, is difficult to reuse at the base station for multiple query optimization. In this paper, we propose a novel approach to address the optimization problem for multiple queries in sensor networks. The proposed method is based on the similarities among queries. A model is developed to estimate whether queries need common data from a set of sensors, based on the constraints on attributes specified in these queries and on how the sensed data is distributed among sensor nodes. Given a set of queries, a shared intermediate view set (SIVS) is constructed. Each SIV in the SIVS captures the similarity among some queries. The original set of queries is then rewritten into a different set of queries in such a way that the processing results from these queries can be aggregated with the processing results from the SIVS to provide the correct results for the original set of queries. The key idea is that the SIVS is processed only once, but the results for the SIVS returned to the base station can be reused multiple times. Through simulations, it is shown that the proposed method is simple yet effective. Furthermore, it complements many existing optimization techniques, such as optimal query execution planning, and can be easily combined with them.

Similarity Based Optimization for MQP in WSNs

119

The rest of the paper is organized as follows. Section 2 discusses related work. In Section 3, a model for estimating similarities among queries is developed, and the multiple query optimization scheme, including shared intermediate view construction and query rewriting, is presented in detail. A set of simulations is conducted to evaluate the proposed scheme in Section 4. Section 5 concludes the paper.

2 Related Work

Several sensor database query systems, such as Cougar [4] and TinyDB [5], have been developed by database researchers. These systems aim to extend SQL-like querying to sensor networks, focusing on reducing power consumption during query processing. In addition to these two pioneering systems, a large body of work has addressed many other aspects of query processing techniques for sensor networks. An energy-efficient routing scheme for data collection from all nodes in a sensor network is proposed in [6]. In-network processing and data aggregation are presented in [1][2]. To answer aggregate queries such as "average" or "min", sensor data can be aggregated at intermediate sensor nodes to reduce the amount of data transferred in the network during processing. The work in [7] complements sensors with statistical data models to provide more meaningful query results and reduce the number of message transmissions during data collection. Most of this work has focused on the optimization and execution of a single long-running query. Multiple query processing, and in particular its optimization, has been studied by database researchers [8]. The focus of multiple query optimization (MQO) in sensor networks, however, is different, since data in a sensor network is spread over all sensors and aggregated results are usually returned to the base station during query processing. Sensor data, once aggregated, is difficult to reuse at the base station for multiple query optimization. New schemes, therefore, must be proposed to address these new challenges. MQO problems have recently been addressed by several researchers [9][10][11][12]. The scheme presented in [9] explores spatial query information for multi-query optimization. The notion of equivalence classes (EC) is defined as the union of all regions covered by the same set of queries. A query is, then, expressed as a set of ECs intersecting with its query region.
The experimental results show that great energy savings can be achieved using the proposed optimization technique. The impact of MQO is analyzed in the work described in [10]. A cost model is developed to study the benefit of exploiting common subexpressions in queries. The authors also propose several optimization algorithms, for both data acquisition queries and aggregation queries, that intelligently rewrite multiple sensor data queries (at the base station) into "synthetic" queries to eliminate redundancy among them before they are injected into the wireless sensor network. The set of running synthetic queries is dynamically updated by the arrival of new queries as well as the termination of existing queries. The scheme is then extended into a Two-Tier Multiple Query Optimization (TTMQO) scheme [11]. The first tier, called base station optimization, adopts a


cost-based approach to rewrite a set of queries into an optimized set that shares the commonality and eliminates the redundancy among the queries in the original set. The optimized queries are then injected into the wireless sensor network. The second tier, called in-network optimization, efficiently delivers query results by taking advantage of the broadcast nature of the radio channel and by sharing sensor readings among similar queries over time and space at a finer granularity. These proposed schemes for MQO [9][11] have explored spatial or temporal information among queries to reduce the transmission cost of multi-query processing. In this paper, we propose to investigate a finer granularity, the semantic similarity among multiple queries, to further optimize multiple query processing in sensor networks. The problem of "Many-to-Many aggregation" in sensor networks is addressed in [12], where destinations require data from multiple sensors while sensor data are also needed by multiple destinations. Multicast and in-network aggregation are combined, and the goal is to minimize the communication cost by balancing the combination of the two. That problem is similar to ours in that the data at source sensors are needed multiple times, but we address the problem in a more general context. A similar problem of computing multiple aggregations in stream processing is studied in [13]. In stream processing, many users run different, but often similar, queries against the stream. Several techniques are developed to find commonalities among aggregate queries with the same or different predicates and windows. The proposed approach is particularly effective in handling queries joining and leaving the streaming system. That problem differs from ours in that all raw data are available in streaming systems. In our problem, the raw readings are generated at the sensor nodes, and only aggregated results are returned to the base station.

3 Similarity Based Multiple Query Processing

In this section, we explore the opportunities that multiple queries present for reducing query processing cost in sensor networks. Specifically, the similarities among queries are identified and utilized to reduce data collection cost.

3.1 Query Definition

We use the following simple declarative language to define user queries. The language defines variables, predicates and rules, based on which a query is defined.

Definition 1. A variable, V, can be the name of a data attribute sensed by nodes in the network, the location of sensors, or a temporal specification of data sampling.

Definition 2. A predicate, P, is of the form <V op constant>, where op is an arithmetic operator: <, >, ≤, ≥, =, ≠. Each P specifies a filter on the data to be collected.


Definition 3. A rule, R = (R ∧ P) || P, is either a conjunction of predicates or a simple predicate.

Definition 4. A query, q, is of the form AF(V)? R1 ∨ R2 ∨ · · · ∨ Rm, where AF specifies the aggregate function on variable V, such as Max, Min, or Avg.

A query, q, essentially specifies a set of filters on the sensor data to be collected, in addition to spatial-temporal constraints. If a spatial constraint is given to a query q, then q is only interested in sensor data from a certain area. Otherwise, by default, a query seeks data from all sensors in the network. The temporal constraints specify the interval of query processing. Based on the value of the temporal variable, user queries can be classified into snapshot queries, which are executed only once, and long-lived queries, which collect data from the sensor network repeatedly during a specified time period, T, at a specified interval, I. q is eventually mapped to the set of sensor nodes whose data meet all the rules in q and its other constraints.

3.2 Overview

Given a set of queries, Q, a set of intermediate views is first constructed from Q in multiple query processing. Each intermediate view identifies a set of shared data among two or more queries in Q. The original queries, Q, are rewritten into a different set, Q′. These intermediate views, along with the new rewritten queries in Q′, are then processed by sensors in the network. Each query, q ∈ Q, is mapped to several intermediate views and an additional set of sensors in the network. The results from these intermediate views, aggregated with the data collected from the additional set of sensors, provide the necessary data to answer query q. Figure 1 illustrates the overall process of data collection in similarity based query processing.

3.3 Shared Intermediate Views

The key idea for reducing the query processing cost of multiple queries in a sensor network is to identify the similarities among queries. Two queries are similar if they collect data from a common subset of sensors. The shared data among similar queries only need to be collected and processed once if they are identified before query processing. The question, therefore, is how to identify the similarity among queries. To this end, a "Shared Intermediate View (SIV)" is defined to capture the similarity among queries. In the query definition language described in Section 3.1, a rule, R, is a conjunction of predicates, and a query, q, uses a disjunction of rules to specify the conditions on the data to be collected. Let Rules(q) be the disjunction of rules that q specifies. A shared intermediate view is defined as follows:

Definition 5. Given two queries, q1 and q2, a shared intermediate view, SIV, of (q1, q2) is a query AF(V)? R, where AF(V) is the same as in q1 and q2, and R = Rules(q1) ∧ Rules(q2).

Fig. 1. Overview of similarity based multi-query processing

Based on Definition 5, we define a shared intermediate view set (SIVS) for a set of queries, Q, as follows:

Definition 6. Given a set of queries, Q, a shared intermediate view set, SIVS, is a set of queries such that ∀ q ∈ SIVS, ∃ qi, qj ∈ Q for which q is a shared intermediate view of qi and qj.

Clearly, many different SIVSs exist for a set of queries, Q. The selection of the SIVS is critical to multiple query processing, since the results of these views are reused for processing the queries in Q. The overall query processing cost is reduced only if the data collection cost saved by using the SIVS exceeds the extra cost of processing the views in the SIVS. The goal, therefore, is to derive a SIVS that enables the maximum reuse of shared data among queries. To derive such a SIVS, the size of the shared data between queries must be known. This knowledge, however, cannot be obtained before the relevant sensor nodes are identified for each query. As a result, heuristics should be explored to maximize the reuse of shared data. We observe that a sensor essentially reports numerical values of the attributes being sensed, and that a query specifies what sensor data to collect by defining filters over these attributes. The value range of the attributes specified in a query, q, directly determines the number of sensors whose data satisfy q. Therefore, we propose to use the range as an indication of the size of the shared data among queries. Next, we present a simple range based algorithm for SIVS construction.

3.4 Range Based SIVS Construction

Assume the value of an attribute, A, lies within a range [a, a′]. A predicate, p = A < a1, collects data from sensors where A is within [a, a1]. Similarly, the value range of attribute A can easily be determined for all predicates. A rule, R,


is a conjunction of predicates over the data, T, to be sought. The predicates may enforce constraints on a set of different attributes, ATTR = {A1, A2, · · ·}. Each attribute describes one dimension of the data T. Furthermore, the potential value range varies from one attribute to another; the same range constraint over different attributes may therefore map to data sets of different sizes in the network. The relative range (RR) is thus used to compare, across attributes, the number of sensors whose sensed data satisfy the constraints. Given a predicate, P(A), on attribute A, whose value lies within the interval [min(A), max(A)], it is estimated that P(A) = a_l ≤ A ≤ a_u is mapped to a set of RR(P(A)) × n sensors in the network, where RR(P(A)) is defined as follows:

$$ RR(P(A)) = \frac{a_u - a_l}{\max(A) - \min(A)} \qquad (1) $$
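Equation (1) amounts to measuring what fraction of an attribute's full range a predicate's interval covers. A small sketch; the interval representation and helper name are my own, not from the paper:

```python
def rr(pred_interval, attr_range):
    """Relative range of a predicate a_l <= A <= a_u (equation (1))."""
    a_l, a_u = pred_interval
    lo, hi = attr_range
    return (a_u - a_l) / (hi - lo)

# Temperature sensed in [0, 400]: the predicate 100 <= A <= 200 is estimated
# to match RR * n = 0.25 * n sensors in an n-node network.
estimate = rr((100, 200), (0, 400))
```

The estimate is a fraction of the network size, so it can compare predicates across attributes with very different units.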

A rule, R, may specify more than one predicate on a data attribute, A. These predicates, P^1(A), P^2(A), ..., P^k(A), form a composite predicate Pred(R, A) = P^1(A) ∧ P^2(A) ∧ · · · ∧ P^k(A). The estimated relevant data size of Pred(R, A) is as follows:

$$ RR(Pred(R, A)) = \frac{\min_{1 \le i \le k} a_u(P^i(A)) - \max_{1 \le i \le k} a_l(P^i(A))}{\max(A) - \min(A)} \qquad (2) $$

When R imposes value ranges over different attributes of T, the estimated size of the relevant data in the network for R, RR(R), is the minimum relative range over the attributes that R specifies:

$$ RR(R) = \min_{A \in ATTR(R)} RR(Pred(R, A)) \qquad (3) $$

The estimated relevant data size of a conjunction of rules is computed as follows:

$$ RR(R_i \wedge R_j) = \min_{A \in ATTR(R_i) \cup ATTR(R_j)} RR(Pred(R_i, A) \wedge Pred(R_j, A)) \qquad (4) $$

Equation (5) is used to compute the estimated relevant data size of a disjunction of rules:

$$ RR(R_i \vee R_j) = RR(R_i) + RR(R_j) - RR(R_i \wedge R_j) \qquad (5) $$

Using the equations above, we can quantify the similarity between two queries. Given two queries q1 and q2, where Rules(q1) = R_{11} ∨ R_{12} ∨ · · · ∨ R_{1l_1} and Rules(q2) = R_{21} ∨ R_{22} ∨ · · · ∨ R_{2l_2}, the similarity between q1 and q2, SIM(q1, q2), is defined as:

$$ SIM(q_1, q_2) = RR(Rules(q_1) \wedge Rules(q_2)) = RR\Big( \bigvee_{1 \le i \le l_1,\ 1 \le j \le l_2} (R_{1i} \wedge R_{2j}) \Big) \qquad (6) $$
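For two single-rule queries, the disjunction in equation (6) has a single term, so SIM(q1, q2) reduces to RR(R1 ∧ R2) computed via equations (2)-(4). A sketch, with rules represented as dicts from attribute name to interval; the representation, helper names and attribute ranges are illustrative assumptions:

```python
FULL = {"temp": (0.0, 400.0), "light": (0.0, 100.0)}  # assumed attribute ranges

def rr_pred(iv, full):
    """Equations (1)/(2), clipped at zero when the intersection is empty."""
    return max(0.0, (iv[1] - iv[0]) / (full[1] - full[0]))

def sim_single_rule(r1, r2, full=FULL):
    """Equations (4) and (6) for single-rule queries: intersect the rules'
    intervals attribute by attribute, then take the minimum relative range."""
    vals = []
    for a in set(r1) | set(r2):
        lo = max(r1.get(a, full[a])[0], r2.get(a, full[a])[0])
        hi = min(r1.get(a, full[a])[1], r2.get(a, full[a])[1])
        vals.append(rr_pred((lo, hi), full[a]))
    return min(vals)

# The two fire-monitoring queries: the narrow shared light range dominates.
r1 = {"temp": (200.0, 400.0), "light": (20.0, 30.0)}
r2 = {"temp": (250.0, 400.0), "light": (25.0, 35.0)}
```

Here the shared light interval [25, 30] covers 5% of the light range, which is smaller than the shared temperature fraction, so it determines the similarity estimate.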

Equation (6) provides a model for estimating the shared data between two queries. The model, however, requires an exponential number of computations. Following


equation (5), the disjunction of l1 × l2 rules requires computing 2^{l1 × l2} − 1 intermediate values before the final value can be obtained. Instead of using exact values, we therefore explore lower and upper bounds on the similarity among queries to estimate the size of the data they share. From equations (5) and (4), it can be derived that:

$$ RR(R_i \vee R_j) = RR(R_i) + RR(R_j) - RR(R_i \wedge R_j) \le RR(R_i) + RR(R_j) $$

and

$$ RR(R_i \vee R_j) = RR(R_i) + RR(R_j) - RR(R_i \wedge R_j) \ge \max(RR(R_i), RR(R_j)) $$

Therefore, lower and upper bounds on the estimated relevant data size of a disjunction of two rules can be derived as follows:

$$ \max(RR(R_i), RR(R_j)) \le RR(R_i \vee R_j) \le RR(R_i) + RR(R_j) \qquad (7) $$

It is straightforward to extend inequality (7) to the disjunction of r rules:

$$ \max_{1 \le i \le r} RR(R_i) \le RR\Big( \bigvee_{i=1}^{r} R_i \Big) \le \sum_{i=1}^{r} RR(R_i) \qquad (8) $$

Given (8), we can now derive a lower and an upper bound on the similarity between two queries, q1 and q2:

$$ \max_{1 \le i \le l_1,\ 1 \le j \le l_2} RR(R_{1i} \wedge R_{2j}) \le SIM(q_1, q_2) \le \sum_{1 \le i \le l_1,\ 1 \le j \le l_2} RR(R_{1i} \wedge R_{2j}) \qquad (9) $$
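Inequality (9) replaces the 2^{l1·l2} − 1 inclusion-exclusion terms of equation (6) with only l1·l2 pairwise terms. A sketch for multi-rule queries over interval rules; the rr_conj helper, names and ranges are illustrative assumptions:

```python
FULL = {"temp": (0.0, 400.0)}  # assumed attribute range

def rr_conj(r1, r2, full=FULL):
    """Equation (4) for interval rules: per-attribute intersection, then min."""
    vals = []
    for a in set(r1) | set(r2):
        lo = max(r1.get(a, full[a])[0], r2.get(a, full[a])[0])
        hi = min(r1.get(a, full[a])[1], r2.get(a, full[a])[1])
        vals.append(max(0.0, (hi - lo) / (full[a][1] - full[a][0])))
    return min(vals)

def sim_bounds(rules1, rules2):
    """Inequality (9): bounds on SIM(q1, q2) from pairwise rule conjunctions."""
    terms = [rr_conj(r1, r2) for r1 in rules1 for r2 in rules2]
    return max(terms), sum(terms)

# q1 has two disjoint rules; q2 has one rule overlapping both of them.
q1_rules = [{"temp": (0.0, 100.0)}, {"temp": (300.0, 400.0)}]
q2_rules = [{"temp": (50.0, 350.0)}]
low, high = sim_bounds(q1_rules, q2_rules)
```

In this example each pairwise conjunction covers 0.125 of the range, so the true similarity is bracketed between 0.125 and 0.25.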

Similarly, we can define another upper bound on RR(R_i ∨ R_j) as follows:

$$ RR(R_i \vee R_j) = RR\Big( \bigwedge_{A \in ATTR(R_i)} Pred(R_i, A) \ \vee \bigwedge_{A \in ATTR(R_j)} Pred(R_j, A) \Big) $$
$$ = RR\Big( \bigwedge_{A_1, A_2 \in ATTR(R_i) \cup ATTR(R_j)} (Pred(R_i, A_1) \vee Pred(R_j, A_2)) \Big) $$
$$ \le \min_{A \in ATTR(R_i) \cup ATTR(R_j)} RR\big( Pred(R_i, A) \vee Pred(R_j, A) \big) \qquad (10) $$

As a result, another upper bound on the similarity between two queries, q1 and q2, can be defined as:

$$ SIM(q_1, q_2) = RR\Big( \bigvee_{1 \le i \le l_1,\ 1 \le j \le l_2} (R_{1i} \wedge R_{2j}) \Big) \le \min_{A \in ATTR} RR\Big( \bigvee_{1 \le i \le l_1,\ 1 \le j \le l_2} Pred(R_{1i} \wedge R_{2j}, A) \Big) \qquad (11) $$

The lower and upper bounds in inequalities (9) and (11) give three potential estimates of SIM(q1, q2). These estimates provide a way to measure the


similarities between two queries. Using a similar derivation procedure, the similarity among m queries, where m is greater than 2, can be defined. However, the accuracy of the estimation tends to decrease as the number of queries increases, since more computations are ignored during the derivation. In this paper, only similarities between pairs of queries are explored for multiple query optimization. Based on the quantified similarity among queries, an iterative algorithm is proposed to construct a SIVS. The main steps are described in Algorithm 1. In each iteration, the pair of queries qi, qj in Q with maximum similarity is first chosen and added into a set S = {qi, qj}. SIV = AF(V)? ∧_{q ∈ S} Rules(q) is added into the current SIVS, and Q = Q − S. The iteration continues until Q = ∅ or no query in Q shares data with any other query. This algorithm discovers the SIVS with maximum similarity when only similarities between pairs of queries are explored.

Algorithm 1. SIVS Construction
 1: INPUT: Q = {q1, q2, · · ·, qm}
 2: INITIALIZATION: S = ∅; SIVS = ∅
 3: repeat
 4:   pick the pair of queries qi, qj with maximum SIM(qi, qj) in Q
 5:   S = {qi, qj}; Q = Q − {qi, qj}
 6:   SIV = AF(V)? ∧_{q′ ∈ S} Rules(q′); SIVS = SIVS ∪ {SIV}
 7: until (Q = ∅ || SIM(qi, qj) == 0 ∀ qi, qj ∈ Q)
 8: OUTPUT: SIVS

3.5 Data Collection Scheme Using SIVS

After a SIVS is derived for a set of queries, Q, the shared intermediate views in the SIVS are first sent into the sensor network. The relevant data are collected and processed using an existing communication scheme such as [14]. In order to reuse the results from these shared intermediate views, the queries in Q are modified before they are delivered into the sensor network for processing. In principle, for a SIV ∈ SIVS, if SIV is the shared intermediate view for the queries in S ⊆ Q, then each query q ∈ S is replaced with q′ = AF(V)? Rules(q) ∧ ¬Rules(SIV). The algorithm for query rewriting is presented as Algorithm 2. These modified queries in Q′ are then delivered to the sensors in the network. The relevant data are collected, and an aggregated result is returned to the base station. To answer an original query q in Q, the aggregated results from q′ and from the SIV of q are aggregated together, using the aggregation function specified in q. Under the proposed data collection scheme, the final result for q is thus the aggregated result from q′ and SIV. To ensure the correctness of this aggregation for query q, we must guarantee that the following two conditions, (12) and (13), hold.


Algorithm 2. Query Rewriting
 1: INPUT: Q = {q1, q2, · · ·, qm}, SIVS
 2: INITIALIZATION: Q′ = Q
 3: for all (SIV ∈ SIVS) do
 4:   S = {q | q ∈ Q and SIV is a shared view of q}
 5:   for all q ∈ S do
 6:     Q′ = Q′ − {q}; Q′ = Q′ ∪ {q′ = AF(V)? Rules(q) ∧ ¬Rules(SIV)}
 7:   end for
 8: end for
 9: OUTPUT: Q′
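Algorithms 1 and 2 together can be sketched end to end. The sketch keeps queries symbolic (a rule is any hashable value and the similarity measure is a supplied callable), so it shows only the greedy control flow, not the range-based estimator; all names are illustrative assumptions:

```python
def build_sivs(queries, sim):
    """Algorithm 1 sketch: repeatedly pair the two most similar remaining
    queries; each pair contributes one SIV, the conjunction of its rules."""
    remaining = dict(queries)              # query name -> rule
    sivs = {}                              # SIV name -> (rule, covered names)
    while len(remaining) >= 2:
        names = sorted(remaining)
        score, a, b = max((sim(remaining[x], remaining[y]), x, y)
                          for i, x in enumerate(names) for y in names[i + 1:])
        if score == 0:                     # no query shares data with another
            break
        sivs[f"siv_{a}_{b}"] = (("AND", remaining[a], remaining[b]), (a, b))
        del remaining[a], remaining[b]
    return sivs

def rewrite(queries, sivs):
    """Algorithm 2 sketch: each covered query q becomes
    Rules(q) AND NOT Rules(SIV)."""
    out = dict(queries)
    for siv_rule, covered in sivs.values():
        for name in covered:
            out[name] = ("AND", queries[name], ("NOT", siv_rule))
    return out
```

With rules encoded as predicate sets and similarity as the size of their intersection, two overlapping queries are paired into one SIV while an unrelated query passes through unchanged.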

$$ Data(q) = Data(q') \cup Data(SIV) \qquad (12) $$
$$ Data(q') \cap Data(SIV) = \emptyset \qquad (13) $$

Condition (12) means that all sensor data collected by q is collected by either q′ or SIV. Since q′ is constructed as AF(V)? Rules(q) ∧ ¬Rules(SIV), it is easy to see that all data in Data(q′) meet Rules(q) ∧ ¬Rules(SIV). Data in Data(SIV), on the other hand, satisfy Rules(SIV). Therefore, data in Data(q′) ∪ Data(SIV) meet the constraint (Rules(q) ∧ ¬Rules(SIV)) ∨ Rules(SIV), which is equivalent to Rules(q) ∨ Rules(SIV). It is straightforward to see that Rules(q) ∨ Rules(SIV) = Rules(q), since Rules(SIV) = Rules(q) ∧ · · · ∧ Rules(qj). Condition (13) requires that no duplicate data exist between Data(q′) and Data(SIV). Some aggregation functions, such as "Average", are duplicate sensitive: collecting a datum more than once yields an inaccurate result. This condition, therefore, ensures that aggregating the aggregated results for q′ and SIV always gives the same result as aggregating for q directly, for any aggregation function. The condition obviously holds, since Rules(q′) ∧ Rules(SIV) = Rules(q) ∧ ¬Rules(SIV) ∧ Rules(SIV) = ∅.
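Conditions (12) and (13) can be checked directly on a toy data set. A sketch using the fire-monitoring example from the introduction; the readings and the lambda-based rule encoding are illustrative assumptions:

```python
# Each reading is a (temperature, light) pair from one sensor.
readings = [(180, 22), (220, 28), (260, 27), (300, 33), (270, 40)]

q_rule   = lambda t, l: t > 200 and 20 <= l <= 30            # Rules(q)
siv_rule = lambda t, l: t > 250 and 25 <= l <= 30            # Rules(SIV)
qp_rule  = lambda t, l: q_rule(t, l) and not siv_rule(t, l)  # Rules(q')

data_q   = {r for r in readings if q_rule(*r)}
data_qp  = {r for r in readings if qp_rule(*r)}
data_siv = {r for r in readings if siv_rule(*r)}

# Condition (12): q's data is exactly the union of q' and the SIV.
assert data_q == data_qp | data_siv
# Condition (13): no reading is collected twice, so duplicate-sensitive
# aggregates such as Average remain correct.
assert data_qp & data_siv == set()
```

The check relies on Rules(SIV) implying Rules(q), which holds by construction since Rules(SIV) is a conjunction that includes Rules(q).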

4 Simulation Results

We developed a simulator in C++ and conducted a set of simulations to investigate the effectiveness of the proposed optimization technique. A total of 100 nodes is deployed in an area of 50m × 50m, following a uniform random distribution. For each given data attribute, a node uniformly generates a value within the range of the attribute. In the simulations, the number of saved data transmissions of the proposed multiple query processing is collected. The number of saved data transmissions between two queries, q1, q2, is defined as the number of nodes whose data satisfy both queries, i.e., the data size of the intermediate view for q1 and q2. The total number of saved data transmissions is then defined as the


summation of the data sizes of all intermediate views. In the simulation, each query consists of a random number of rules, between 1 and 3. The number of predicates in each rule is generated randomly, between 1 and the number of attributes being sensed at the sensors. Each predicate enforces a random range constraint over one of the sensed attributes. The queries run 500 times in each simulation, and during each run the data sensed at each sensor node is randomly changed within the predefined range. "SIMLow" and "SIMUp1" are the lower and upper bounds in inequality (9), respectively. "SIMUp2" is the upper bound in inequality (11). Figure 2 presents the number of saved data transmissions for a set of 20 queries, with different numbers of data attributes being sensed in the network. The results show that, in general, the number of saved data transmissions decreases as the number of attributes increases. This is due to the fact that the similarities among queries depend on the predicates defined over the data attributes in the queries: the more types of data a sensor senses, the more constraints a query can enforce, and the less similarity queries may share. We also investigate the effect of the number of queries on the performance of the proposed scheme. In this set of simulation studies, a varying number of queries,

Fig. 2. Saved number of data transmission for 20 queries in a network of 100 nodes

Fig. 3. Saved number of data transmission for 1 data attribute


Fig. 4. Saved number of data transmission for 2 data attributes

Fig. 5. Saved number of data transmission for 3 data attributes

Fig. 6. Saved number of data transmission for 4 data attributes

ranging from 10 to 50 in steps of 10, is processed in a network of sensors sensing 1 to 4 data attributes. Figures 3, 4, 5 and 6 present the number of saved data transmissions for a varying number of queries when 1, 2, 3 and 4 data attributes, respectively, are sensed. The results in Figures 3, 4, 5 and 6 indicate that the number of saved data transmissions increases as the number of queries increases. This is as expected, since the similarities among queries should increase when the number


of queries increases. None of the three proposed estimation techniques, namely "SIMLow", "SIMUp1" and "SIMUp2", outperforms the others in every scenario. When only one data attribute is sensed in the network, "SIMUp2" saves the largest number of data transmissions for multiple query processing. When more than one data attribute is sensed, "SIMUp1" tends to save more data transmissions than the other two techniques. These results suggest that when the similarity among queries is very high (i.e., the case of one attribute), the bound "SIMUp2" should be used in the estimation model of the proposed optimization scheme. In all other scenarios, "SIMUp1" should be used for multiple query optimization.

5 Conclusion and Future Work

In this paper, we consider the problem of optimizing the processing of multiple queries in sensor networks. The problem arises when multiple queries need to be processed at the base station and they share some level of similarity. We build a model to estimate the similarity among queries based on the constraints on attributes specified in these queries and on how the sensed data is distributed among sensor nodes. Given a set of queries, a shared intermediate view set (SIVS) is constructed. Each SIV in the SIVS captures the similarity among queries. The original set of queries is then rewritten into a different set of queries in such a way that the processing results from these queries and from the SIVS can be aggregated to provide the correct results for the original set of queries. The key idea is that the SIVS is processed only once, but the results for the SIVS returned to the base station can be reused multiple times. Through simulations, it is shown that the proposed method is simple yet effective. Currently, the model assumes independence of sensor readings. In the future, we plan to extend the estimation model to incorporate the correlation among sensor readings.

Acknowledgments

This work is supported by NSF awards 0325353, 0549119 and 0729456.

References

1. Madden, S., Franklin, M.J., Hellerstein, J.M., Hong, W.: TAG: a Tiny AGgregation Service for Ad-hoc Sensor Networks. In: SIGOPS Operating Systems Review. ACM Press, New York (2002)
2. Srivastava, U., Munagala, K., Widom, J.: Operator Placement for In-network Stream Query Processing. In: The Twenty-fourth ACM SIGMOD-SIGACT-SIGART Symposium on Principles of Database Systems (PODS), Baltimore, Maryland, USA (2005)


3. Mainwaring, A., Culler, D., Polastre, J., Szewczyk, R., Anderson, J.: Wireless Sensor Networks for Habitat Monitoring. In: Proceedings of the 1st ACM International Workshop on Wireless Sensor Networks and Applications (WSNA), Atlanta, Georgia, USA (2002)
4. Yao, Y., Gehrke, J.: The Cougar Approach to In-network Query Processing in Sensor Networks. In: SIGMOD Record. ACM, New York (2002)
5. Madden, S., Franklin, M.J., Hellerstein, J.M., Hong, W.: TinyDB: an Acquisitional Query Processing System for Sensor Networks. In: ACM Transactions on Database Systems. ACM, New York (2005)
6. Silberstein, A., Braynard, R., Yang, J.: Constraint Chaining: on Energy-efficient Continuous Monitoring in Sensor Networks. In: Proceedings of the 2006 ACM SIGMOD International Conference on Management of Data (SIGMOD), Chicago, IL, USA (2006)
7. Deshpande, A., Guestrin, C., Madden, S., Hellerstein, J., Hong, W.: Model-driven Data Acquisition in Sensor Networks. In: Proceedings of the Conference on Very Large Data Bases (VLDB), Toronto, Canada (2004)
8. Roy, P., Seshadri, S., Sudarshan, S., Bhobe, S.: Efficient and Extensible Algorithms for Multi Query Optimization. In: SIGMOD Record. ACM, New York (2000)
9. Trigoni, N., Yao, Y., Demers, A.J., Gehrke, J., Rajaraman, R.: Multi-query Optimization for Sensor Networks. In: Prasanna, V.K., Iyengar, S.S., Spirakis, P.G., Welsh, M. (eds.) DCOSS 2005. LNCS, vol. 3560, pp. 307–321. Springer, Heidelberg (2005)
10. Xiang, S., Lim, B.H., Tan, K.L.: Impact of Multi-query Optimization in Sensor Networks. In: Proceedings of the 3rd Workshop on Data Management for Sensor Networks (DMSN), Seoul, Korea (2006)
11. Xiang, S., Lim, B.H., Tan, K.L., Zhou, Y.L.: Two-Tier Multiple Query Optimization for Sensor Networks. In: Proceedings of the 27th International Conference on Distributed Computing Systems (ICDCS), Toronto, Canada (2007)
12. Silberstein, A., Yang, J.: Many-to-Many Aggregation for Sensor Networks. In: Proceedings of the 2007 IEEE International Conference on Data Engineering (ICDE), Istanbul, Turkey (2007)
13. Krishnamurthy, S., Wu, C., Franklin, M.J.: On-the-Fly Sharing for Streamed Aggregation. In: Proceedings of the 2006 ACM SIGMOD International Conference on Management of Data (SIGMOD), Chicago, IL, USA (2006)
14. Intanagonwiwat, C., Govindan, R., Estrin, D.: Directed Diffusion: A Scalable and Robust Communication Paradigm for Sensor Networks. In: Proceedings of the 6th Annual International Conference on Mobile Computing and Networking (MOBICOM), Boston, MA, USA (2000)

Finding Symbolic Bug Patterns in Sensor Networks

Mohammad Maifi Hasan Khan, Tarek Abdelzaher, Jiawei Han, and Hossein Ahmadi

Department of Computer Science
University of Illinois at Urbana-Champaign
201 North Goodwin, Urbana, Illinois, USA
[email protected], {zaher,hanj}@cs.uiuc.edu, [email protected]

Abstract. This paper presents a failure diagnosis algorithm for summarizing and generalizing patterns that lead to instances of anomalous behavior in sensor networks. Often, multiple seemingly different event patterns lead to the same type of failure manifestation. A hidden relationship among event attributes in those patterns is responsible for the failure. For example, in some systems, a message might always get corrupted if the sender is more than two hops away from the receiver (a distance relationship), irrespective of the senderId and receiverId. To uncover such failure-causing relationships, we present a new symbolic pattern extraction technique that identifies and symbolically expresses relationships correlated with anomalous behavior. Symbolic pattern extraction is a new concept in sensor network debugging, unique in its ability to generalize over patterns that involve different combinations of nodes or message exchanges by extracting their common relationship. As a proof of concept, we provide synthetic traffic scenarios in which applying symbolic pattern extraction uncovers complex bug patterns that are crucial to understanding the real causes of problems. We also use symbolic pattern extraction to diagnose a real bug and show that it generates far fewer and more accurate patterns compared to previous approaches.

Keywords: symbolic pattern, interactive bugs, wireless sensor network.

1 Introduction

Wireless sensor network applications typically implement distributed protocols in which multiple nodes communicate with each other to perform a collaborative task. Nodes often assume roles such as cluster heads, sensors, or forwarding nodes. Messages have types, usually defined by the respective applications. Such applications often fail because of unexpected sequences of (communication or other) events that are not handled properly by the protocol design, and/or because of implementation oversights that lead to a "bad" state and, eventually, to the failure. In this paper, we generalize from actual observed message exchanges to the underlying relationships, defined on nodes, roles, and message types, that lead to a failure. We call these symbolic bug patterns.

B. Krishnamachari et al. (Eds.): DCOSS 2009, LNCS 5516, pp. 131–144, 2009.
© Springer-Verlag Berlin Heidelberg 2009

Unfortunately, none of the existing debugging tools and techniques available for sensor networks is capable of troubleshooting symbolic bugs. State-of-the-art debugging techniques for sensor networks include passive diagnosis [10,16], declarative tracepoints for collecting runtime logs for offline analysis [4], discriminative sequence mining for offline analysis of runtime logs [8,9], real-time failure diagnosis [18], traditional breakpoint and watchpoint primitives for probing hardware at runtime [24,23], and more traditional techniques such as simulation [11,17,21], emulation [7], and testbeds [22,5]. Though source-code-level debugging tools [24,23] help identify single-node programming errors, they are not very effective at diagnosing interaction bugs. Dustminer [9] comes closest to our work. It uses frequent sequence mining techniques for bug diagnosis. Though it makes an effort toward using sequence mining to find interaction bugs, in this paper we show that analyzing sequences of events based on absolute event attribute values is not enough to diagnose such bugs. Worse, the resulting patterns can often be misleading and can confuse the application developer.

To diagnose complex bug patterns, we introduce the concept of symbolic patterns, which identify the "culprit" sequences of events responsible for failure by capturing the relationships among different event attributes. In the context of this paper, a symbolic pattern is a pattern in which all, or a subset, of the absolute values of event attributes are replaced with symbols to generalize the pattern.
To perform offline analysis using symbolic pattern extraction, different types of runtime events are logged during program execution, and offline analysis identifies the discriminative set of frequent symbolic patterns, which contains the "culprit" symbolic patterns that are highly correlated with failure. The rest of the paper is organized as follows. In Section 2, we introduce the model for symbolic patterns. In Section 3, we describe the state of the art in debugging tools and techniques developed specifically for sensor network applications and explain their limitations. In Section 4, we explain the mechanism behind symbolic pattern extraction and the pattern ranking scheme. We compare and evaluate the debugging capability of symbolic pattern extraction against prior related schemes [8,9] in Section 5, using a synthetic bug and a real bug. Finally, Section 6 concludes the paper.

2 A Model for Symbolic Patterns

The logged events in our system can include any operations performed at runtime, such as message transmission, message reception, and writing to flash storage. Each recorded event can have multiple attributes. For example, a message transmission event can have senderId, senderType, destinationId, and msgType as attributes. For the purposes of the discussion below, let us define an event to be the basic element in the event log that is analyzed for failure diagnosis. The
format of a logged event and the definition of sequences of events are similar to those defined in [9]: <EventType, attribute_1, attribute_2, ..., attribute_n>. For example, attribute_1 can be SenderId in the case of a messageSent event. The generated log can be thought of as a single sequence of events. For example, consider the following logged events in a sample log:

<msgSent, senderId = 1, msgType = 0, destinationId = 3>
<msgReceived, receiverId = 1, msgType = 1, senderId = 3>
<flashWriteInitiated, nodeId = 1, dataSize = 100>

The above log can be considered a single sequence of three events, each with multiple attributes. A frequent sequence mining algorithm [1] is used to extract frequent subsequences of events. Events in a subsequence do not have to be contiguous in the original sequence. We use the terms "frequent (sub)sequence of events" and "frequent pattern" interchangeably in this paper. A discriminative pattern between two logs is an ordered subsequence of events that occurs with different support in the two logs, where support refers to the number of times it occurs. The larger the difference in support, the better the discriminative power. Before we formally define a symbolic pattern, let us consider the following example to illustrate what it means. Say we have two patterns S1 and S2, where each pattern has two events with multiple attributes, as follows:

S1 = <msgSent, senderId = 1, msgType = 0> <msgReceived, receiverId = 2, msgType = 0>
S2 = <msgSent, senderId = 3, msgType = 0> <msgReceived, receiverId = 5, msgType = 0>

where node 1 is a neighbor of node 2 and node 3 is a neighbor of node 5. On the surface, patterns S1 and S2 are different. Now, if we parameterize the relationship that exists between senderId and receiverId and represent it using the symbol X for senderId, S1 and S2 can be represented as follows:

S1 = <msgSent, senderId = X, msgType = 0> <msgReceived, receiverId = neighbor(X), msgType = 0>
S2 = <msgSent, senderId = X, msgType = 0> <msgReceived, receiverId = neighbor(X), msgType = 0>

Interestingly, S1 and S2 now become the same pattern, which expresses a more general relationship. Note that if S1 and S2 each have support 1, the symbolic version has support 2; hence, symbolizing patterns increases the visibility of the pattern in the event log. More formally, in the context of this paper, symbolic pattern extraction is the task of identifying frequent patterns that satisfy certain relationships, specified by the user or selected from a library of common relationships, defined among event attributes of the same or different types of events (e.g., neighborhood relationships, identity relationships, and type relationships). These relationships are then represented using symbols instead of absolute values where appropriate. In this paper, we present an algorithm for symbolic pattern extraction that first generates frequent patterns using the Apriori algorithm [1]. Next, it generalizes the frequent patterns generated in the first stage by mining for "relationships" in those patterns. In this paper, we also present a hybrid scheme for counting the
support for individual patterns, which greatly enhances the chance of identifying "infrequent" events that are correlated with failure. Finally, we propose a pattern ranking scheme that exploits the characteristics of symbolic patterns and increases the usability of the tool. To analyze the performance of symbolic pattern extraction, we simulated several bugs in TOSSIM to generate log files and analyzed them using our new algorithm. We chose simulation to generate log files because it gives us the flexibility to experiment with bugs of arbitrary complexity. We compare the discriminative symbolic patterns generated by our symbolic pattern extraction algorithm with the discriminative patterns generated by the algorithm we presented earlier [9] and show that symbolic patterns greatly enhance the diagnostic capability and usability of the tool.
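To make the symbolization idea concrete, the substitution described in this section can be sketched in a few lines of Python. This is an illustration only, not the paper's implementation; the `neighbors` table and event tuples are hypothetical:

```python
from collections import Counter

# Hypothetical topology: node 2 is a neighbor of node 1, node 5 of node 3.
neighbors = {1: {2}, 3: {5}}

def symbolize(pattern):
    """Replace senderId with symbol X and receiverId with neighbor(X)
    whenever the receiver is a known neighbor of the sender."""
    (t1, a1), (t2, a2) = pattern
    if a2.get("receiverId") in neighbors.get(a1.get("senderId"), set()):
        a1 = dict(a1, senderId="X")
        a2 = dict(a2, receiverId="neighbor(X)")
    # Return hashable tuples so symbolic patterns can be counted.
    return ((t1, tuple(sorted(a1.items()))), (t2, tuple(sorted(a2.items()))))

s1 = (("msgSent", {"senderId": 1, "msgType": 0}),
      ("msgReceived", {"receiverId": 2, "msgType": 0}))
s2 = (("msgSent", {"senderId": 3, "msgType": 0}),
      ("msgReceived", {"receiverId": 5, "msgType": 0}))

# After symbolization the two concrete patterns collapse into one,
# so their supports add up: one symbolic pattern with support 2.
support = Counter(map(symbolize, [s1, s2]))
```

As in the S1/S2 example above, two patterns with support 1 each become a single symbolic pattern with support 2, increasing its visibility in the log.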

3 Related Work

State-of-the-art debugging techniques for wireless sensor networks include passive diagnosis [10,16], sequence mining for offline analysis of runtime logs [8,9], and real-time failure diagnosis [18]. To increase visibility inside the node, a recent effort [4] proposed the use of declarative tracepoints to collect runtime logs for offline analysis. To facilitate debugging of single-node programming bugs (e.g., bad pointer references, stack overflow), sophisticated tools such as Clairvoyant [24] and Marionette [23] have been developed; they provide standard debugging primitives such as breakpoints and watchpoints that enable stepping through execution on a line-by-line basis. Though these tools are very useful for debugging programming or localized errors, they are not very useful if the cause of the failure is distributed across multiple nodes. Moreover, stepping through code execution can cause timing-related bugs to disappear or can create new problems. At the other extreme of the spectrum lie debugging tools such as SNTS [10], which try to debug a deployed sensor network by analyzing passively recorded radio communication messages. Although passive diagnosis does not interfere with the operation of the network, its diagnostic capability is rather limited due to the unavailability of "critical" runtime information. Sympathy [18] performs real-time diagnosis using a classification tree approach and a minimal amount of runtime data collected at a central node; the diagnosis is based on reduced throughput in the network. Dustminer [9] uses on-chip flash to log runtime events and uses sequence mining for offline analysis to diagnose the cause of the problem. As each node records logs locally and does not upload data in real time, it does not compete for the radio and hence minimizes interference.

In [9], the authors identified several limitations of an existing sequence mining algorithm [1] and extended it to address those issues. None of the above techniques can find symbolic patterns automatically. SNMS [20] presents a more traditional sensor network management service that collects and summarizes different types of measurements, such as packet
loss and radio energy consumption. Although laboratory testbeds like Motelab [22], Kansei [5], and Emstar [7] provide the convenience of testing in a controlled environment, they do not provide hints to the developer if something goes "wrong" during testing (e.g., some random nodes stop after 2 hours). Other work [19] shows that erroneous sensor readings, such as temperature and humidity, can be used to predict network and node failures, but it does not answer the question "Why was the sensor reading bad in the first place?". Using machine learning techniques to diagnose failures is not new [3,2,6,12,14,15]. Discriminative frequent pattern analysis [6], software behavior graph analysis [14], a Bayesian analysis based approach [13], and control flow analysis to identify logic errors [15] are a few examples. These techniques do not focus on extracting symbolic relationships, however.

4 Overview

To answer the question "Why do we need discriminative symbolic pattern extraction to debug interaction bugs?", we provide an example in Section 4.1. We then present the symbolic pattern extraction algorithm in Section 4.2. We conclude the section by presenting a hybrid support count function and a pattern ranking scheme that have a significant impact on the quality of the generated patterns and the scalability of the algorithm.

4.1 Motivation for Using Symbolic Patterns for Debugging

Let us assume that, in a particular application, each neighbor of node A periodically communicates with node A and is always expected to send messages of types 0, 1, and 2 in a fixed order, where msgType 0 is followed by msgType 1 and msgType 2, respectively. Also assume that any violation of this message order from a specific sender crashes the system. Now, let us log a few examples of correct execution (Good Log) and of execution that leads to a manifestation of the error (Bad Log). Consider the log file presented in Table 1, collected from node 1 and node 7, where node 1 did not crash and node 7 crashed. Note that node 7 crashed because node 8 sent messages violating the required sequence of message types. If we generate the patterns correlated with failure, a state-of-the-art algorithm would come up with the following: pattern seq1 with support 2 and pattern seq2 with support 1, along with other frequent patterns.

seq1 = <msgReceived, msgType = 0> <msgReceived, msgType = 1> <msgReceived, msgType = 2>
seq2 = <msgReceived, msgType = 2> <msgReceived, msgType = 0> <msgReceived, msgType = 1>

Table 1. Sample Log File

Good Log (Node 1):
1. <msgReceived, receiverId = 1, senderId = 3, msgType = 0>
2. <msgReceived, receiverId = 1, senderId = 3, msgType = 1>
3. <msgReceived, receiverId = 1, senderId = 3, msgType = 2>
4. <msgReceived, receiverId = 1, senderId = 2, msgType = 0>
5. <msgReceived, receiverId = 1, senderId = 2, msgType = 1>
6. <msgReceived, receiverId = 1, senderId = 2, msgType = 2>

Bad Log (Node 7):
1. <msgReceived, receiverId = 7, senderId = 6, msgType = 0>
2. <msgReceived, receiverId = 7, senderId = 6, msgType = 1>
3. <msgReceived, receiverId = 7, senderId = 8, msgType = 2>
4. <msgReceived, receiverId = 7, senderId = 8, msgType = 0>
5. <msgReceived, receiverId = 7, senderId = 8, msgType = 1>
6. <msgReceived, receiverId = 7, senderId = 6, msgType = 2>
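The mining on logs like those in Table 1 can be illustrated with a minimal sketch. This is a simplification with hypothetical data structures, not the paper's tool: it generalizes senderId to a symbol and collects msgType orderings received from a single sender, then differences the two logs:

```python
from itertools import combinations

# Receive events at one node as (senderId, msgType) pairs, in arrival order
# (a flattened stand-in for the Table 1 logs).
good = [(3, 0), (3, 1), (3, 2), (2, 0), (2, 1), (2, 2)]   # node 1, no crash
bad  = [(6, 0), (6, 1), (8, 2), (8, 0), (8, 1), (6, 2)]   # node 7, crashed

def same_sender_triples(log):
    """Length-3 symbolic patterns: msgType orderings received from a single
    (symbolized) sender, taken as ordered subsequences of the log."""
    pats = set()
    for i, j, k in combinations(range(len(log)), 3):
        (s1, m1), (s2, m2), (s3, m3) = log[i], log[j], log[k]
        if s1 == s2 == s3:          # senderId generalizes to the symbol Y
            pats.add((m1, m2, m3))
    return pats

# Symbolic patterns present only in the bad log: the "culprit" ordering
# (2, 0, 1) from sender 8, which violates the required 0, 1, 2 order.
culprits = same_sender_triples(bad) - same_sender_triples(good)
```

Restricting the triples to a single sender is exactly what the symbolic constraint senderId = Y expresses; without it, the out-of-order triple also appears in the good log (across different senders) and is no longer discriminative.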

If we inspect the logged events in Table 1 carefully, we can see that there is also a pattern associated with senderId: senderId is the same within each occurrence of seq1. For the first and second occurrences of seq1, senderId is 3 and 2, respectively. This pattern is missed due to differing support. For example, the logged event <msgReceived, receiverId = 1, senderId = 3, msgType = 0> has support 1, while <msgReceived, receiverId = 1, msgType = 0> has support 3. On the other hand, if we parameterize the values of receiverId and senderId and replace the identical values with symbols, where receiverId is replaced with X and senderId with Y, the following pattern will be identified in the Good Log with support 2, where X = 1 and Y = 3 for the first occurrence and X = 1 and Y = 2 for the second:

<msgReceived, receiverId = X, senderId = Y, msgType = 0>
<msgReceived, receiverId = X, senderId = Y, msgType = 1>
<msgReceived, receiverId = X, senderId = Y, msgType = 2>

Similarly, if we extract symbolic patterns, we are able to identify that the following pattern seq3 occurs only in the Bad Log but not in the Good Log:

<msgReceived, receiverId = X, senderId = Y, msgType = 2>
<msgReceived, receiverId = X, senderId = Y, msgType = 0>
<msgReceived, receiverId = X, senderId = Y, msgType = 1>

Without symbolic pattern extraction, there is no way to identify seq3. A more detailed description of the symbolic pattern extraction algorithm is presented in Section 4.2.

4.2 Symbolic Pattern Extraction Algorithm

Symbolic pattern extraction is a two-step process.

– During the first stage, multi-attribute events are converted into single-attribute events to reduce the computational complexity. Frequent patterns of single-attribute events are generated using an existing sequence mining algorithm [1]. Let us call this set of frequent patterns the base frequent set.
– At the second stage, candidate symbolic patterns are generated from this base frequent set. If the symbolic pattern s_i, generated from the base pattern p_i with support sup_pi, has support sup_si and (sup_si / sup_pi) > δ, then p_i is replaced by s_i. Here δ is the equivalence threshold, which is set by the user. If δ is set to 0, all symbolic patterns are retained; if it is set to 1, only symbolic patterns with the exact same support as the base pattern are retained. The generation of candidate symbolic patterns is described below.

Generation of Candidate Symbolic Patterns. To explain the generation of candidate symbolic patterns, without loss of generality, let us assume that Seq_a is a frequent base pattern of three events, where each event is of a different type and a single attribute is included from each event type:

Seq_a = (<Ex, attr2 = vi>, <Ey, attr2 = vj>, <Ez, attr3 = vk>)

Say event <Ex> originally has 3 attributes and Seq_a includes only the second attribute of <Ex>. Similarly, assume <Ey> and <Ez> originally have 2 and 3 attributes, respectively. Next, the algorithm reconstructs the equivalent, complete pattern in which each event has all of its attributes. The equivalent pattern generated from Seq_a would look as follows:

(<Ex, attr1 = *>, <Ex, attr2 = vi>, <Ex, attr3 = *>)
(<Ey, attr1 = *>, <Ey, attr2 = vj>)
(<Ez, attr1 = *>, <Ez, attr2 = *>, <Ez, attr3 = vk>)

Here "*" is used for the attributes that are not included in the original pattern; the "*" attributes are "don't care". Next, the algorithm replaces a subset of the "*" attributes with symbols and mines for relationships among those symbolic attributes. The symbolic pattern replaces Seq_a if the support of the symbolic pattern in the original log is "similar" to the support of Seq_a.
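The expansion step described above can be sketched in Python. The event arities mirror the Seq_a example; the representation is an illustrative assumption, not the paper's code:

```python
# Sketch of candidate expansion: a mined base pattern keeps one attribute
# per event; the remaining attribute slots are filled with "*" ("don't
# care") before relationship conditions are tested. Arities are the
# hypothetical ones from the Seq_a example.
ARITY = {"Ex": 3, "Ey": 2, "Ez": 3}   # number of attributes per event type

def expand(base):
    """base: list of (event_type, attr_index, value), 0-based indices.
    Returns the complete pattern with '*' in the unconstrained slots."""
    full = []
    for etype, idx, value in base:
        attrs = ["*"] * ARITY[etype]
        attrs[idx] = value
        full.append((etype, tuple(attrs)))
    return full

# Seq_a = (<Ex, attr2 = vi>, <Ey, attr2 = vj>, <Ez, attr3 = vk>)
seq_a = [("Ex", 1, "vi"), ("Ey", 1, "vj"), ("Ez", 2, "vk")]
pattern = expand(seq_a)
```

The "*" slots produced here are exactly the positions a later stage may replace with symbols before testing relationship conditions.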

4.3 Challenges

Meaningful Condition Identification. "Which subset of "*" attributes should be replaced with symbols, and what "relationships" should be tested?" is one of the key questions in finding meaningful symbolic bug patterns. We need to decide this intelligently to avoid useless checks, such as checking whether nodeId and timeStamp are equal in a particular event. Our goal is to automate the process as much as possible. To reduce user involvement, we provide a list of predefined conditions that are especially applicable to wireless sensor network applications. For example, common attributes in wireless sensor network applications are node ids, message types, sensor data types, timestamps, etc. We tried to come up with the basic conditions that need to be checked; for example, checking whether the "neighbor" condition holds between senderId and receiverId makes sense. The user needs to specify the type of each attribute in a header file. For example, if the type of the ith attribute of event Ex is "nodeId", the user may specify that information as (<Ex, attr_i, type : nodeId>), from which the tool automatically determines the set of applicable conditions for this attribute. From that information, combinations of conditions of arbitrary complexity, such as "Is the senderId always the same as the receiverId?" or "Does the msgType have to be
X and does the sender have to be the immediate neighbor to crash the receiver?", and so on, can be generated automatically. We realize that there may be conditions we do not provide. If the user wants to check for a condition that is not provided by a library function, he/she may implement the desired condition and add it to our library. A user may specify such a condition as follows: (<Ex, attr_i>, <Ey, attr_j>, Condition_q), where Condition_q is defined and implemented by the user for his/her specific application. Pseudocode of the algorithm is given in Table 2.

Table 2. Symbolic Pattern Extraction Algorithm

Algorithm: Symbolic Pattern Extraction
Input: set of good logs (GL), set of bad logs (BL), similarity measure (δ)
Output: set of discriminative symbolic patterns
1. PatternSetA = GenerateFrequentPatterns(GL)
2. SymbolicPatternSetA = ExtractSymbolicPattern(PatternSetA, GL, δ)
3. PatternSetB = GenerateFrequentPatterns(BL)
4. SymbolicPatternSetB = ExtractSymbolicPattern(PatternSetB, BL, δ)
5. DiscriminativePatternSet = DiffMine(SymbolicPatternSetA, SymbolicPatternSetB)
6. output DiscriminativePatternSet

Function: ExtractSymbolicPattern
Input: set of frequent patterns (FP), set of logs (L), similarity measure (δ)
Output: set of symbolic patterns (SP)
1. SP = ∅  /* set of symbolic patterns */
2. for each pattern p in FP
   2.1 for each check condition c
       2.1.1 CSP = GenerateCandidateSymbolicPattern(p, c)
       2.1.2 if support(CSP)/support(p) > δ then SP = SP ∪ CSP
3. return SP

Scalability. One of the problems with symbolic pattern mining is that the number of combinations of conditions to check is exponential. For example, consider the following symbolic candidate pattern:

(<Ex, attr1 = *>, <Ex, attr2 = vi>, <Ex, attr3 = *>)
(<Ey, attr1 = *>, <Ey, attr2 = vj>)
(<Ez, attr1 = *>, <Ez, attr2 = *>, <Ez, attr3 = vk>)

Now assume that the applicable set of conditions that need to be checked for this pattern is:

c1: (<Ex, attr1>, <Ey, attr1>, IdentityCondition)
c2: (<Ex, attr1>, <Ez, attr1>, IdentityCondition)
c3: (<Ey, attr1>, <Ez, attr1>, IdentityCondition)
c4: (<Ex, attr2>, <Ez, attr3>, LessThanCondition)

The number of possible combinations of conditions is 2^NoOfApplicableConditions − 1, where in the above example NoOfApplicableConditions = 4.
To reduce the number of combinations to check, we apply a heuristic based on the apriori property. Informally, the apriori property states that for a combination of n conditions to be satisfied, every subset of those n conditions must also be satisfied. To exploit this property, we first check single conditions to reduce the number of applicable conditions. For example, if c1 is not satisfied, we do not need to check any combination that includes c1. Next, we check combinations of increasing length. For example, assume conditions c2, c3, and c4 are satisfied. We then check which of the combinations (c2, c3), (c2, c4), and (c3, c4) are satisfied. If all of the length-2 combinations are satisfied, we check whether (c2, c3, c4) is satisfied.

Symbolic Pattern Ranking. The discriminative pattern extraction algorithm often returns patterns with the same or very similar support. In the case of non-symbolic patterns, there is no clear way to decide which patterns should be ranked as more important. Fortunately, in the case of symbolic patterns, we have a convenient way to rank the patterns. We applied a simple scheme that gives more importance to patterns that are more specific. To do so, we simply count the number of "*" symbols in a symbolic pattern. The higher the number of "*" in a pattern, the lower its rank, as such a pattern is likely to be a self-evident generality that does not carry much information. The rationale is that "*" means "don't care": patterns with more "*" are more likely to have higher support but represent a weaker concept. In contrast, patterns with fewer "*" convey more information and should be ranked higher.
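The level-wise pruning just described can be sketched as follows. The `holds` predicate is a hypothetical stand-in for testing a condition set against the log; the condition names mirror the c1–c4 example:

```python
from itertools import combinations

def satisfied_combinations(conditions, holds):
    """Level-wise search over condition combinations using the apriori
    property: a candidate set is tested only if all of its subsets one
    element smaller already held."""
    singles = {frozenset([c]) for c in conditions if holds(frozenset([c]))}
    all_sat, level = set(singles), singles
    while level:
        next_level = set()
        for combo in level:
            for single in singles:           # grow by one satisfied condition
                cand = combo | single
                if (len(cand) == len(combo) + 1
                        and all(frozenset(sub) in all_sat
                                for sub in combinations(cand, len(combo)))
                        and holds(cand)):
                    next_level.add(cand)
        all_sat |= next_level
        level = next_level
    return all_sat

# Toy predicate matching the example in the text: c1 fails on its own,
# so no combination containing c1 is ever tested; c2, c3, c4 and all
# their combinations hold.
result = satisfied_combinations(
    ["c1", "c2", "c3", "c4"], holds=lambda s: "c1" not in s)
```

With c1 pruned at the first level, only 7 of the 15 possible combinations are ever accepted, and none that include c1 are checked beyond the singleton.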

4.4 Hybrid Support Count Function

One of the inherent problems with any discriminative pattern extraction algorithm is that the number of patterns generated as discriminative patterns is overwhelmingly large; it can be on the order of thousands. This makes it "easy" to miss the "culprit" pattern, which may end up deep down the list of discriminative patterns. Since at each stage i the candidate set is generated by concatenating the frequent patterns generated at stage (i − 1) with each of the unique events in the log file, for 100 unique events (the alphabet equivalent of the English language) in a log file, the number of candidate patterns of length 3 is 1,000,000, and so on. To avoid losing crucial events, we have to set the minimum support threshold to 1 (e.g., a single node reboot event may cause a large number of message losses, and setting the minimum support threshold larger than 1 would discard the "reboot" event). To address this challenge, in [9] we proposed a two-stage approach that first identifies symptoms (e.g., message loss) with a high minimum support threshold and later tries to identify the cause of failure (e.g., reboot) with lower support. Although this addresses the scalability issue to some extent, the scheme still fails to return the "culprit" sequence at the top of the list if the cause of failure is infrequent (e.g., a large number of messages lost due to a single node reboot). The cause of the problem lies in the way support for an event is calculated in frequent sequence mining algorithms in the data mining domain, which is ill-suited
for debugging purposes. The reason is that if we have N log files and an event X occurs 1000 times in one file and not at all in any of the other files, X will still be considered a frequent event with support 1000. For debugging purposes, this is "wrong": the reasoning behind using discriminative pattern extraction for debugging relies on the assumption that an event correlated with failure should "exist" in at least a majority of the "bad" log files. Event X violates this assumption and is not a frequent event from a debugging perspective. To address this problem, we implemented a support count function that counts the frequency of patterns not only within a single log file but also across multiple log files, and uses both estimates to generate support for single-attribute events. For example, according to our scheme, if an event X "exists" in only one of the N log files, the across-file support for X is 1, irrespective of how many times it occurred in that single file. Conversely, though a "reboot" event has low support within a single file, it has high support across files ("reboot" exists in all the files for the runs that crashed). Using this observation, we discard events from the base set whose across-file support is lower than a threshold θ set by the user (i.e., θ = 0.6 implies that for an event to be frequent, it has to "exist" in at least 60% of the files). We thus have two sets of frequent events (i.e., alphabet sets), one for the set of good logs and one for the set of bad logs, which are subsequently used to generate longer patterns. This reduces the execution time significantly and helps rank the patterns that are more correlated with failure higher than other, less correlated patterns.
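A toy version of this hybrid filtering might look like the following. The function name, θ handling, and log representation are illustrative assumptions, not the paper's code:

```python
def hybrid_filter(logs, theta):
    """Keep events whose across-file support (fraction of logs containing
    the event at least once) is >= theta; return their total within-file
    counts. A toy model of the hybrid support count described above."""
    n = len(logs)
    events = {e for log in logs for e in log}
    kept = {}
    for e in events:
        across = sum(e in log for log in logs) / n   # fraction of files
        if across >= theta:
            kept[e] = sum(log.count(e) for log in logs)
    return kept

# 'X' occurs 1000 times but in only one of four logs -> discarded at
# theta = 0.6; 'reboot' occurs once per log -> kept with total count 4.
logs = [["X"] * 1000 + ["reboot"], ["reboot"], ["reboot", "msgLost"], ["reboot"]]
kept = hybrid_filter(logs, theta=0.6)
```

This reproduces the intent of the text: a locally very frequent but isolated event is pruned from the alphabet, while a rare-per-file but pervasive event like "reboot" survives.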

4.5 Collection of Logs

To use the tool, one must collect runtime logs from the application nodes. As long as the runtime logs follow the format specification required by the data analysis backend, the source of the logs does not matter. Logs can be collected from simulation, emulation, or real hardware. For example, a user of TOSSIM can log any event inside the application using TOSSIM's dbg statement, as in dbg("Channel", "%d:%d:%d:%d...", NodeId, EventId, attr1, attr2, ...), as described in [8]. The user can also use the data collection front end designed for real hardware described in [9] to collect runtime logs from a real deployment, or choose to build his/her own data collection front end.

5 Evaluation

To evaluate the diagnostic capability of discriminative symbolic pattern analysis, we used TOSSIM in TinyOS 2.0 to simulate the nesC code, using a synthesized bug to create the sample log files. We simulated a network of 25 nodes placed on a grid topology with 5 rows and 5 columns. For the simulated bug, we compare the generated symbolic patterns with the non-symbolic patterns generated by the algorithm presented in [9]. We chose to compare our results with [9] because it is the work most closely related to ours that uses discriminative patterns to diagnose bugs.

5.1 Synthesized Bug

In this section, we use a synthesized bug to illustrate the strength of symbolic discriminative pattern extraction for debugging.

– Failure scenario I: out-of-order events, deterministic failure:

Let us assume that, in a particular application, each neighbor of node A periodically communicates with node A and is always expected to send messages of types 0, 1, and 2 in a fixed order, where msgType 0 is followed by msgType 1 and msgType 2, respectively. Also assume that message reception in reverse order from a specific sender crashes the system. The discriminative pattern sets returned by both algorithms from simulated logs for this failure scenario are given in Table 3. The first discriminative symbolic pattern captured the bug perfectly: it expressed the fact that if a particular receiver (X1) receives messages from a particular sender (X2) in reverse order of msgType, where msgType 2 is followed by msgType 1 and msgType 0, respectively, there is a problem. Not surprisingly, the algorithm borrowed from [9] generated completely misleading patterns with the highest support. Though it returned the pattern (<msgReceived : (msgType : 2)>, <msgReceived : (msgType : 1)>, <msgReceived : (msgType : 0)>) at the very end of the list, it failed to identify the crucial condition that this sequence causes a problem only if the messages are received from the same sender.

Table 3. Top Patterns for Failure Scenario I

Patterns generated by [9]:
1. <msgSent : (MsgType : 2)>, <msgSent : (MsgType : 0)>, <msgSent : (SenderType : 0)>
2. <msgReceived : (MsgType : 1)>, <msgReceived : (SenderType : 0)>, <msgSent : (MsgType : 2)>

Patterns generated by symbolic pattern extraction:
1. <msgReceived : (ReceiverId : X1), (SenderId : X2), (SenderType : *), (MsgType : 2)>
   <msgReceived : (ReceiverId : X1), (SenderId : X2), (SenderType : *), (MsgType : 1)>
   <msgReceived : (ReceiverId : X1), (SenderId : X2), (SenderType : *), (MsgType : 0)>

5.2 A Real Bug: Directed Diffusion Protocol Bug

We used the bug reported in [8], where a node experiences a large number of message losses after it is rebooted in the directed diffusion protocol. For a detailed description of the bug, interested readers are referred to [8]. Briefly, in the directed diffusion protocol, each node maintains an interest cache to keep track of which way to forward a data packet. There can be multiple paths from a single data source to the destination node. If no interest cache entry matches a received packet's interest description, the receiver node silently discards the packet, assuming it is not on the forwarding path. The problem is that if a node gets rebooted for some reason, the reboot wipes out the interest cache completely and causes a large number of consecutive message losses. The problem manifests only if there is a single path from the source node to the destination node. This bug is particularly interesting because in [8] the reported discriminative patterns showed the

142

M.M.H. Khan et al.

manifestation of the problem rather than showing that the “Reboot” event is the one actually causing it. For the log generated for this bug, the discriminative pattern set returned by the symbolic pattern extraction algorithm is given in Table 4. The symbolic patterns identified the real cause of the problem and correctly correlated the cause of the failure with its manifestation: they clearly show that the “Boot” event is followed by the interest-cache-empty event, and that messages are dropped due to no matching interest cache entry.

Table 4. Top Patterns for Directed Diffusion Protocol Bug

Patterns reported in [8]:
1. <interestCacheEmpty : NodeId : 3>, <dataCacheEmpty : NodeId : 3>, <dataMsgSent : TimeStamp : 20>
2. <interestCacheEmpty : NodeId : 3>, <dataCacheEmpty : NodeId : 3>, <dataMsgSent : NodeId : 4>
3. <interestCacheEmpty : NodeId : 3>, <dataCacheEmpty : NodeId : 3>, <dataMsgSent : msgType : 5>

Patterns generated by Symbolic Pattern Extraction:
1. <BOOT EVENT : (NodeId : X1)>, <interestCacheEmpty : (NodeId : X1)>, <dataCacheEmpty : (NodeId : X1)>
2. <BOOT EVENT : (NodeId : X1)>, <msgDropped : (NodeId : X1), (ReasonToDrop : dataWithNoMatchingInterest), (TimeStamp : *)>, <interestCacheEmpty : (NodeId : X1)>
3. <BOOT EVENT : (NodeId : X1)>, <msgDropped : (NodeId : X1), (ReasonToDrop : dataWithNoMatchingInterest), (TimeStamp : *)>, <dataCacheEmpty : (NodeId : X1)>
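Checking a symbolic pattern such as pattern 1 above against a log requires binding each variable consistently across events. A minimal matcher sketch, assuming the same hypothetical (event type, attributes) encoding as before and not reflecting the authors' actual code:

```python
# Hypothetical matcher: does a symbolic pattern (with variables like "X1"
# and wildcards "*") occur as a subsequence of a concrete event log?

def matches(pattern, log):
    def unify(p_attrs, e_attrs, bindings):
        """Return extended bindings if the pattern attributes fit this
        event, or None on any mismatch or inconsistent variable binding."""
        b = dict(bindings)
        for name, pv in p_attrs.items():
            if name not in e_attrs:
                return None
            ev = e_attrs[name]
            if pv == "*":                      # wildcard matches anything
                continue
            if isinstance(pv, str) and pv.startswith("X"):
                if b.get(pv, ev) != ev:        # variable already bound elsewhere
                    return None
                b[pv] = ev
            elif pv != ev:                     # concrete value must match exactly
                return None
        return b

    def search(pi, li, bindings):
        if pi == len(pattern):
            return True
        p_type, p_attrs = pattern[pi]
        for j in range(li, len(log)):
            e_type, e_attrs = log[j]
            if p_type == e_type:
                b = unify(p_attrs, e_attrs, bindings)
                if b is not None and search(pi + 1, j + 1, b):
                    return True
        return False

    return search(0, 0, {})

pattern = [
    ("BOOT EVENT", {"NodeId": "X1"}),
    ("interestCacheEmpty", {"NodeId": "X1"}),
]
log = [
    ("BOOT EVENT", {"NodeId": 3}),
    ("dataMsgSent", {"NodeId": 4}),
    ("interestCacheEmpty", {"NodeId": 3}),
]
print(matches(pattern, log))  # -> True: X1 binds consistently to node 3
```

Note that a log where the rebooted node and the cache-empty node differ would not match, which is exactly the "same node" constraint the symbolic patterns capture.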

5.3 Performance Comparison

For the log files collected for the directed diffusion protocol bug, we used three good logs and three bad logs and analyzed them with the symbolic pattern extraction algorithm to generate patterns. Symbolic pattern extraction took less than 1 hour and returned 188 symbolic patterns of length 3. In comparison, when we applied the algorithm from [8], it took more than 3 hours and returned several thousand patterns. This is because, in our approach, we were able to discard many unimportant events that had low support across multiple log files, which reduced the number of base events used to generate longer patterns.
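The pruning described above follows the Apriori property: a base event that is infrequent across the log files cannot appear in a frequent longer pattern, so it can be discarded before longer patterns are formed. A small sketch with hypothetical event names:

```python
from collections import Counter

def frequent_base_events(logs, min_support):
    """Keep only base events appearing in at least `min_support` logs.
    By the Apriori property, an infrequent event cannot be part of a
    frequent longer pattern, so it is safe to discard it early."""
    support = Counter()
    for log in logs:
        for event in set(log):   # count each event at most once per log
            support[event] += 1
    return {e for e, c in support.items() if c >= min_support}

# Hypothetical bad logs; only events present in all three survive.
bad_logs = [
    ["boot", "cacheEmpty", "msgDropped", "radioNoise"],
    ["boot", "cacheEmpty", "msgDropped"],
    ["boot", "cacheEmpty", "msgDropped", "tempReading"],
]
print(sorted(frequent_base_events(bad_logs, min_support=3)))
# -> ['boot', 'cacheEmpty', 'msgDropped']
```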

6 Conclusion

This paper introduces the concept of discriminative symbolic pattern extraction, which is new to both the wireless sensor network and data mining domains. We demonstrated the power of symbolic patterns using several bug scenarios. From a comparison of the patterns reported by [8] and [9] with patterns

Finding Symbolic Bug Patterns in Sensor Networks

143

generated using discriminative symbolic pattern extraction, the strength of the symbolic approach for debugging is clear. The new algorithm is better in terms of pattern expressiveness, concept generalization, and discovery of hidden patterns, some of which are much harder to notice using traditional pattern mining algorithms that mine based on absolute attribute values rather than abstract symbols. Discriminative symbolic pattern extraction adds an invaluable technique to the arsenal of debugging tools available in the wireless sensor network domain.

Acknowledgments. This work was funded in part by NSF grants CNS 0626342, CNS 05-53420, and CNS 05-54759.

References

1. Agrawal, R., Srikant, R.: Fast algorithms for mining association rules. In: Proceedings of the Twentieth International Conference on Very Large Data Bases (VLDB 1994), pp. 487–499 (1994)
2. Aguilera, M.K., Mogul, J.C., Wiener, J.L., Reynolds, P., Muthitacharoen, A.: Performance debugging for distributed systems of black boxes. In: Proceedings of the Nineteenth ACM Symposium on Operating Systems Principles (SOSP 2003), Bolton Landing, NY, USA, pp. 74–89 (2003)
3. Bodík, P., Friedman, G., Biewald, L., Levine, H., Candea, G., Patel, K., Tolle, G., Hui, J., Fox, A., Jordan, M.I., Patterson, D.: Combining visualization and statistical analysis to improve operator confidence and efficiency for failure detection and localization. In: Proceedings of the 2nd International Conference on Autonomic Computing (ICAC 2005) (2005)
4. Cao, Q., Abdelzaher, T., Stankovic, J., Whitehouse, K., Luo, L.: Declarative tracepoints: A programmable and application independent debugging system for wireless sensor networks. In: Proceedings of the 6th ACM Conference on Embedded Networked Sensor Systems (SenSys), Raleigh, NC, USA (2008)
5. Ertin, E., Arora, A., Ramnath, R., Nesterenko, M.: Kansei: A testbed for sensing at scale. In: Proceedings of the 4th Symposium on Information Processing in Sensor Networks (IPSN/SPOTS track) (2006)
6. Fatta, G.D., Leue, S., Stegantova, E.: Discriminative pattern mining in software fault detection. In: Proceedings of the 3rd International Workshop on Software Quality Assurance (SOQUA 2006), pp. 62–69 (2006)
7. Girod, L., Elson, J., Cerpa, A., Stathopoulos, T., Ramanathan, N., Estrin, D.: EmStar: A software environment for developing and deploying wireless sensor networks. In: Proceedings of the Annual Conference on USENIX Annual Technical Conference (ATEC 2004), Boston, MA, p. 24 (2004)
8. Khan, M.M.H., Abdelzaher, T., Gupta, K.K.: Towards diagnostic simulation in sensor networks. In: Proceedings of the International Conference on Distributed Computing in Sensor Systems (DCOSS), Greece (2008)
9. Khan, M.M.H., Le, H.K., Ahmadi, H., Abdelzaher, T.F., Han, J.: DustMiner: Troubleshooting interactive complexity bugs in sensor networks. In: Proceedings of the 6th ACM Conference on Embedded Networked Sensor Systems (SenSys), Raleigh, NC, USA (2008)


10. Khan, M.M.H., Luo, L., Huang, C., Abdelzaher, T.: SNTS: Sensor network troubleshooting suite. In: Aspnes, J., Scheideler, C., Arora, A., Madden, S. (eds.) DCOSS 2007. LNCS, vol. 4549, pp. 142–157. Springer, Heidelberg (2007)
11. Levis, P., Lee, N., Welsh, M., Culler, D.: TOSSIM: Accurate and scalable simulation of entire TinyOS applications. In: Proceedings of the 1st International Conference on Embedded Networked Sensor Systems (SenSys 2003), Los Angeles, CA, USA, pp. 126–137 (2003)
12. Liu, C., Fei, L., Yan, X., Han, J., Midkiff, S.P.: Statistical debugging: A hypothesis testing-based approach. IEEE Transactions on Software Engineering
13. Liu, C., Lian, Z., Han, J.: How Bayesians debug. In: Proceedings of the Sixth International Conference on Data Mining (ICDM 2006) (December 2006)
14. Liu, C., Yan, X., Fei, L., Han, J., Midkiff, S.P.: SOBER: Statistical model-based bug localization. In: Proceedings of the 13th ACM SIGSOFT International Symposium on Foundations of Software Engineering (FSE 13), Lisbon, Portugal (2005)
15. Liu, C., Yan, X., Han, J.: Mining control flow abnormality for logic error isolation. In: Proceedings of the 2006 SIAM International Conference on Data Mining (SDM 2006), Bethesda, MD (April 2006)
16. Liu, K., Li, M., Liu, Y., Li, M., Guo, Z., Hong, F.: PAD: Passive diagnosis for wireless sensor networks. In: Proceedings of the 6th ACM Conference on Embedded Networked Sensor Systems (SenSys), Raleigh, NC, USA (2008)
17. Polley, J., Blazakis, D., McGee, J., Rusk, D., Baras, J.S.: ATEMU: A fine-grained sensor network simulator. In: Proceedings of the First International Conference on Sensor and Ad Hoc Communications and Networks (SECON 2004), Santa Clara, CA, pp. 145–152 (October 2004)
18. Ramanathan, N., Chang, K., Kapur, R., Girod, L., Kohler, E., Estrin, D.: Sympathy for the sensor network debugger. In: Proceedings of the 3rd International Conference on Embedded Networked Sensor Systems (SenSys 2005) (2005)
19. Szewczyk, R., Polastre, J., Mainwaring, A., Culler, D.: Lessons from a sensor network expedition. In: Karl, H., Wolisz, A., Willig, A. (eds.) EWSN 2004. LNCS, vol. 2920, pp. 307–322. Springer, Heidelberg (2004)
20. Tolle, G., Culler, D.: Design of an application-cooperative management system for wireless sensor networks. In: Proceedings of the Second European Workshop on Wireless Sensor Networks (EWSN 2005), Turkey, pp. 121–132 (February 2005)
21. Wen, Y., Wolski, R., Moore, G.: DiSenS: Scalable distributed sensor network simulation. In: Proceedings of the 12th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPoPP 2007), San Jose, CA, USA, pp. 24–34 (2007)
22. Werner-Allen, G., Swieskowski, P., Welsh, M.: MoteLab: A wireless sensor network testbed. In: Proceedings of the Fourth International Conference on Information Processing in Sensor Networks (IPSN 2005), Special Track on Platform Tools and Design Methods for Network Embedded Sensors (SPOTS) (April 2005)
23. Whitehouse, K., Tolle, G., Taneja, J., Sharp, C., Kim, S., Jeong, J., Hui, J., Dutta, P., Culler, D.: Marionette: Using RPC for interactive development and debugging of wireless embedded networks. In: Proceedings of the Fifth International Conference on Information Processing in Sensor Networks: Special Track on Sensor Platform, Tools, and Design Methods for Network Embedded Systems (IPSN/SPOTS), Nashville, TN, pp. 416–423 (April 2006)
24. Yang, J., Soffa, M.L., Selavo, L., Whitehouse, K.: Clairvoyant: A comprehensive source-level debugger for wireless sensor networks. In: Proceedings of the 5th International Conference on Embedded Networked Sensor Systems (SenSys 2007), pp. 189–203 (2007)

Distributed Continuous Action Recognition Using a Hidden Markov Model in Body Sensor Networks

Eric Guenterberg, Hassan Ghasemzadeh, Vitali Loseu, and Roozbeh Jafari

Embedded Systems and Signal Processing Lab
Department of Electrical Engineering
University of Texas at Dallas, Dallas, TX 75080
{etg062000,h.ghasemzadeh,vitali.loseu,rjafari}@utdallas.edu

Abstract. One important application of Body Sensor Networks is action recognition. Action recognition often implicitly requires partitioning the sensor data into intervals, then labeling each partition according to the action it represents or marking it as a non-action. The temporal partitioning stage is called segmentation and the labeling is called classification. While many effective methods exist for classification, segmentation remains problematic. We present a technique inspired by continuous speech recognition that combines segmentation and classification using Hidden Markov Models. This technique is distributed and involves only limited data sharing between sensor nodes. We show the results of this technique and the bandwidth savings over full data transmission.

1 Introduction

The capabilities of small electronic devices have increased exponentially as their sizes and prices have dropped. Uses that once seemed frivolous or overly expensive are becoming practical and even cheap. Cell phones can now record video and transmit it wirelessly to personal websites, and cars can automatically notify paramedics of a crash. One exciting platform with similar potential is the Body Sensor Network (BSN), in which several intelligent sensing devices are placed on the human body and can perform collaborative sensing and signal processing for various applications. Currently, these sensing devices are too cumbersome for casual use. However, the threshold for wearability depends on the application. For instance, stride variability is associated with the occurrence of Alzheimer's disease [1]. If a patient could wear a sensor on their leg that helps a doctor evaluate the effectiveness of their medication in a naturalistic setting, the inconvenience might be worth it. Further, these devices are getting smaller and more powerful every year, so wearability is unlikely to remain a long-term problem. Therefore, now is the time to investigate applications so that hardware designers can optimize their devices for the more useful applications. One use of BSNs is action recognition, in which the actions of the person wearing the sensors are identified. Action recognition is necessary for many other applications.

B. Krishnamachari et al. (Eds.): DCOSS 2009, LNCS 5516, pp. 145–158, 2009.
© Springer-Verlag Berlin Heidelberg 2009

146

E. Guenterberg et al.

Techniques exist for extracting stride variability, but the output is only correct if the person is walking [2]. Also, action recognition could be used to develop an activity log to help a person or their doctor assess health [3], or to avoid dangerous actions, which might be useful for repetitive strain injury (RSI) sufferers. Action recognition could even be used to provide contextual interfaces to other devices [4]. A major problem in action recognition has been the segmentation of the data into discrete actions. We address this problem using a Hidden Markov Model configuration inspired by similar models in speech recognition and gesture recognition. These models are adapted for distributed processing on a wireless sensor network and for the specific application of action recognition and segmentation.

2 Related Work

Several approaches to action recognition have been proposed. A common problem is not the recognition, but the segmentation of data into actions. In image recognition this is often done without specific knowledge of what the image contains, but for action recognition using inertial sensors it is generally not possible to infer a segmentation without some knowledge of what is being segmented. Various approaches address this problem in different ways. Ward et al. recognized several workshop activities such as taking wood out of a drawer, putting it into the vice, getting out a hammer, and more. They avoided the problem of segmenting accelerometer data by segmenting on the presence or absence of sound, and then identified the action using accelerometer data and an HMM classifier [5]. Their results showed the effectiveness of this technique for a shop, but in many other situations actions are not correlated with sounds. Another approach used a k-Nearest Neighbor (k-NN) classifier and several statistical features to classify actions using a minimum number of sensor nodes [6]. Manual segmentation was used to avoid introducing errors from segmentation. This is a good technique for isolating the performance of various parts of a system, but a deployed system needs a satisfactory automatic segmentation scheme. Alternatively, it is possible to try a number of segmentations and choose the best. In [7], 3-D motion capture data is used. Given a start and end time, each joint uses an HMM to identify the action. AdaBoost is then used to make a global decision using the HMMs as weak classifiers. A dynamic programming algorithm chooses the best segmentation according to their maximum-likelihood function in O(T^3) time, where T is the number of samples. This scheme performs well if all computation is done on a single machine, but when each HMM is employed on a separate sensor node, the communication overhead required to try the different segmentations is quite high. The final method uses fixed-size segments which are classified independently [8]. This can result in outliers and discontinuities, so many methods involve some sort of smoothing function [9,10,11]. One such method uses AdaBoost to enhance several single-feature weak classifiers. An HMM uses the confidence output of the AdaBoost classifier as input. A separate HMM is trained for each action class, and the overall segmentation/classification is chosen based on the

Fig. 1. Sensor node and placement: (a) sensor node; (b) sensor placement

maximum likelihood among the various HMMs [12]. This is somewhat similar to our approach, except our model is based on a single HMM, which allows us to rule out impossible sequences of actions and to avoid outliers that could result from one model temporarily having higher probability than the others. The main contribution of our algorithm is efficiently producing a segmentation and classification, and performing this processing on a distributed platform.

3 Data Collection Hardware

This paper presents a scenario where the actions a subject performs are identified from continuous data provided by a BSN. The sensor nodes are embedded computing and sensing platforms with inertial sensors, wireless communication capability, a battery, and limited processing capabilities. Sensor nodes must be placed at multiple locations on the body to capture sufficient information to accurately determine the action. For example, "placing something on a shelf" and "standing still" produce similar sensor data on the leg but different data on the arms, while "turning to look behind" and "turning 90°" show similarities at the shoulder but differences at the legs. The nodes communicate with a basestation, where a final conclusion is reached.

3.1 Sensing Hardware and Body Placement

Fig. 1a shows one of the sensor nodes used to collect data for this paper. The sensor nodes use the commercially available Moteiv TelosB with a custom-designed sensor board and are powered by two AA batteries. The processor is a 16-bit, 4 MHz TI MSP430. The sensor board includes a tri-axial accelerometer and a bi-axial gyroscope. Data is collected from each sensor at about 20 Hz. This frequency was chosen empirically as a compromise between sampling rate and packet loss. The sensor nodes are placed on the body as shown in Fig. 1b. Placement was chosen so that each major body segment is monitored with a sensor. While


we expect that nodes placed at a subset of these locations would be sufficient for accurate classification of all considered actions, no formal procedure was performed to select such a reduced set. Discovering such procedures could prove to be a fertile area for future research.

3.2 Constraints and Deployment Architecture

The goal of this research was to find a computationally realistic algorithm to segment and classify actions. To this end, data collected on each sensor node was broadcast to a basestation and recorded for later processing in the MATLAB environment. This gave us the most flexibility for developing and testing different signal processing and classification schemes. Implementation of this system on a deployed BSN is the subject of future research. The algorithms presented here assume the following deployment architecture: the sensor nodes are placed on the body as shown in Fig. 1b, and each can communicate directly with the basestation. The nodes have a limited power supply and must last a long time between recharges, so power must be conserved. The basestation is a cell phone or PDA that has greater processing capabilities and can use significantly more power. Wherever the final classification occurs, it must be transmitted to the basestation for storage or long-range communication. Communication uses significantly more power than processing [13,14], so limiting communication is key to conserving power. Also, while the basestation is more powerful than a sensor node, it is not as powerful as a desktop computer, so algorithms designed to run on the basestation should be of low computational order.

3.3 Actions Collected

The actions considered are mostly transitional actions: each starts and ends with a posture. The actions are shown in Table 1. Not all postures sharing the same label are exactly the same. For instance, for "stand" at the end of actions 19a, 20a, and 21a, one or both hands are on a shelf, whereas in most other cases "stand" means standing with hands resting at the sides. The probabilistic nature of HMMs allows both postures to be represented by the same state. The actions were collected from three subjects, who performed each action 10 times. The data is divided into a training set and a testing set, with approximately half the trials used for training and half for testing. The data were manually segmented to label the start and end of each action. This manual labeling represents the ground truth and is used both to create sequences of known actions to train the model and to determine its accuracy.

4 Classification Model

One of the most difficult problems in classification is trying to label all the actions in a continuous stream of data in which both the timing of the actions and the labels are unknown. This problem has been considered many times in speech recognition, where it is called continuous speech recognition [15]. The problem of speech recognition is similar enough to action recognition that many techniques used for speech recognition can be applied, with appropriate modifications, to action recognition tasks. [16] presents a model based on Hidden Markov Models (HMMs) where each word is represented by a separate left-right HMM¹. These are combined into a single HMM by creating a null state which generates no output. Each word starts from this null state and ends on the null state. A very similar approach is used for gesture recognition from hand sensors in [17]. We took this model and adapted it to fit within the constraints imposed by our BSN configuration and to more effectively solve the problem of action recognition. One particular change made is that each action is assumed to start with some posture, such as kneeling or standing, and to end with a posture. The postures are not null states; that is, there is an output associated with a posture, and a posture may persist for a period of time. The input for this system is a subject performing movements. These movements can be in an arbitrary order. The output is a segmentation and a set of labels for each segment.

Table 1. Actions Captured

ID   Initial Posture   Action                                               Final Posture
1    Stand             Stand to Sit (Armchair)                              Sit
2    Sit               Sit to Stand (Armchair)                              Stand
3    Stand             Stand to Sit (Dining Chair)                          Sit
4    Sit               Sit to Stand (Dining Chair)                          Stand
5    Sit               Sit to Lie                                           Lie
6    Lie               Lie to Sit                                           Sit
7    Stand             Bend and Grasp from Ground (R Hand)                  Stand
8    Stand             Bend and Grasp from Ground (L Hand)                  Stand
9    Stand             Bend and Grasp from Coffee Table (R Hand)            Stand
10   Stand             Bend and Grasp from Coffee Table (L Hand)            Stand
11a  Stand             Turn Clockwise 90°                                   Stand
11b  Stand             Return from 11a                                      Stand
12a  Stand             Turn Counter-Clockwise 90°                           Stand
12b  Stand             Return from 12a                                      Stand
13   Stand             Look Back Clockwise and Return                       Stand
14   Stand             Look Back Counter-Clockwise and Return               Stand
15a  Stand             Kneeling (R Leg First)                               Kneel
15b  Kneel             Return from 15a                                      Stand
16a  Stand             Kneeling (L Leg First)                               Kneel
16b  Kneel             Return from 16a                                      Stand
17a  Stand             Move Forward 1 Step (R Leg)                          Stand
17b  Stand             Move L Leg beside R Leg                              Stand
18a  Stand             Move Forward 1 Step (L Leg)                          Stand
18b  Stand             Move R Leg beside L Leg                              Stand
19a  Stand             Reach up to Cabinet (R Hand)                         Stand
19b  Stand             Return from 19a                                      Stand
20a  Stand             Reach up to Cabinet (L Hand)                         Stand
20b  Stand             Return from 20a                                      Stand
21a  Stand             Reach up to Cabinet (Both Hands)                     Stand
21b  Stand             Return from 21a                                      Stand
22   Stand             Grasp an Object (1 Hand), Turn 90° and Release       Stand
23   Stand             Grasp an Object (Both Hands), Turn 90° and Release   Stand
24   Stand             Turn Clockwise 360°                                  Stand
25   Stand             Turn Counter-Clockwise 360°                          Stand

¹ Hidden Markov Models assume a process starting in a state which, at each discrete time, generates an output and then transitions to a new state. The state transition and output are probabilistic and based exclusively on the current state. The output can be observed, but not the state. Algorithms exist to train a model to a given set of output sequences and to infer the state sequence given an output sequence and a model. A left-right model restricts transitions to self-transitions or the next state in sequence. See [15] for more information.

4.1 Overview

Classification requires a number of signal processing steps that execute on the sensor nodes and the basestation, as shown in Fig. 2. The system is designed to accurately classify actions with limited communication and use of processing power. The data is processed on a moving window centered on the current sample; the window moves forward one sample at a time.

1. Sensor Data: Data from five sensor channels is collected. The accelerometer senses three axes of acceleration: ax, ay, and az. The gyroscope senses angular velocity about two axes: θ and φ. There is no angular velocity measurement about the axis orthogonal to the plane of the sensor board.
2. Feature Extraction: For each sample time, a feature vector is generated. The following features are extracted from a five-sample window for each sensor: mean, standard deviation, RMS, first derivative, and second derivative.
3. Transcript Generation: Instead of transmitting the feature vector to the basestation, each sensor node labels a sample using a single character from a small alphabet. Each sensor has a unique alphabet with between two and nine characters. Characters often repeat for several samples, allowing for significant compression. The sequences of labels produced are motion transcripts. Transcript generation uses Gaussian Mixture Models (GMMs) to label samples based on clusters generated from the training data.
4. Hidden Markov Model: The HMM uses the model shown in Fig. 3. In the middle are actions, which are modeled as left-right HMMs with between Mw = 1 and 10 states. The postures on the left and right are each modeled using a single state; the duplicated postures represent the same state. Postures and actions are connected as shown to form a single HMM.
5. Generating Output: When segmenting and classifying data, the Viterbi algorithm [15] is used to find the most likely state sequence for the given output, because it finds the optimal sequence efficiently. Each sample is labeled with the name of the action or posture of the associated state. This output is the "generated annotations".
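The feature extraction step above can be sketched for a single sensor channel as follows. The paper does not define its derivative features precisely, so central finite differences over the five-sample window are an assumption here:

```python
import math

def window_features(x):
    """Features over one 5-sample window of a single sensor channel:
    mean, standard deviation, RMS, and first/second central differences
    at the window midpoint (the finite-difference forms are assumptions)."""
    assert len(x) == 5
    n = len(x)
    mean = sum(x) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in x) / n)
    rms = math.sqrt(sum(v * v for v in x) / n)
    d1 = (x[3] - x[1]) / 2.0          # first derivative at the center sample
    d2 = x[3] - 2 * x[2] + x[1]       # second derivative at the center sample
    return [mean, std, rms, d1, d2]

def feature_stream(samples):
    """Slide the window forward one sample at a time, as in the text."""
    return [window_features(samples[i:i + 5])
            for i in range(len(samples) - 4)]

feats = feature_stream([0.0, 1.0, 2.0, 3.0, 4.0, 5.0])
print(len(feats), feats[0])
```

On a real node the per-channel feature vectors would then be concatenated and fed to the transcript generation step rather than transmitted.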

Fig. 2. Signal Processing Model


Fig. 3. HMM for Continuous Action Recognition for a subset of actions

4.2 Transcript Generation

Reducing data before transmission can save considerable power in a BSN. Transcripts do this by reducing the multi-dimensional per-sample observations on a sensor node to a single character taken from a small alphabet. Transcripts are inspired by the idea that actions can be represented by a sequence of motions. Motions can be identified from a small interval of observations. A single motion or position is likely to persist for some time, allowing run-length encoding to further reduce the transmitted data. We have no canonical list of motions; therefore, a technique is needed that does not require human input. One solution is unsupervised clustering, which automatically groups points based on underlying patterns in the data. Once these groups are created from training data, later observations can be assigned to one of the existing groups. In our system, the "points" are feature vectors in F-dimensional space. The most common clustering techniques include hierarchical clustering [18], k-means clustering [19], and model-based clustering [20,21]. Model-based clustering assumes that all points have been generated from a set of distributions. The Gaussian Mixture Model (GMM) is a model-based clustering method using Gaussian distributions. Many realistic processes actually generate output based on Gaussian distributions, and many more can be approximated by a small number of Gaussian distributions. This causes GMMs to often outperform other clustering methods [20]. For these reasons, GMMs are used for transcript generation. We assume a diagonal covariance matrix, both to reduce the computational complexity of labeling and because the limited training data makes accurate estimation of non-diagonal elements unlikely. GMMs are trained using an Expectation-Maximization (EM) procedure [21], which iteratively converges to a locally optimal clustering. The initial model and the choice of the number of mixtures (M) affect the final quality of the clusters.
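The labeling and run-length encoding described above can be sketched as follows, assuming the diagonal-Gaussian cluster parameters have already been fit by EM on training data (the parameters and alphabet below are made up for illustration):

```python
import math

def label(x, clusters):
    """Assign x the character of the most likely diagonal-Gaussian cluster.
    `clusters` maps a character to (mean, var, weight); these parameters
    are assumed to come from prior EM training, not computed here."""
    def log_lik(x, mean, var, weight):
        s = math.log(weight)
        for xi, mi, vi in zip(x, mean, var):
            s += -0.5 * (math.log(2 * math.pi * vi) + (xi - mi) ** 2 / vi)
        return s
    return max(clusters, key=lambda c: log_lik(x, *clusters[c]))

def run_length_encode(chars):
    """Compress repeated labels before transmission: 'aab' -> [('a',2),('b',1)]."""
    out = []
    for c in chars:
        if out and out[-1][0] == c:
            out[-1] = (c, out[-1][1] + 1)
        else:
            out.append((c, 1))
    return out

clusters = {
    "a": ([0.0, 0.0], [1.0, 1.0], 0.5),   # hypothetical trained parameters
    "b": ([5.0, 5.0], [1.0, 1.0], 0.5),
}
labels = [label(x, clusters) for x in ([0.1, -0.2], [0.3, 0.1], [4.9, 5.2])]
print(run_length_encode(labels))  # -> [('a', 2), ('b', 1)]
```

Because the covariance is diagonal, each log-likelihood is a simple sum over features, which keeps the per-sample cost linear in F as the text notes.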
A common method of choosing M is to use EM to train several models using different values of M and different starting distributions. The models can then be compared using some measure, and the best is selected. We use the Bayesian Information Criterion (BIC) to compare models [20]. The clustering is performed independently on each node, and each node may have a different number of clusters.

4.3 Hidden Markov Model

After each sensor node assigns a character to each sample, the actions can be determined on the basestation using the HMM shown in Fig. 3. An action starts and ends on a posture. The postures on the left and right with the same names in Fig. 3 are drawn twice for clarity, but represent the same state. Actions are individually represented as left-right models with a number of states determined using the Bayesian Information Criterion, as described below.

Training the Model. An HMM has M states and is defined by the model λ, consisting of three sets of probabilities:

    λ = {πi, aij, bj(k)}    (1)

The probability that a sequence begins with state si is πi. The transition probability aij is the probability that a state transitions to state sj after starting on state si. For discrete observations, bj(k) gives the probability that observation vk is emitted at state sj. For our system, the left-right model for each action and posture is trained independently; then all actions and postures are joined to form a single HMM. The model for each action is trained by starting with an initial model and iteratively improving it using the Baum-Welch procedure [15]. The Baum-Welch procedure finds a local maximum of the likelihood. By trying a number of variations and selecting the best model according to some measure, the likelihood of finding a global maximum is increased. As with GMMs, a common technique for model selection is BIC [22,23]. We try Mw ∈ {1, 2, ..., 10}. For each of these fixed-state models, we start by dividing the samples in each sequence evenly among states, then iterating the EM algorithm five times to converge the model. For the next nine trials for the current Mw, each state is initially assigned a random number of samples. The best of these 100 models is used to represent the action. Each component of the HMM model in (1) must be trained. Since the action must start at the first state, m1, the initial probabilities πi have the following values:

    πi = 1 if i = 1, and 0 otherwise    (2)

aij and bj(k) are trained using the Baum-Welch algorithm as described in [15], with the following modifications. The observation from a sensor node fi is considered to be independent of the observations from all other sensor nodes fj ∈ F, fj ≠ fi. This means that bj^(f)(k) is computed for each sensor node f separately, and the overall observation probability is:

    bj(k) = ∏_{f ∈ F} bj^(f)(k)    (3)
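Equation (3) combines the per-node observation probabilities by the stated independence assumption. A small sketch follows; computing the product in log space is our own numerical-stability choice, not something the paper specifies:

```python
import math

def joint_obs_logprob(per_node_probs):
    """log bj(k) = sum over sensor nodes f of log bj^(f)(k),
    i.e. the logarithm of the product in Eq. (3). Summing logs avoids
    underflow when many sensor nodes are combined."""
    return sum(math.log(p) for p in per_node_probs)

# Hypothetical per-node probabilities of emitting the observed characters:
per_node = [0.5, 0.25, 0.1]
print(round(math.exp(joint_obs_logprob(per_node)), 6))  # -> 0.0125
```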


Right after the action finishes, it is expected to transition immediately to the following posture. Therefore, during training, only state sequences ending in the final state s_{M_w} should be considered. This can be accomplished simply if the observation probability takes the sample number into account, making the probability 0 whenever the final observation in a sequence is emitted from any state other than the final state:

b_j(k, t) = 0 if j ≠ M_w and t = T, and b_j(k, t) = b_j(k) otherwise    (4)

Bayesian Information Criterion. Selecting the best among several models is a common problem in statistical pattern recognition. In general, a model with more parameters will better fit any set of training data, but runs the risk of fitting eccentricities of the training data that are not present in the test data. The Bayesian Information Criterion [23] is a method that requires only training data and has strong probabilistic properties:

BIC(λ) = log p(X | λ, θ̂) − (α K / 2) log T    (5)
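The BIC-driven selection over candidate state counts M_w can be sketched as follows (the log-likelihoods and parameter counts below are placeholder values, not quantities from the paper):

```python
import math

def bic(log_likelihood, n_free_params, n_samples, alpha=0.05):
    """BIC(lambda) = log p(X | lambda, theta_hat) - (alpha * K / 2) * log T."""
    return log_likelihood - 0.5 * alpha * n_free_params * math.log(n_samples)

def select_model(candidates, n_samples):
    """Return the index of the (log_likelihood, K) pair maximizing BIC."""
    return max(range(len(candidates)),
               key=lambda i: bic(candidates[i][0], candidates[i][1], n_samples))

# Hypothetical trained models for M_w = 2, 3, 4 states: (log-likelihood, K).
candidates = [(-520.0, 40), (-488.0, 90), (-486.5, 160)]
print(select_model(candidates, n_samples=1000))  # → 1
```

The middle model wins here: the largest model gains little likelihood but pays a much larger complexity penalty, which is exactly the overfitting control α provides.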

For HMMs, p(X | λ, θ̂) can be computed in a straightforward manner, as given in [15]. K is the number of free parameters, T is the total number of samples in the training set, and α is a regularization term that allows control over the overfitting penalty. We chose α = 0.05, for which each action was represented with an average of three states.

4.4 Joining Models

The models for each action are joined into a single HMM. Posture self-transition probabilities are derived from the training data, while the probabilities of transition from a posture to all associated actions are considered equal.

4.5 Runtime Order and Comparison to Other Methods

At runtime, there are two primary stages: transcript generation and HMM-based recognition. For transcripts, the probability that the feature vector at each sample was generated by each cluster is calculated, and the sample receives the label of the highest-probability cluster. Because a diagonal covariance matrix is used, calculating these probabilities is a linear function of the number of features. The complexity of clustering is

O(T · F · C_i)    (6)

where T is the number of samples considered, F is the number of features in the feature vector, and C_i is the number of clusters on sensor node i.
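The transcript-generation step can be sketched as follows (a minimal illustration of diagonal-covariance cluster labeling; the cluster tuple layout is an assumption, not the authors' data structure):

```python
import math

def diag_gauss_logpdf(x, mean, var):
    """Log-density of x under a Gaussian with diagonal covariance.
    Linear in the number of features, matching the O(T*F*Ci) bound."""
    return sum(
        -0.5 * (math.log(2 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )

def transcribe(samples, clusters):
    """Label each feature vector with its most likely cluster index.
    `clusters` is a list of (weight, mean, var) tuples (hypothetical layout)."""
    labels = []
    for x in samples:
        scores = [
            math.log(w) + diag_gauss_logpdf(x, m, v)
            for (w, m, v) in clusters
        ]
        labels.append(max(range(len(scores)), key=scores.__getitem__))
    return labels

clusters = [
    (0.5, [0.0, 0.0], [1.0, 1.0]),  # cluster 0 around the origin
    (0.5, [5.0, 5.0], [1.0, 1.0]),  # cluster 1 around (5, 5)
]
print(transcribe([[0.1, -0.2], [4.8, 5.3]], clusters))  # → [0, 1]
```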

154

E. Guenterberg et al.

The HMM uses the Viterbi algorithm [15] for classification and segmentation. Since the HMM consists of several joined left-right models, the simplified Viterbi algorithm runs more efficiently. The complexity of the HMM classifier is

O(T · M_a · K)    (7)

where M_a is the total number of action states and K is the number of sensor nodes. This method must be compared to other methods that both segment and classify data; it cannot be compared with methods that rely on external information to segment the data. The method in [7] is O(T³), and so has lower runtime efficiency; moreover, that approach is not designed for sensor networks and thus implies a large number of transmissions. The authors of [24] propose a system that can either operate on fixed segments or adaptively choose a segmentation. Both variants are linear in the number of samples, but carry large constant factors, such as the number of length hypotheses and the repeated computation of a matrix inverse for each hypothesis. Finally, there are a number of methods based on fixed segmentation. These also can do no better than linear time, although their constant factors may be smaller than those of our model. Such methods do not take temporal characteristics into account, so actions that share the same motions for a portion of time will be indistinguishable even if each action's overall sequence is unique.
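As an illustration of the recursion, here is a generic log-space Viterbi sketch (not the simplified left-right variant run on the nodes; the toy model below is invented for demonstration):

```python
import math

NEG_INF = float("-inf")

def safe_log(p):
    return math.log(p) if p > 0 else NEG_INF

def viterbi(obs, log_pi, log_a, log_b):
    """Most likely state path for an observation sequence (log-space).
    log_pi[i]: initial log-prob; log_a[i][j]: transition log-prob;
    log_b[j][o]: emission log-prob. The generic recursion is O(T*N^2);
    in a joined left-right model most transitions are impossible
    (log_a = -inf), which is the source of the speedup noted in the text."""
    n = len(log_pi)
    delta = [log_pi[i] + log_b[i][obs[0]] for i in range(n)]
    back = []
    for o in obs[1:]:
        prev, delta, ptr = delta, [], []
        for j in range(n):
            best_i = max(range(n), key=lambda i: prev[i] + log_a[i][j])
            delta.append(prev[best_i] + log_a[best_i][j] + log_b[j][o])
            ptr.append(best_i)
        back.append(ptr)
    # Trace back the best path from the best final state.
    state = max(range(n), key=delta.__getitem__)
    path = [state]
    for ptr in reversed(back):
        state = ptr[state]
        path.append(state)
    return path[::-1]

# Toy two-state left-right model: state 0 prefers symbol 0, state 1 symbol 1.
log_pi = [safe_log(1.0), safe_log(0.0)]
log_a = [[safe_log(0.5), safe_log(0.5)], [safe_log(0.0), safe_log(1.0)]]
log_b = [[safe_log(0.9), safe_log(0.1)], [safe_log(0.1), safe_log(0.9)]]
print(viterbi([0, 0, 1, 1], log_pi, log_a, log_b))  # → [0, 0, 1, 1]
```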

5 Results

For our experiment, three subjects performed the actions listed in Table 1, using the body sensor configuration in Sect. 3. The system was trained using approximately half the data. An action was considered properly labeled if over 50% of the action was labeled as the action and the rest was labeled as either the start or the end posture. If any part of an action was labeled as a different action, the action was considered incorrectly labeled. Table 2 shows the result for each subject when a single model was trained on all subjects. The results are shown for two systems: clusters sent from each node,

Table 2. Results for All Subjects Trained on One Model

Subject  Full Samples Acc.  Clustering Acc.
1        78.1%              92.6%
2        38.1%              82.6%
3        10.9%              83.2%

Table 3. Results for All Subjects Trained Individually

Subject  Full Samples Acc.  Clustering Acc.
1        95.1%              90%
2        48.9%              94%
3        47.4%              94%


Table 4. Results for an Independently Trained Subject 1

Action  # States  Accuracy  # Actions  Confusion
1       6         100%      12
2       5         100%      12
3       4         100%      11
4       4         100%      11
5       3         100%      10
6       5         100%      10
7       4         100%      5
8       4         100%      5
9       3         100%      5
10      3         80%       5          8:1
11a     4         60%       5          12b:2
11b     3         100%      5
12a     3         0%        5          11b:5
12b     3         100%      5
13      6         100%      5
14      7         67%       6          19a:1, 19b:1
15a     6         100%      5
15b     3         100%      5
16a     4         100%      5
16b     4         100%      5
17a     3         100%      6
17b     3         60%       5          18a:2
18a     1         80%       5          17b:1
18b     3         100%      5
19a     2         80%       5          11b:1
19b     3         100%      5
20a     2         60%       5          18a:1, 21a:1
20b     2         20%       5          10:1, 17a:1, 21b:2
21a     2         100%      5
21b     3         100%      5
22      10        100%      6
24      4         100%      5
25      4         100%      5
Overall           90%

or full features sent from each node, quantized into 9 bins. Clustering significantly outperforms the full set of samples. This is probably because the assumption of feature independence is especially flawed with so many features, many of which may be equivalent to noise; to use a full set of features, some form of feature selection or conditioning is necessary. With clustering, accuracy for each subject is reasonable, especially given the similarity of the movements. However, visual inspection of the transcripts shows consistency within subjects but marked differences between subjects for the same movements. This suggests a second approach: independently training the model on a per-subject basis. The results are shown in Table 3. The improvement for most subjects is considerable, though subject 1 actually has a lower accuracy when trained individually. The accuracy on the training set is 100%, which strongly suggests overfitting; more training samples could combat this problem. The biggest disadvantage is the requirement of training data for each subject. As with the previous results, the clustering method outperforms the system using full features. Detailed results for Subject 1, as displayed in Table 3, are shown in Table 4. The confusion column shows especially interesting results. Some of the misclassifications are expected: picking an object off the ground and off a coffee table are very similar, so the confusion of actions 10 and 8 makes sense. Similarly, turning counter-clockwise 90° is quite similar to returning from a clockwise turn. However, the confusion of "Reaching up to a cabinet with left hand" and "Move forward one step" makes little sense, and so represents a true error. A visual representation of the segmentation and classification process for subject 3 is shown in Fig. 4. The clusters are on the bottom. The labels in red

[Figure: Fig. 4. Classification results for subject 3. Per-node cluster transcripts (Left/Right Ankle, Thigh, Forearm, Arm, and Waist) over samples 50–400, with generated and canonical action labels ("stand", "11a", "11b") along the top.]

Table 5. Data Savings from Clustering

Subject  Uncompressed (B/s)  Samples Cmp. (B/s)  Clustering Cmp. (B/s)
1        165.00              10.91               2.78
2        165.00              11.93               2.97
3        165.00              13.44               3.21

are the canonical annotations (gold standard), while the ones above in blue are generated annotations (system output). The grayscale bar at the top represents the progression of states. Movements 11a and 11b are "Turn counter-clockwise 90°" and "return". The clusters from the left thigh show a very consistent pattern, while the clusters from the waist and right arm show significant variation. The HMM is able to accurately identify these actions from among all possible actions, as can be seen from the labeling at the top.

5.1 Bandwidth Savings from Clustering

The primary reason for choosing clustering over transmitting samples directly was to decrease transmissions. Table 5 shows the savings. The first column is based on transmitting the uncompressed 12-bit sensor data, with results in bytes per second. For the next column, sensor data is first quantized (with nine possible bins per sensor), then compressed with run-length encoding; the results shown are the average entropy per original sample. Methods such as Huffman encoding come close to achieving entropy, so this is a reasonable estimate of bandwidth. The final column is similar, except that clustering is performed instead of quantizing the sensor data. The savings are most dramatic when compression of any kind is applied; even then, clustering still reduces the bandwidth by about 75%.
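The per-sample entropy estimate described above can be sketched as follows (a minimal illustration, not the authors' code; the toy quantized stream is invented):

```python
import math
from collections import Counter

def run_length_encode(symbols):
    """Collapse a symbol stream into (symbol, run_length) pairs."""
    runs = []
    for s in symbols:
        if runs and runs[-1][0] == s:
            runs[-1][1] += 1
        else:
            runs.append([s, 1])
    return [tuple(r) for r in runs]

def entropy_bits_per_original_sample(symbols):
    """Shannon entropy of the run-length pairs, divided by the number of
    original samples -- an estimate of achievable compressed bandwidth,
    since entropy coders (e.g. Huffman) come close to this bound."""
    runs = run_length_encode(symbols)
    counts = Counter(runs)
    total = len(runs)
    h = -sum((c / total) * math.log2(c / total) for c in counts.values())
    return h * total / len(symbols)

quantized = [3, 3, 3, 3, 7, 7, 3, 3, 3, 7]  # toy 9-bin quantized stream
print(round(entropy_bits_per_original_sample(quantized), 3))  # → 0.8
```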

6 Conclusion and Future Work

In this paper, we presented an action recognition framework based on an HMM, which is capable of both segmenting and classifying continuous movements. It is specifically designed for the distributed architecture of Body Sensor Networks and has modest runtime requirements, which is essential for resource-limited sensor nodes. The error is consistent with results reported for similar experiments described in the literature. Our segmentation and classification techniques are developed from a signal-processing standpoint. For deployment, several additional steps must be taken. First, in a system designed to monitor a subject throughout the day, many actions performed by the subject will not correspond to any of the trained actions. The system will need not only to recognize known actions but also to reject unknown actions; [25] suggests a method based on rejection thresholds that can be used. Second, the processing tasks need to be implemented on a BSN. Since our MATLAB tests proved successful, this is our next major step.

References

1. Hausdorff, J., Cudkowicz, M., Firtion, R., Wei, J., Goldberger, A.: Gait variability and basal ganglia disorders: stride-to-stride variations of gait cycle timing in Parkinson's disease and Huntington's disease. Mov. Disord. 13(3), 428–437 (1998)
2. Aminian, K., Najafi, B., Büla, C., Leyvraz, P., Robert, P.: Spatio-temporal parameters of gait measured by an ambulatory system using miniature gyroscopes. Journal of Biomechanics 35(5), 689–699 (2002)
3. Nait-Charif, H., McKenna, S.: Activity summarisation and fall detection in a supportive home environment. In: Proceedings of the 17th International Conference on Pattern Recognition (ICPR 2004), vol. 4 (2004)
4. Castelli, G., Rosi, A., Mamei, M., Zambonelli, F.: A Simple Model and Infrastructure for Context-Aware Browsing of the World. In: IEEE International Conference on Pervasive Computing and Communications, pp. 229–238 (2007)
5. Ward, J., Lukowicz, P., Tröster, G., Starner, T.: Activity Recognition of Assembly Tasks Using Body-Worn Microphones and Accelerometers. IEEE Transactions on Pattern Analysis and Machine Intelligence, 1553–1567 (2006)
6. Ghasemzadeh, H., Guenterberg, E., Gilani, K., Jafari, R.: Action coverage formulation for power optimization in body sensor networks. In: Asia and South Pacific Design Automation Conference (ASP-DAC 2008), pp. 446–451 (2008)
7. Lv, F., Nevatia, R.: Recognition and Segmentation of 3-D Human Action Using HMM and Multi-class AdaBoost. In: Leonardis, A., Bischof, H., Pinz, A. (eds.) ECCV 2006. LNCS, vol. 3954, p. 359. Springer, Heidelberg (2006)
8. Bao, L., Intille, S.: Activity Recognition from User-Annotated Acceleration Data. In: Ferscha, A., Mattern, F. (eds.) PERVASIVE 2004. LNCS, vol. 3001, pp. 1–17. Springer, Heidelberg (2004)
9. Bao, L.: Physical Activity Recognition from Acceleration Data under Semi-Naturalistic Conditions. PhD thesis, Massachusetts Institute of Technology (2003)
10. Van Laerhoven, K., Gellersen, H.: Spine versus Porcupine: a Study in Distributed Wearable Activity Recognition. In: Proc. of the Eighth IEEE Intl. Symposium on Wearable Computers, vol. 1, pp. 142–149
11. Courses, E., Surveys, T., View, T.: Analysis of low resolution accelerometer data for continuous human activity recognition. In: IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2008), pp. 3337–3340 (2008)
12. Lester, J., Choudhury, T., Kern, N., Borriello, G., Hannaford, B.: A hybrid discriminative/generative approach for modeling human activities. In: Proceedings of the International Joint Conference on Artificial Intelligence (IJCAI 2005) (2005)
13. Akyildiz, I., Su, W., Sankarasubramaniam, Y., Cayirci, E.: Wireless sensor networks: a survey. Computer Networks 38(4), 393–422 (2002)
14. Polastre, J., Szewczyk, R., Culler, D.: Telos: enabling ultra-low power wireless research. In: Proceedings of the 4th International Symposium on Information Processing in Sensor Networks. IEEE Press, Piscataway (2005)
15. Rabiner, L., Juang, B.: An introduction to hidden Markov models. IEEE ASSP Magazine 3(1), 4–16 (1986)
16. Jurafsky, D., Martin, J., Kehler, A., Vander Linden, K., Ward, N.: Speech and Language Processing: An Introduction to Natural Language Processing, Computational Linguistics, and Speech Recognition. MIT Press, Cambridge (2000)
17. Lee, H., Kim, J.: An HMM-Based Threshold Model Approach for Gesture Recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence, 961–973 (1999)
18. Johnson, S.: Hierarchical clustering schemes. Psychometrika 32(3), 241–254 (1967)
19. Hartigan, J., Wong, M.: A K-means clustering algorithm. J. R. Stat. Soc. Ser. C (Appl. Stat.) 28, 100–108 (1979)
20. Fraley, C., Raftery, A.: How Many Clusters? Which Clustering Method? Answers Via Model-Based Cluster Analysis. The Computer Journal 41(8), 578–588 (1998)
21. Figueiredo, M., Jain, A.: Unsupervised Learning of Finite Mixture Models. IEEE Transactions on Pattern Analysis and Machine Intelligence, 381–396 (2002)
22. Stoica, P., Selen, Y.: Model-order selection: a review of information criterion rules. IEEE Signal Processing Magazine 21(4), 36–47 (2004)
23. Biem, A.: A model selection criterion for classification: Application to HMM topology optimization. In: Seventh International Conference on Document Analysis and Recognition, vol. 1, pp. 104–108 (2003)
24. Yang, A., Jafari, R., Sastry, S., Bajcsy, R.: Distributed Recognition of Human Actions Using Wearable Motion Sensor Networks. Journal of Ambient Intelligence and Smart Environments 1, 1–5 (2009)
25. Yoon, H., Lee, J., Yang, H.: An online signature verification system using hidden Markov model in polar space. In: Proceedings of the Eighth International Workshop on Frontiers in Handwriting Recognition, pp. 329–333 (2002)

Online Coding for Reliable Data Transfer in Lossy Wireless Sensor Networks Anthony D. Wood and John A. Stankovic Department of Computer Science University of Virginia {wood,stankovic}@cs.virginia.edu

Abstract. Bulk transport underlies data exfiltration and code update facilities in WSNs, but existing approaches are not designed for highly lossy and variable-quality links. We observe that Maymounkov’s rateless online codes are asymptotically more efficient, but can perform poorly in the WSN operating region. We analyze and optimize coding parameters and present the design and evaluation of RTOC, a protocol for bulk transport that recovered over 95% of application data despite up to 84% packet loss in a MicaZ network.

1 Introduction

Often wireless sensor networks (WSNs) must reliably transfer large amounts of data, which is challenging given the typical resource constraints of WSN devices. They may be deployed in adverse circumstances where poor and highly variable link quality is caused by dynamic environmental factors such as heat and humidity, by low-cost hardware and its concomitant failure or unreliability, or by obstacles and RF interference (accidental or malicious). Whether for extracting sensor data or loading new code in over-the-air reprogramming, bulk data must be transmitted efficiently to reduce wasted computation and communication. These twin problems of loss-tolerance and efficiency are not sufficiently addressed by the state of the art. Existing protocols use various methods to conceal or overcome loss of data blocks. The approaches taken by Deluge [1], RCRT [2], and Flush [3] are based on Automatic Repeat Request (ARQ), in which ACKs or NACKs explicitly request retransmission of lost data. However, in severe conditions ARQ protocols require many retransmissions and have high latency, as determined by Kumar [4] for TCP in lossy networks. Another pragmatic approach to achieving reliability in this setting is to bound the expected error rate δ and use forward error correction (FEC) for transmitting blocks of the data. For predictable channel conditions, a code may be chosen that is a trade-off between overhead and performance, and it has been proven that codes exist with rate equal to the channel capacity 1 − δ. However, under 

This work was funded in part by ARO grant W911NF–06–1–0204, NSF grants CNS– 0614773 and CNS–0614870.

B. Krishnamachari et al. (Eds.): DCOSS 2009, LNCS 5516, pp. 159–172, 2009. © Springer-Verlag Berlin Heidelberg 2009


intermittent interference or other lossy conditions, the channel may be arbitrarily bad, and for any error rate greater than δ, a fixed-rate code fails completely and the block or message is lost. Pessimistically chosen parameters suffer from high overhead, which must always be paid. These limitations motivate the use of rateless erasure codes, also called fountain codes. They have recently attracted attention for use in WSNs [5, 6] due primarily to these properties: first, a limitless number of encoded symbols can be generated from an input of k symbols, and second, the original k symbols can be recovered from any k′ = (1 + ε)k received encodings (asymptotically, for fixed ε). Theoretically, no feedback channel is needed, such as the ACKs and NACKs of an ARQ protocol. The sender can transmit an endless stream of encoded symbols, and even for an arbitrarily poor channel (as long as δ < 1), the receiver eventually receives k′ symbols and can decode the message. Such an encoding scheme is optimal if k′ = k. Rateless Deluge [5] uses random linear codes, which require multiplication modulo an irreducible polynomial to decode. Extra memory is needed for storing inversion tables to achieve practical execution speeds, and both encoding and decoding are complex for typical WSN platforms. Luby's LT codes [7], which are the basis of SYNAPSE [6], encode packets more efficiently using exclusive-OR operations at the sender, but require Gaussian elimination for decoding, which cannot proceed until k blocks are received. We propose the use of online codes [8], which improve on LT codes to achieve O(1) encoding time (per block) and O(n) decoding time, and which permit iterative decoding as packets are received. However, coding parameters recommended for Internet networks perform poorly in messaging overhead and memory consumption in the typical WSN operating region of relatively few data blocks. This prevents their direct replacement in existing protocols, and motivates this study.
This work uses online codes to provide reliable data transfer despite highly lossy communication channels. To do so, it must address challenges in the selection of appropriate parameters for the coding scheme, and requires a protocol design that minimizes round-trip interactions. Our contributions include:

– We design Reliable Transfer with Online-Coding (RTOC), a novel transport protocol for WSNs that is the first to employ online codes for higher decoding efficiency than SYNAPSE. It stays synchronized despite high loss rates, and uses feedback control to adaptively terminate data transmission without ARQ as in Deluge or the manual FEC selection used by Rateless Deluge.
– Through analysis of the online coding degree distribution and algorithm, we optimize parameters to trade asymptotic optimality for predictability within the WSN operating region. We achieve a 12% better effective coding rate with 72% lower variance, which reduces the 98th-percentile decoding memory requirements by 69%.
– We evaluate the performance of RTOC on an implementation in TinyOS for the MicaZ platform, and show that block delivery ratios exceed 95% despite up to 84% packet loss. Overhead follows from the page fragmentation and effective coding rate, and is low when channel loss is low.


In the next section we describe related work, and then present the design of our loss-tolerant transport encoding scheme in Section 3. Key coding parameters are analyzed in Section 4 for their impact on efficiency in WSNs. Evaluation of an implementation for MicaZ motes is given in Section 5. Finally, we conclude in Section 6.

2 Related Work

Methods exist in WSNs for selecting high-quality links to avoid poor communication [9] and for detecting inconsistencies to trigger code updates [10]; both are orthogonal to this work. Over-the-air reprogramming has been addressed by other schemes that predate Deluge [1], but the latter has become a popular choice despite its shortcomings. Recent work has attempted to improve its efficiency and performance using rateless codes, as described above [5,6]. RTOC builds upon this work and adopts some features common to reprogramming protocols, but is modularized to allow its use for other purposes such as bulk data transport, and nothing precludes its use as an underlying mechanism for code updates. Flush [3] is an end-to-end transport protocol for WSNs that uses acknowledgments and rate control to achieve high goodput. Like RCRT [2], another rate-controlling transport protocol, it relies on round-trip messaging to drive the control algorithm. While this gives good performance at each hop when channel loss is low, it performs poorly when many control messages are lost [4]. RTOC is designed to tolerate such losses in its feedback mechanism.

3 Reliable Transfer with Online-Coding

RTOC is a protocol for data transfer in networks that suffer from high and time-varying channel loss. Fixed, high-rate FEC schemes pay a constant but high overhead, and existing rateless approaches rely on end-to-end interactions, assume a fixed margin for loss tolerance, or incur relatively high decoding cost. Application data (e.g., sensor data or program code) are assumed to be stored in pages or messages, and are fragmented into blocks by RTOC for encoding and transmission to one or more neighboring nodes. After one round-trip exchange to initiate the transaction, encoded data is streamed to the destination. Feedback control is used to determine when to slow and terminate transmission to minimize wasted communication, without requiring multiple rounds of ARQ or assuming that "no news is good news." Before describing our solutions for synchronization and termination, we review online codes in Section 3.1.

3.1 Online Codes

Online codes [8] are non-systematic fountain codes, developed independently from but similar to Raptor codes [11]. They concatenate two codes (outer and inner) to produce a limitless stream of output blocks from n original message or page blocks. Online codes improve on Luby’s LT codes [7] (used by SYNAPSE [6]) to


Fig. 1. Sending a message M through an erasure channel with unbounded loss using online codes. Steps (b)–(e) show iterative belief propagation as blocks are received.

achieve O(1) encoding time (per block) and O(n) decoding time, trading optimal for near-optimal recovery performance. They are also locally encodable, which means that each output block is computed independently from the others, easing implementation and memory requirements on constrained WSN devices. Encoding consists of an outer or pre-processing encoding followed by an inner encoding that generates output blocks, called "check" blocks. The outer encoding creates a fixed number q = kδn of "auxiliary" blocks that are appended to the original n message blocks (k and δ are parameters described below). For every message block, k auxiliary blocks are chosen and the message block's contents are exclusive-ORed with them all. The inner encoding creates a potentially endless set of check blocks from the combined message and auxiliary blocks (the "composite" message). To generate a check block, a degree d is first chosen by sampling a distribution ρ with certain properties described below. Then d composite message blocks are chosen uniformly and exclusive-ORed together to make each check block. Construction of each check block is independent of all previous and future blocks (the local encodability property), so it is easy to implement, takes only constant time, and requires little memory. Figure 1(a) shows message (Mi), auxiliary (Ai), and check blocks (Ci) connected in a graph G, where edges from M to A represent the outer encoding and edges from M ∪ A to C represent the inner encoding. Check blocks are transmitted through the erasure channel to the receiver for recovery. To decode the page or message, the receiver uses a belief propagation algorithm on the subset of graph G formed by the blocks successfully received:

1. choose a check block C of degree one;
2. recover the contents of its adjacent block M_x as M_x = C ⊕ (⊕ M_i) over all M_i used to construct C originally, with i ≠ x;
3. remove all edges in G incident to the recovered block M_x;
4. repeat until the message is recovered or all check blocks have degree > 1.

Auxiliary blocks are decoded similarly, but after they have been recovered they are treated as check blocks and used to decode any remaining message blocks to which they are adjacent in G.
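The encode/decode pair can be sketched as follows (a minimal illustration with blocks modeled as integers; the degree distribution, seed arithmetic, and omission of auxiliary blocks are simplifying assumptions, not the parameters from [8]):

```python
import random

def make_check_block(block_id, check_seed, composite, rho_cdf):
    """Inner encoding: derive one check block's degree and adjacency from a
    PRNG seeded by (check_seed, block_id), so sender and receiver can
    reproduce it independently of all other blocks (local encodability)."""
    rng = random.Random(check_seed * 1_000_003 + block_id)
    u = rng.random()
    degree = next(d + 1 for d, c in enumerate(rho_cdf) if u <= c)
    adjacency = rng.sample(range(len(composite)), degree)
    payload = 0
    for i in adjacency:
        payload ^= composite[i]  # blocks modeled as ints, not byte arrays
    return adjacency, payload

def peel_decode(n_blocks, received):
    """Belief-propagation "peeling" over received (adjacency, payload) pairs.
    Returns the recovered blocks (None = still unknown)."""
    recovered = [None] * n_blocks
    checks = [[set(adj), payload] for adj, payload in received]
    progress = True
    while progress:
        progress = False
        for chk in checks:
            adj, payload = chk
            # Reduce the check by all already-recovered neighbors.
            for i in [i for i in adj if recovered[i] is not None]:
                payload ^= recovered[i]
                adj.discard(i)
            chk[1] = payload
            if len(adj) == 1:  # degree-one check recovers its last neighbor
                i = adj.pop()
                if recovered[i] is None:
                    recovered[i] = payload
                    progress = True
    return recovered

# Hand-built example mirroring Fig. 1: checks C = M0 ^ M1 and C' = M1.
m0, m1 = 0xAA, 0x55
print(peel_decode(2, [([0, 1], m0 ^ m1), ([1], m1)]))  # → [170, 85]
```

The degree-one check recovers M1 first; peeling then reduces the other check to degree one and recovers M0, exactly the cascade shown in Figures 1(b)–(e).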


Figures 1(b)–(e) show an example of the decoding steps at the receiver. Step (b) shows the state of graph G after check blocks C0, C3, and C4 have been received. The first two blocks are buffered because their degrees are higher than one, but block C4 can be processed by the above algorithm upon its reception. In step (c), the contents of block C4 are copied to M1, which is marked as recovered. Edges incident to M1 are deleted from graph G, so blocks C0 and C3 now have degree one. Block C3 is chosen next and used to recover block M0 = C3 ⊕ M1 in step (d), and the edges of M0 are removed. Check block C0 is similarly used to recover auxiliary block A0 = C0 ⊕ M1 in step (e), and the edges between A0 and the check blocks are removed. No more check blocks have degree one, so the algorithm terminates until another block is received. Iterative decoding spreads the total processing cost across multiple block receptions, which is friendlier to co-hosted real-time processes than batch decoding after k blocks are received, as in Rateless Deluge.

3.2 Synchronizing Sender and Receiver

Online and Raptor codes are in the family of fountain codes, so called because they can generate endless streams of encoded blocks, and the receiver does not require any particular ones as long as a sufficient number of them are received. For unpredictable and arbitrarily low-capacity channels, this property allows RTOC to maintain communication. However, a mechanism is needed to shut off the flow of encoded blocks when the message has been recovered by the receiver, without resorting to multiple rounds of control traffic. To address these requirements, RTOC uses a lightweight protocol for synchronizing the parties, controlling the transmission rate, and terminating an exchange. We borrow the protocol nomenclature and sequencing from IEEE 802.11 messages, but redefine the semantics. A Request To Send (RTS) message bears the transmitter's block size b, total message length n, and a transactional nonce used to seed a pseudo-random number generator (PRNG). The message destination responds with a Clear To Send (CTS) message to acknowledge the RTS and indicate readiness to receive encoded fragments. Each encoded check block, or DATA message, bears the block's identifier, which partially determines the random selections used to construct it. DATA blocks are streamed to the destination; when the original message has been successfully decoded, the neighbor returns an ACK message and the exchange completes. Senders and receivers must agree on the parameters of the online code, ε, δ, and k, and on ephemeral or transactional state as well. The construction of graph G determines the composition of auxiliary and check blocks and must be synchronized. In particular, the random selection of message-to-auxiliary-block mappings, and the generation of random samples from the distribution ρ for constructing check blocks, must be performed identically by both parties. The sequence and dependencies of these steps are shown in Figure 2, viewed from left to right.
The sender of the application message seeds PRNG_S with a private value not shared with the receiver, though it need not be secret. This


Fig. 2. Senders and receivers use synchronized transaction state to compute loss-tolerant pseudo-random block mappings for the online code

PRNG is later used to generate a random identifier for each check block that is sent. After the exchange of the RTS and CTS messages, both parties generate the subset of G consisting of the kn edges defining message-block-to-auxiliary-block mappings, using the transaction nonce to seed PRNG_T. A final value, the check seed, is generated from this PRNG to combine with each check block identifier. Our solution of combining multiple generators satisfies several objectives. Sequentially numbering check blocks would produce a highly autocorrelated input for generating the check block contents, resulting in poor randomness when used with a linear-feedback shift register. For this reason the identifiers are randomly generated by the sender. However, they may be too short to produce a long-period sequence because of constraints on identifier length. They are therefore combined with the check seed and used to seed PRNG_C, which generates the check block degree d from ρ and the adjacent message and auxiliary blocks. As the check seed is derived from the transaction nonce, it also provides randomness among multiple messages sent by a single node. Receivers must be able to determine the contents of each check block independently to cope with loss. This is satisfied by using the identifier, which is unique to the received block, together with the check seed, which is unchanging for all blocks created from a single application page or message. Separately seeding PRNG_C with this combination ensures that both endpoints produce the same pseudo-random stream: first the degree d, then the d blocks to exclusive-OR together (at the sender) or to mark as adjacent and decode (at the receiver). Hence, after a single round-trip exchange to begin the transfer, data flows until an ACK stops it. ACKs are retransmitted if necessary to overcome loss, but no other control messages are needed.

3.3 Stream Termination and Rate Control

Special attention must be given to terminating the stream of check blocks, which is potentially endless. When channel capacity is low, the effective rate (n/c, where c check blocks were transmitted in total) necessary to recover the data may be quite low. Rather than fix the number of check blocks to transmit, which assumes accurate knowledge of the loss rate or requires additional control messages to


finish, we use online rate control. This makes our protocol more robust to high loss rates of both data and control messages and to dynamic channel conditions. Every node maintains an estimate of the loss rate γ̂ for each neighbor link to determine how many check blocks to transmit. When an application message is successfully transmitted to a neighbor at time t, the sender saves the number of check blocks c_t that were required (as reported by the receiver in the terminating ACK message) and the total message blocks n. Using the expectation from [12] that messages are recoverable with high probability from (1 + 3ε)n check blocks, the average loss rate for the completed transmission is computed as:

γ_t = 1 − (1 + 3ε) n / c_t    (1)

The estimated current channel loss to the neighbor is updated as an exponentially weighted moving average:

γ̂_{t+1} = α γ_t + (1 − α) γ̂_t,  for 0 ≤ α ≤ 1    (2)

We estimate the channel loss instead of c directly because we allow the length n to vary freely among application pages or messages. When sending a message at time t + 1 to the same neighbor, the node transmits (1 + 3) n/ (1 − γˆt+1 ) check blocks at the nominal rate supported by the underlying MAC layer. If no ACK has been received to terminate the transmission by this time, the node reduces the sending rate, but continues to send check blocks up to some maximum tolerated cmax . The lower but sustained rate reduces overhead at the sender, while allowing for potentially high losses on the reverse link that interfere with ACKs. Prior to transaction termination with an ACK message, the receiver may periodically notify the sender of the number of check blocks received and message blocks decoded. The sender then updates γˆ to shorten or extend the duration of the full transmission rate period. However, as channel losses may be severe, the original γˆ is used if no updates are received. In contrast with ARQ protocols, which implode under retransmissions in lossy networks [4], RTOC’s transaction control mechanism tolerates high losses. Given that the original RTS and CTS messages are repeated sufficiently many times to overcome channel loss γ and begin the online coding, the receiver can recover the message from any (1 + 3)n check blocks. No further acknowledgment is required, though it does prevent the sender from wasting transmissions up to the maximum tolerated cmax (equivalently, down to a minimum effective rate n/cmax ). High loss conditions that would prevent acknowledgment delivery are also when the maximum number of check blocks are likely to be required, so the waste is small. Conversely, in good conditions when the potential waste cmax − (1 + 3)n is high, an ACK terminates the transaction promptly.
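Equations (1) and (2) and the resulting check block budget can be sketched as follows; the variable names, the α value, and the ε value are assumptions for illustration:

```python
import math

ALPHA = 0.25    # EWMA weight, 0 <= ALPHA <= 1 (value assumed)
EPSILON = 0.15  # online code parameter (the value chosen in Section 4)

def completed_loss_rate(n, c_t, eps=EPSILON):
    """Equation (1): gamma_t = 1 - (1 + 3*eps)*n / c_t."""
    return 1.0 - (1.0 + 3.0 * eps) * n / c_t

def update_estimate(gamma_hat, gamma_t, alpha=ALPHA):
    """Equation (2): EWMA of per-transfer loss rates."""
    return alpha * gamma_t + (1.0 - alpha) * gamma_hat

def blocks_to_send(n, gamma_hat, eps=EPSILON):
    """Check blocks to transmit at the nominal rate for the next message."""
    return math.ceil((1.0 + 3.0 * eps) * n / (1.0 - gamma_hat))

# A transfer of n = 120 blocks that needed c_t = 290 check blocks saw
# roughly 40% loss; the next transfer's budget grows accordingly.
gamma_t = completed_loss_rate(120, 290)
budget = blocks_to_send(120, update_estimate(0.2, gamma_t))
```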

4 Design of Code Parameters

A.D. Wood and J.A. Stankovic

(a) Probability density of ρ, with high density in low degrees and a long, thin tail. Small ε produce very small ρ1 densities. (b) F and the 98% bound F_0.98 for ρ. Contours show the feasible region for n + q = 35.

Fig. 3. Impact of parameters ε and δ on the check block degree distribution ρ and maximum degree F, computed from Equation 3

A key challenge remaining for the use of online codes is the selection of several inter-related encoding parameters that determine efficiency and suitability for

use in WSNs. Maymounkov's analysis of the coding scheme and degree distribution ρ shows that the receiver can recover the original data with high probability after receiving (1 + 3ε)n check blocks [8]. The distribution ρ(d, ε, δ) is given as:

ρ1 = 1 − (1 + 1/F) / (1 + ε),  ρi = (1 − ρ1)F / ((F − 1) i (i − 1)) for 2 ≤ i ≤ F,  F = ⌈(ln(ε/2) + ln δ) / ln(1 − δ)⌉    (3)

A small ε minimizes transmission overhead; however, it also skews the distribution ρ to the right. This increases the average check block degree d and the decoding complexity, which is proportional to n ln(ε/2). Figure 3(a) shows the probability density of ρ for ε = 0.2 and ε = 0.01, with δ = ε/2 in both cases. For ε = 0.01, the value recommended by Maymounkov, the maximum check block degree F given by Equation 3 is 2115, far exceeding the number of composite message blocks n + q needed in this context and consuming valuable memory space. The lookup table for sampling ρ requires up to 4230 B, which is more than the capacity of the MicaZ's SRAM. However, we note that ρ has a long, thin tail, with more than half of its density concentrated in its first two elements ρ1 and ρ2. An implementation could truncate the distribution with little practical effect on decoding performance if large values sampled from ρ are very rare.

To gain a better understanding of the usable range of the ε and δ parameters given WSN constraints, we numerically calculated the least degree d that bounds 98 percent of the cumulative probability density of the distribution ρ: F_0.98 = min d such that Σ_{i ≤ d} ρi ≥ 0.98. Figure 3(b) shows F_0.98 for values of ε ∈ [0.01, 0.3] and δ ∈ [0.005, 0.15], and indicates that small values of ε may be practical depending on the number of blocks n created by fragmentation. For example, a 480 B page sent as 16 B blocks creates about 35 composite blocks to be selected randomly from the distribution ρ. Values of ε and δ for which F ≤ 35 and F_0.98 ≤ 35 are indicated in Figure 3(b)

by contour lines. Truncating F makes a much larger parameter space available, as seen from the difference between the contours.

Fig. 4. Contour map showing ρ1 for ε, δ ∈ [0.01, 0.3]. Parameter settings in the upper-left region do not produce check blocks with degree one, which are required by the given decoding algorithm.

Parameters k and δ affect the online code performance in three ways. First, the number of auxiliary blocks created for each message grows as q = kδn, so for fixed k, higher δ values require more memory buffers. Second, higher δ values increase the probability δ^k that the message cannot be decoded until after (1 + ε)n blocks. Although the foregoing argue against large δ, the last consideration is that small δ values, like small ε values, extend the right tail of ρ and increase decoding complexity.

One further disadvantage of a small ε, ultimately the determining one, affects the use of an online code in our setting. Using the algorithm given above, recovery of the message cannot begin until a check block of degree one (i.e., a copy of a message block) is received. With relatively few message blocks to send in total, it may often happen that the first check block randomly assigned degree one is sent very late, requiring the receiver to buffer check blocks well in excess of the asymptotically expected (1 + 3ε)n bound. The probability ρ1 = ρ(d = 1, ε = 0.01, δ = 0.005) of such a degree-one block is only 0.0094, as shown in Figure 3(a). At this low ρ1 density, there is a 30% chance that a check block of degree one is not sent until after 127 others, which delays decoding and increases buffer occupancy at the receiver.

Maymounkov and Mazières [12] make the simplifying recommendation that δ be chosen as ε/2. However, subject to the constraints and trade-offs discussed, ε and δ may be varied independently. A contour map of ρ1 for values of ε and δ is shown in Figure 4. The upper-left region is infeasible for the given algorithm because ρ1 is either zero or very small, and must be avoided. Outside this region, the parameters may be chosen to yield good performance, as we now describe.
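The distribution in Equation 3 and the ρ1 behavior discussed above can be checked numerically with a short sketch; it is a direct transcription of the formulas, not the authors' code:

```python
import math

def degree_distribution(eps, delta):
    """Online code degree distribution rho and maximum degree F (Equation 3).

    Returns (F, rho), where rho[0] is the degree-one probability rho_1.
    """
    F = math.ceil((math.log(eps / 2) + math.log(delta)) / math.log(1 - delta))
    rho1 = 1 - (1 + 1 / F) / (1 + eps)
    rho = [rho1] + [(1 - rho1) * F / ((F - 1) * i * (i - 1))
                    for i in range(2, F + 1)]
    return F, rho

F, rho = degree_distribution(0.01, 0.005)
print(F, round(rho[0], 4))            # 2115 0.0094
print(round((1 - rho[0]) ** 128, 2))  # 0.3: no degree-one block in 128 draws
F, rho = degree_distribution(0.2, 0.1)
print(F)                              # 44
```

The computed values reproduce the F = 2115 and F = 44 cases and the 30% waiting-time figure quoted in the text.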
Rather than optimizing for asymptotic behavior, selecting higher ε values leads to better performance in the operating region useful for WSNs, that is, for relatively small n. We implemented the encoding and decoding algorithm on a

(a) Message decoding progress for a 120-block message as check blocks are received. Small ε = 0.01, δ = 0.005 cause high variance in this region. (b) The full 120-block message is decoded earlier on average for larger ε = 0.15, δ = 0.01, incurring less transmission overhead.

(c) Number of active check block buffers for the simulation shown in (a). For 2% of the tests, more than 320 buffers were required. (d) Significantly fewer active buffers were required by the simulation shown in (b), for more predictable memory management and buffer re-use.

(e) As the message length n increases, the effective coding rate exhibits high variance due to a low ρ1 density from the small-valued parameters. (f) Larger ε increases the ρ1 density, giving a more consistent and overall higher effective coding rate in this region.

Fig. 5. Simulation results show the influence of ε and δ on cumulative decoding progress, active check block buffers, and the effective coding rate. The median, and the 2nd, 25th, 75th and 98th percentiles of 500 runs are shown.

PC to measure the code performance trade-offs for the parameters ε, δ, and k. From the data collected, we present the median, and the 2nd, 25th, 75th, and 98th percentiles of 500 randomized runs per data point.

Figures 5(a) and 5(b) show the significant differences in cumulative decoding progress for ε = 0.01 and ε = 0.15, respectively, when receiving the check blocks transmitted for a message of n = 120 blocks. A small ε yields a small ρ1 and delays decoding, as evidenced by the larger variance and extended recovery time shown in Figure 5(a). In 2% of the tests, very little of the message was recovered until after 340 check blocks (≈ 2.8n) had been received, at which point decoding rapidly proceeded. A larger ε = 0.15 yields significantly more compact and consistent performance, and the final decoding time is faster, as shown in Figure 5(b). This is particularly beneficial on memory-constrained WSN devices, as buffers allocated to check blocks can be re-used when a block is decoded completely. Figures 5(c) and 5(d) show the large difference in dynamic check block buffer use, with 52% (vs. the median) and 69% (vs. the 98th percentile) less memory required than for ε = 0.01. Fixed memory overhead from the kδn auxiliary blocks is also kept low by parameters δ = 0.01 and k = 1.

The effective coding rate (i.e., the ratio of the number of message blocks to check blocks) was measured for 2 ≤ n ≤ 120, and indicates that these effects are even more pronounced for smaller message lengths. Figure 5(e) shows the wide variability in effective rate for lengths smaller than the n = 120 case of Figures 5(a)–5(d). A linear fit of the ε = 0.15 data shown in Figure 5(f) gives an effective rate of 0.7–0.75 with smaller variance. Our analysis of the online code degree distribution ρ, the impact of its key parameters, and simulation results in the domain of WSN operation lead to the selection of a higher ε to reduce variance by 50–71%, and a relatively small k and δ to reduce fixed overhead.
This enables implementation on memory-constrained devices, trades asymptotic efficiency for good performance in RTOC, and allows the transport protocol to benefit from online coding's algorithmic advantage over other proposed rateless schemes.
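The decoding algorithm this analysis depends on, recovering degree-one check blocks and substituting decoded blocks into their neighbors, can be sketched as a peeling loop; the data layout below is an illustrative assumption:

```python
def peel(check_blocks, n):
    """Peeling decoder sketch for XOR-based check blocks.

    check_blocks: list of (adjacent_indices, xor_of_those_blocks) pairs.
    A degree-one block reveals a composite block; decoded blocks are then
    XOR'd out of the remaining check blocks, which frees their buffers and
    may expose new degree-one blocks.
    """
    recovered = {}
    progress = True
    while progress:
        progress = False
        for adj, value in check_blocks:
            pending = set(adj) - recovered.keys()
            if len(pending) == 1:  # degree one after substitution
                (idx,) = pending
                for j in set(adj) - pending:
                    value ^= recovered[j]
                recovered[idx] = value
                progress = True
    return [recovered.get(i) for i in range(n)]

# Three 8-bit blocks [5, 9, 12] covered by check blocks of degree 1 and 2:
blocks = [({0}, 5), ({0, 1}, 5 ^ 9), ({1, 2}, 9 ^ 12)]
print(peel(blocks, 3))  # [5, 9, 12]
```

Until the first degree-one block arrives, no entry of `recovered` can be filled, which is exactly why a vanishing ρ1 stalls decoding and inflates buffer occupancy.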

5 Experimental Results

Having analyzed and optimized the online coding parameters, we evaluated the performance of RTOC in an embedded implementation for the MicaZ mote to obtain the most realistic and accurate results possible. First, we consider the effect of channel loss on link reliability and transmission overhead. We apply a loss, or erasure, rate of γ to all protocol messages and fragments. In the following tests, a node transmitted 180 messages of length 96 B to a neighbor using block (fragment) lengths of 8 and 16 bytes, and we measured performance under varying loss rates. We induced nominal loss rates of 0–75% by discarding blocks randomly in software on the MicaZ motes. Uncontrolled channel conditions at the time of the experiments further raised the actual loss rates, which are shown in the figures.

(a) PDR for RTOC and fixed-rate codes. (b) Sender and receiver coding rates.

Fig. 6. Effect of channel erasures on packet delivery ratio and effective coding rates, using parameters ε = 0.15, δ = 0.01, k = 1

Fig. 7. Message transmission overhead (the ratio of bytes transmitted, including headers, to bytes of application payload) rises significantly for high loss rates. Shown for comparison are fixed-rate codes.

Despite actual channel plus induced loss rates of up to 84%, RTOC delivers in excess of 95% of packets. Figure 6(a) shows the mean and standard deviation of the PDRs for the tests, which range from 95–100%. Induced losses were applied equally to data fragments and transaction control messages, which must be retransmitted to keep the protocol from stalling. Online codes are designed to decode a message with high probability after (1 + ε)n check blocks are received, which clearly requires many transmissions to overcome high loss rates.

Figure 6(b) shows the effective coding rates for both sender and receiver. The receiver's rate is the ratio of the total number of message blocks to the number of check blocks received before decoding success. It is nearly constant in the range 0.64 to 0.69 and is consistent with the simulation results presented earlier in Section 4. The sender's encoding rate is the ratio of the total number of message blocks to the number of check blocks sent before protocol termination, and it directly reflects the increasing channel loss as it drops from 0.63 to about 0.12.

Figure 7 shows the overhead of the RTOC protocol, fragmentation, and channel losses more directly. We measured the overhead as the ratio of all bytes transmitted, including fragment and TinyOS headers, to the original 96 B payload length. High loss rates, as expected, require the most transmissions and incur high overhead. Larger block sizes are more efficient because the overhead from headers is greatly reduced. For comparison, the behavior of ideal fixed-rate error correction schemes for rates 0.25, 0.5, and 0.75 is also shown in Figures 6(a) and 7. Fixed-rate codes enjoy small overheads, calculated from the design rate (1 to 0.25), header length (5 B), and block size (8–16 B), when matched to the actual loss rate. However, as these schemes are designed to correct only a fixed fraction of errors, PDR drops precipitously when the loss rate exceeds their design rate.

The overheads of RTOC and fixed-rate codes are the result of both the coding rate and fragmentation. Fragmentation alone incurs substantial overhead depending on the block size. For original message payload length P, header length H, block size B, fixed coding rate R, or number of check blocks transmitted c, the overheads are:

Fixed-rate = ⌈P / (B·R)⌉ (H + B) / (H + P),   Online code = c (H + B) / (H + P)    (4)

Short fragments give high overheads, but may be necessary due to application constraints, and may be less prone to erasure in very poor channels.
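Equation (4) can be transcribed directly; the example values (96 B payload, 16 B blocks, 5 B headers, rate 0.75) follow the text, while the check block count c = 10 is an assumed illustration:

```python
import math

H = 5  # header length in bytes, from the text

def fixed_rate_overhead(P, B, R):
    """Equation (4), fixed-rate side: ceil(P/(B*R)) coded blocks are sent."""
    return math.ceil(P / (B * R)) * (H + B) / (H + P)

def online_overhead(P, B, c):
    """Equation (4), online code side: c check blocks were actually sent."""
    return c * (H + B) / (H + P)

# A 96 B payload in 16 B blocks: a rate-0.75 fixed code sends 8 blocks,
# while an online transfer that needed c = 10 check blocks sends 10.
print(round(fixed_rate_overhead(96, 16, 0.75), 2))  # 1.66
print(round(online_overhead(96, 16, 10), 2))        # 2.08
```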

5.1 Discussion

Many systems for WSNs must be adaptable at runtime to handle the wide performance range between normal operation and operation over very poor channels. The overhead of RTOC is primarily due to: (1) fragmentation of pages or messages into smaller blocks, (2) message expansion from the effective coding rate, and (3) streaming of fragments in a transaction. RTOC allows trade-offs in these key areas to maintain efficiency and incur overhead only when necessary for loss resistance. Most mechanisms are automatic and part of the design of the protocol, while selection of the block size is exposed as part of its configuration to allow external control by the application. The use of a rateless erasure code overcomes variable channel loss rates automatically, given proper parameter selection and integration with the transaction control protocol. Through analysis of online coding's degree distribution, we chose parameters ε, δ, and k to achieve stability and good performance in the operating region useful for WSNs. The resulting low coding-rate variance reduces memory pressure on already constrained WSN devices. The check block transmit rate control algorithm described uses the estimated loss rate γ̂ and the bound c_max on check blocks to reduce wasted transmissions from lost termination (ACK) messages. These mechanisms automatically adjust RTOC's behavior to prioritize message delivery despite poor channel conditions. In our embedded evaluation, the protocol transferred over 95% of the messages successfully despite up to 84% induced channel loss.

6 Conclusion

Despite the resource limitations of WSN devices and high channel loss, online coding and RTOC’s synchronization and termination mechanisms provide efficient, reliable data transfer that can serve as a building block for data exfiltration or code updating. We carefully designed the protocol’s parameters to trade asymptotic optimality for predictability in the WSN operating region, and therefore it imposes modest memory and resource requirements on the system. We presented an evaluation of its implementation on embedded hardware to demonstrate its efficiency and performance. Future work may apply our methods to other codes, such as Raptor codes [11], and integrate RTOC with over-the-air reprogramming protocols for high-loss networks.

References

1. Hui, J.W., Culler, D.: The dynamic behavior of a data dissemination protocol for network programming at scale. In: Proc. of SenSys, Baltimore, MD, pp. 81–94 (2004)
2. Paek, J., Govindan, R.: RCRT: Rate-controlled reliable transport for wireless sensor networks. In: Proc. of SenSys, Sydney, Australia, pp. 305–319 (2007)
3. Kim, S., Fonseca, R., Dutta, P., Tavakoli, A., Culler, D., Levis, P., Shenker, S., Stoica, I.: Flush: A reliable bulk transport protocol for multihop wireless networks. In: Proc. of SenSys, Sydney, Australia, pp. 351–365 (2007)
4. Kumar, A.: Comparative performance analysis of versions of TCP in a local network with a lossy link. IEEE/ACM TON 6(4), 485–498 (1998)
5. Hagedorn, A., Starobinski, D., Trachtenberg, A.: Rateless Deluge: Over-the-air programming of wireless sensor networks using random linear codes. In: Proc. of IPSN, St. Louis, MO, pp. 457–466 (2008)
6. Rossi, M., Zanca, G., Stabellini, L., Crepaldi, R., Harris, A., Zorzi, M.: SYNAPSE: A network reprogramming protocol for wireless sensor networks using fountain codes. In: Proc. of SECON, San Francisco, CA, pp. 188–196 (2008)
7. Luby, M.: LT codes. In: Proc. of the IEEE Symposium on Foundations of Computer Science, pp. 271–280 (2002)
8. Maymounkov, P.: Online codes. Technical report, New York University (2002)
9. Heidemann, J., Silva, F., Estrin, D.: Matching data dissemination algorithms to application requirements. In: Proc. of SenSys, Los Angeles, CA, pp. 218–229 (2003)
10. Levis, P., Patel, N., Culler, D., Shenker, S.: Trickle: A self-regulating algorithm for code propagation and maintenance in wireless sensor networks. In: Proc. of NSDI, San Francisco, CA, pp. 15–28 (2004)
11. Shokrollahi, A.: Raptor codes. IEEE Transactions on Information Theory 52, 2551–2567 (2006)
12. Maymounkov, P., Mazières, D.: Rateless codes and big downloads. In: Kaashoek, M.F., Stoica, I. (eds.) IPTPS 2003. LNCS, vol. 2735. Springer, Heidelberg (2003)

Compressed RF Tomography for Wireless Sensor Networks: Centralized and Decentralized Approaches

Mohammad A. Kanso and Michael G. Rabbat
Department of Electrical and Computer Engineering, McGill University, Montreal, Quebec, Canada
[email protected], [email protected]

Abstract. Radio Frequency (RF) tomography refers to the process of inferring information about an environment by capturing and analyzing RF signals transmitted between nodes in a wireless sensor network. In the case where only a few measurements are available, the inference techniques applied in previous work may not be feasible. Under certain assumptions, compressed sensing techniques can accurately infer environment characteristics even from a small set of measurements. This paper introduces compressed RF tomography, an approach that combines RF tomography and compressed sensing for monitoring in a wireless sensor network. We also present decentralized techniques which allow monitoring and data analysis to be performed cooperatively by the nodes. The simplicity of our approach makes it attractive for sensor networks. Experiments with simulated and real data demonstrate the capabilities of the approach in both centralized and decentralized scenarios.

B. Krishnamachari et al. (Eds.): DCOSS 2009, LNCS 5516, pp. 173–186, 2009.
© Springer-Verlag Berlin Heidelberg 2009

1 Introduction

Security and safety personnel need intelligent infrastructure to monitor environments for detecting and locating assets. Tracking assets includes being able to locate humans as well as obstructions. Imagine a situation where a disaster has occurred, and obstructions have blocked certain paths to the safety exit. The ability to detect the locations of these objects in a timely and efficient manner allows a quick response from security personnel directing the evacuation. This paper provides a feasible and efficient approach to monitoring and surveillance using wireless sensor nodes, in which RF tomography is applied to analyze the characteristics of the environment.

RF tomography is the process of inferring characteristics of a medium by analyzing wireless RF signals that traverse it. A wireless signal propagating along a path between a pair of sensors without obstructions loses average power with distance according to [1]:

P̄(d) = P_t − P_0 − 10 n_p log10(d / d_0) dBm,    (1)

where P̄(d) is the average received power at distance d from the transmitting sensor, P_t is the transmitted power, P_0 is the received power at a reference distance d_0, and n_p is

the path loss exponent, which controls how fast power is lost along a path. For instance, n_p ≈ 2 for free-space propagation, and it varies across environments. Received power on a wireless link between nodes i and j can generally be modeled as in [1]:

P_ij = P̄(d) − Z_ij    (2)

Z_ij = X_ij + Y_ij    (3)

where Z_ij is the fading loss, consisting of a shadowing loss X_ij and a non-shadowing loss Y_ij. Thus, the signal attenuation Z_ij on a link allows us to determine whether or not an obstruction lies on its path. RSS (received signal strength) measurements among links provide a means for reconstructing shadowing losses. Wireless signals traversing different obstructions undergo different levels of signal attenuation, depending on the obstruction's nature and composition (e.g., thick walls attenuate signals more than humans). As more measurement links become available, analyzing those links allows us to infer information about objects' locations and properties; as more links cross over the same object, more information is available to reach a solution. Essentially, this information is used to reconstruct a map of power attenuation levels throughout the environment.

Patwari and Agrawal introduced the concept of RF tomography for sensor networks in [1]. They propose a centralized reconstruction method based on weighted least squares estimation. This paper introduces compressed RF tomography, leading to an ℓ1-penalized reconstruction criterion, and proposes decentralized schemes for simultaneously carrying out measurements and reconstruction. After a formal problem statement in Section 2, we introduce compressed RF tomography in Section 3. Experiments using simulated and real data are reported in Section 4. Section 5 describes two decentralized reconstruction approaches, which are then compared via simulation in Section 6, and we conclude in Section 7.
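The propagation model of equation (1) can be sketched numerically; the parameter values below are illustrative, not measurements:

```python
import math

def avg_received_power(Pt, P0, n_p, d, d0=1.0):
    """Equation (1): mean RSS (in dBm) at distance d under log-distance path loss."""
    return Pt - P0 - 10.0 * n_p * math.log10(d / d0)

# With n_p = 2 (free space), doubling the distance costs about 6 dB:
drop = avg_received_power(0, 30, 2, 10) - avg_received_power(0, 30, 2, 20)
print(round(drop, 2))  # 6.02
```

The measured shadowing loss Z_ij is then simply the gap between this prediction and the observed RSS, per equations (2) and (3).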

2 Problem Formulation

Assume that sensor nodes are deployed according to Figure 1(a), around the perimeter of a region to be monitored. Each line in Figure 1(a) corresponds to a wireless link. The monitored region is divided into a grid of pixels p ∈ R^n. Each pixel's value reflects the amount of signal attenuation over its area. Once this information is available, it can be displayed in grayscale, where a darker intensity corresponds to more attenuation. We assume that each pixel has a constant attenuation loss over its region. Also, we let the shadowing losses over the links be collected in a vector v ∈ R^k. The total shadowing loss of link i, represented as v_i, is modeled as a weighted sum over the pixels crossed by this link, plus noise. The attenuation over each link in the network can be expressed in matrix form as follows:

v = Ap + n,    (4)

where n is Gaussian noise (in dB) with variance σ_n², and the entries of A are defined by

A_ij = d^o_ij / √d_i  if link i traverses pixel j,  and A_ij = 0 otherwise    (5)

where d_i is the length of link i and d^o_ij is the overlap distance covered by link i through pixel j. The division by √d_i parallels the adopted shadowing model [2]. The number of rows in A equals the number of existing links, and the number of columns equals the number of pixels.

Fig. 1. Figure (a) shows a wireless sensor network in an RF tomographic surveillance scenario. Figure (b) displays a single link passing through a set of pixels.

To monitor the environment, we must acquire a set of measurements and perform analysis on them. A simple centralized algorithm, as proposed in [1], can now be described:

1. Nodes acquire signal strength measurements and forward them to the central server.
2. The server computes the power difference on each link, P̄_ij − P_ij, and stores the results in a vector v.
3. The server reconstructs the vector p̂ to find the attenuation level over each pixel.

The reconstruction approach described in [1] for recovering p from v involves solving a simple weighted least squares (WLS) estimator, which is efficient to implement on sensor nodes. However, least squares methods usually require an overdetermined system of equations to provide acceptable results. If p is sparse enough and only a few measurements are available, ℓ1 reconstruction techniques provide an attractive solution. Henceforth, we adopt this approach and investigate its performance.
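Steps 1–2 and the construction of A in equation (5) might look as follows in a simulation; the link geometry and all values are assumed, and computing per-pixel overlaps from node coordinates is left out:

```python
import math

def build_A(links, num_pixels):
    """Measurement matrix of equation (5).

    links: list of (d_i, {pixel_j: overlap_ij}) pairs; deriving each link's
    per-pixel overlap from node coordinates is assumed done elsewhere.
    """
    A = [[0.0] * num_pixels for _ in links]
    for i, (d_i, overlaps) in enumerate(links):
        for j, d_o in overlaps.items():
            A[i][j] = d_o / math.sqrt(d_i)
    return A

def shadowing_losses(predicted, measured):
    """Step 2 of the centralized algorithm: v_i = P_bar_i - P_i (in dB)."""
    return [p_bar - p for p_bar, p in zip(predicted, measured)]

# A 10 m link crossing pixels 0 and 1 with 6 m and 4 m of overlap:
A = build_A([(10.0, {0: 6.0, 1: 4.0})], 3)
print([round(a, 3) for a in A[0]])         # [1.897, 1.265, 0.0]
print(shadowing_losses([-60.0], [-72.5]))  # [12.5]
```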

3 Compressed RF Tomography

As mentioned earlier, the approach we propose in this paper involves compressed sensing of RSS measurements to discover characteristics of the medium. Compressed sensing (CS) [3,4] is a modern approach for recovering signals with sparse representations from a set of measurements. Under conventional Shannon/Nyquist sampling theory, a bandlimited signal must be sampled at double its bandwidth for perfect reconstruction. Compressed sensing shows that undersampling a sparse signal at a rate well below its Nyquist rate may still allow perfect recovery of all signal components under certain conditions. Due to the assumption of few changes in our environment (i.e., sparse p), compressed sensing applies naturally. This assumption may hold, for example, in border monitoring or nighttime security monitoring at a bank. In these examples few changes are expected at any given time, which means that only a few pixels will contain significant attenuation levels. With this in mind, compressed sensing can be combined with RF tomography to enable monitoring with fewer required measurements.

An m-sparse signal is a signal that contains at most m nonzero elements. A typical signal of length n with at most m nonzero components (m ≪ n) requires iterating over all of the signal's elements to determine the few nonzero components. The challenging aspect is the recovery of the original signal from the set of measurements. In general, this type of recovery is possible under certain conditions on the measurement matrix [3]. Reconstruction succeeds with a probability that increases with the sparsity of the signal. Prior knowledge of the sparsity of the vector p allows us to reconstruct it from another vector v of k measurements by solving the optimization problem

p̂ = arg min_p ||p||_0 subject to v = Ap,    (6)

where A is defined above and ||p||_0 is defined as the number of nonzero elements in p. Unfortunately, equation (6) is a non-convex, NP-hard optimization problem and is computationally intractable: reconstructing the signal requires searching through all (n choose m) sparse subspaces [5]. Researchers [3,4,6] have shown that an easier, equivalent problem to (6) can be solved:

p̂ = arg min_p ||p||_1 subject to v = Ap    (7)

where ||p||_1 is now the ℓ1 norm of p, defined as ||p||_1 = Σ_{i=1}^{n} |p_i|. The optimization problem in (7) is convex, and there are numerous algorithms to compute its solution [3,7]. Among the first solutions used was linear programming, also referred to as Basis Pursuit [3], which requires O(m log n) measurements to reconstruct an m-sparse signal. In practical applications, measured signals are perturbed by noise as in (4). In this situation, (7) becomes inappropriate for estimating p, since the solution should take the perturbation into account. The Least Absolute Shrinkage and Selection Operator (LASSO) [8,9] is a popular sparse estimation technique which solves

p̂ = arg min_p λ||p||_1 + (1/2)||v − Ap||_2^2,    (8)

where λ regulates the tradeoff between sparsity and signal intensity. Note that this method requires no prior knowledge of the noise power in the measurements. Alternatively, iterative greedy algorithms such as Orthogonal Matching Pursuit (OMP) exist [10]. OMP is known for being more practical to implement and faster than ℓ1-minimization approaches. The tradeoff is the extra number of measurements needed and reduced robustness to noise in the data. A detailed description of the algorithm can be found in [10]. OMP is particularly attractive for sensor network applications since it is computationally simple to implement. However, LASSO can still be a feasible solution, especially when reconstruction happens only on a more powerful receiver. For this reason, we compare the performance of both centralized techniques in our simulations and show their tradeoffs.
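A minimal OMP sketch, following the standard greedy algorithm of [10] rather than a tuned implementation, illustrates the recovery step; the measurement matrix below is a hand-built toy example, not data from the paper:

```python
import numpy as np

def omp(A, v, m):
    """Orthogonal Matching Pursuit: greedily pick the column of A most
    correlated with the residual, then re-fit the chosen columns by
    least squares, for at most m iterations."""
    residual = v.astype(float)
    support = []
    for _ in range(m):
        j = int(np.argmax(np.abs(A.T @ residual)))
        if j not in support:
            support.append(j)
        coef, *_ = np.linalg.lstsq(A[:, support], v, rcond=None)
        residual = v - A[:, support] @ coef
        if np.linalg.norm(residual) < 1e-12:
            break
    p_hat = np.zeros(A.shape[1])
    p_hat[support] = coef
    return p_hat

# Recover a 2-sparse p from k = 4 measurements of an n = 6 pixel map:
s = 1 / np.sqrt(2)
A = np.array([[1, 0, 0, 0, s, 0],
              [0, 1, 0, 0, s, 0],
              [0, 0, 1, 0, 0, s],
              [0, 0, 0, 1, 0, s]])
p = np.array([0, 0, 1.0, 0, 3.0, 0])
p_hat = omp(A, A @ p, m=2)
print(np.allclose(p_hat, p))  # True
```

In practice A would be the matrix of equation (5) and v the noisy link measurements; with noise, the stopping rule would use a tolerance tied to the noise power instead of exact residual elimination.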

4 Simulations and Results: Centralized Reconstruction This section presents an evaluation of Compressed RF Tomography. We present results from computerized simulations as well as some results from real sensor data. The primary focus is on the accuracy of results obtained by a compressed set of measurements. Accuracy in this case is measured in terms of mean squared error of the reconstructed signal. For better visibility, the recovered values in pˆ from the reconstruction technique are mapped onto a vector p˜ whose values are in [0,1]. Mapping can be a simple linear transformation onto [0,1], or by a nonlinear transformation as in [1] for better contrast. This allows an easy representation of p˜ on a grayscale as in Figure 2. The area under simulation is a square area surrounded by 20 sensor nodes, transmitting to each other. Each node exchanges information one way with 15 other nodes, as shown in Figure 1(a). This yields a total of 20×15 = 150 possible links. Figure 2 illus2 trates how our approach can monitor an environment with 30 links at difference noise

(a) A monitored area with few obstructions discovered (σ_n² = 0.01 dB²)

(b) A monitored area with few obstructions (σ_n² = 0.49 dB²)

Fig. 2. Simulated environment under surveillance showing the discovered obstructions

178

M.A. Kanso and M.G. Rabbat

Fig. 3. Performance Comparison of LASSO and OMP with 15 measurements (total=150)

levels. The figure shows dark pixels at 4 different positions, each corresponding to an existing obstruction at its location. Next we examine the effect of noisy measurements on the performance of the design. To gain insight into the amount of error caused by noise, the accuracy of the system is compared to the very low noise (effectively noiseless) case, with accuracy measured by the mean squared error (MSE). We monitor the same obstructions as in Figure 2, with noise added to the measurements to examine its effect on accuracy. Performance results are plotted in Figure 3. As the figure shows, the noise level is reflected in the accuracy of the reconstruction: high noise levels cause inaccuracies in measurements and hence higher MSEs, while at lower noise levels accurate monitoring is achieved even with few measurements. Note that at low noise levels the MSE is dominated by the small number of measurements available (only 30 in this case). Comparing the performance of the LASSO and OMP techniques, LASSO clearly performs better, especially when noise levels are high. At low noise the two techniques behave similarly, which favors the less complex OMP under such conditions. Even at high noise levels, one can still monitor some of the obstructions. Compressed RF tomography is also well-suited to the case where only a subset of measurements is available. This scenario can occur when some nodes are put to sleep to save battery power, when some sensors malfunction, or when links are dropped. To demonstrate the power of the reconstruction algorithms, we simulate the same obstruction scenario as in Figure 2 with a varying number of links used (out of the total 150 link measurements). Figure 4 shows how the MSE varies as more link measurements are added at a fixed noise level. As the figure shows, LASSO performs better than OMP when few measurements are available.
OMP and LASSO provide identical results once roughly 25% of the link measurements are available. The simulations show that OMP requires more measurements than LASSO to obtain the same level of accuracy. We also experimented with our approach on the data used in [11]. In Figure 5, we compare our reconstruction approach, which uses a small subset of measurements, to the approach that uses all measurements [1]. Figure 5 also demonstrates that compressed RF tomography can accurately monitor an environment if sparsity in the medium is


Fig. 4. Performance comparison of LASSO and OMP with a varying number of measurements (noise variance 0.16 dB²; 150 links in total)

(a)  (b)

Fig. 5. Testing compressed RF tomography on real sensor data: the least-squares reconstruction approach using all links in (a), and our compressed approach using 15 links in (b)

satisfied. Observe that ℓ1-minimization in Figure 5(b) removes inaccuracies present in Figure 5(a) due to noise and non-line-of-sight components.
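The denoising behavior described above can be reproduced on synthetic data: a minimum-norm least-squares solution spreads measurement noise over every pixel, while the ℓ1-penalized solution thresholds it away. This sketch is illustrative only (it is not the experiment of [1] or [11]); the sizes and the regularization weight are assumptions.

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(1)
n, k = 100, 60                        # pixels, available link measurements
p_true = np.zeros(n)
p_true[[7, 23, 55, 80]] = 1.0         # four obstructions

A = rng.standard_normal((k, n)) / np.sqrt(k)
v = A @ p_true + 0.05 * rng.standard_normal(k)   # noisy measurements

p_ls = np.linalg.pinv(A) @ v          # min-norm least squares: noise everywhere
p_l1 = Lasso(alpha=2e-3, max_iter=10000).fit(A, v).coef_  # l1 thresholds noise

print("nonzero pixels, least squares:", int(np.sum(np.abs(p_ls) > 1e-6)))
print("nonzero pixels, l1:           ", int(np.sum(np.abs(p_l1) > 1e-6)))
```

The ℓ1 solution keeps only a handful of pixels active, mirroring the contrast between Figures 5(a) and 5(b).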

5 Decentralized Reconstruction Techniques

Thus far, we have considered a centralized approach to the reconstruction problem: wireless nodes continuously transmit their data to a fusion center, which handles all data processing and analysis. In this section, we consider decentralized, in-network processing that achieves (nearly) the same performance as the centralized approach. While many tasks can be distributed in a sensor network, our focus in this work is to efficiently solve the following optimization problem:

p̂ = arg min_p ||Ap − v||_2^2 + λ||p||_1.  (9)


Distributed compressed sensing in sensor networks has been investigated in previous works [12,13], but the distributed aspect there lay in the joint sparsity of the signals; our concern in this paper is a distributed reconstruction mechanism. In this section, we tailor certain optimization techniques to solving a compressed sensing problem cooperatively in a sensor network. Solving optimization problems in a distributed fashion in sensor networks has been investigated because of its benefits over a centralized approach [14,15]. A fusion center in a centralized system constitutes a single point of failure, and it must possess more powerful capabilities than the sensor nodes in order to process the signal measurements gathered across the network. Wireless link failures can also heavily degrade a centralized system's performance, as less information gets through to the fusion center. Some nodes may be distant from the server, which requires them to spend more energy on communication (Power ∝ 1/distance²), reducing the lifetime of the network. Distributed algorithms, on the other hand, do not suffer from these problems: processing is performed cooperatively, distributing the workload evenly over all active nodes, and even if certain nodes malfunction, monitoring can continue with the remaining functional nodes. In this work, we introduce CS reconstruction techniques based on two approaches, incremental subgradient methods and projection onto convex sets (POCS), and we differentiate between deterministic and randomized implementations of each. We next discuss in detail how these methods apply in our setting, along with performance results for comparison.

5.1 Incremental Subgradient Optimization

Gradient methods are well-known techniques for convex optimization problems. One of their advantages is their simplicity, a property well suited to a wireless sensor network.
However, minimizing a convex function via a gradient method requires the function to be differentiable. Subgradient methods generalize standard gradient descent to non-differentiable functions and share many of its properties. For a convex, non-differentiable function f : R^n → R and any point p_0 ∈ R^n, there exists g ∈ R^n such that

f(p) ≥ f(p_0) + (p − p_0)^T g  for all p ∈ R^n,  (10)

where g is called a subgradient of f at p_0. The set of all subgradients of f at a point p is called the subdifferential of f at p, denoted ∂f(p). Note that when f is differentiable at p, ∂f(p) = {∇f(p)}, i.e., the gradient is the only subgradient. Incremental subgradient methods, originally introduced in [16], split the cost function into smaller component functions; the algorithm works iteratively by sequentially taking steps along the subgradients of those component functions. In a sensor network environment, the incremental process iterates through the measurements acquired at each node so that all nodes converge to the solution. To distribute the optimization task among sensor nodes, the cost function in (9) is split into smaller component functions. Assuming there are N sensor nodes in total


in the network, each gathering a (not necessarily equal) share of the measurements, our problem can now be written as

p̂ = arg min_p ||Ap − v||_2^2 + λ||p||_1
  = arg min_p Σ_{i=1}^{N} f_i(p),   where   f_i(p) = Σ_{j∈M_i} ((Ap)_j − v_j)^2 + (λ/N)||p||_1,  (11)

and M_i is the set of RSS measurements acquired by node i. In each cycle, all nodes iteratively update p in a sequence of subiterations. The update equation in the decentralized subgradient approach becomes

p^(c+1) = p^(c) − μ g_i(p^(c)),  (12)

where μ is a step size, c is the iteration number, and g_i(p^(c)) is a subgradient of f_i at p^(c) at node i. Rates of convergence have been analyzed in detail by Nedić and Bertsekas [17], who show that under certain conditions the algorithm is guaranteed to converge to an optimal value. Convergence results depend, however, on how the step size μ is chosen and on whether iterations are performed deterministically (in round-robin fashion, for instance) or randomly. In a deterministic approach, nodes perform updates in a fixed cycle; in a randomized approach, the updating node is chosen uniformly at random, avoiding the need to implement a cycle. Assuming that each sensor i acquires a set of measurements v_i via its sensing matrix A_i, the subgradient that each node uses in its update equation can be expressed elementwise as

g_i(p)_w =
  (2A_i^T(A_i p − v_i))_w + (λ/N) sgn(p_w),  if p_w ≠ 0,
  (2A_i^T(A_i p − v_i))_w + λ/N,             if p_w = 0 and (2A_i^T(A_i p − v_i))_w < −λ/N,
  (2A_i^T(A_i p − v_i))_w − λ/N,             if p_w = 0 and (2A_i^T(A_i p − v_i))_w > λ/N,
  0,                                         otherwise,  (13)

where sgn(·) is the sign function and (x)_w denotes element w of vector x.

5.2 Projection on Convex Sets (POCS) Method

In addition to the incremental subgradient algorithm discussed above, we propose a distributed POCS method suited to a sensor network environment. One important drawback of subgradient algorithms is that they can suffer from slow convergence if step sizes are not properly set, and the rate of convergence is the issue most relevant to our setup. As the simulations will demonstrate, POCS provides a feasible solution to this problem, at an additional price in complexity. The basic idea of POCS is to project the data iteratively onto the constraint sets. One appealing benefit of this method is that further constraints can be added to the optimization problem without significantly changing the algorithm. Furthermore, POCS is known to converge much faster than incremental subgradient algorithms. POCS has


been used in the area of image processing [18]. In the area of compressed sensing, POCS methods have been employed for data reconstruction [6], but not in a distributed fashion. Let B be the ℓ1 ball

B = {p ∈ R^n : ||p||_1 ≤ ||p*||_1},  (14)

and let H be the hyperplane

H = {p ∈ R^n : Ap = v}.  (15)

The reconstructed data must both explain the observations v and possess sparse features; the sets H and B enforce these requirements. Since both sets are convex, the algorithm performs projections onto H and B in an alternating fashion. One of the challenges is projecting onto H, since it requires solving

arg min_p ||Ap − v||.  (16)

Fortunately, this is a simple optimization problem whose solution can be expressed in compact form via the pseudoinverse (Moore–Penrose inverse). Since each sensor i acquires a set of measurements v_i via a sensing matrix A_i, the POCS algorithm can be run iteratively at every node. In other words, the hyperplane H becomes the intersection of the hyperplanes H_i = {p ∈ R^n : A_i p = v_i}. Each node performs alternating projections onto B and H_i and broadcasts the result to the sensor network. The projection onto a hyperplane H_i can be expressed as

proj_{H_i}(x) = x + A_i^+ (v_i − A_i x),  (17)

where A_i^+ is the pseudoinverse of A_i. Note that the hyperplane projection step in [6] involved the inverse (AA^T)^{−1} rather than a pseudoinverse; the sensing matrices used throughout this work yield non-invertible AA^T, so we naturally use the pseudoinverse, which is computed via the singular value decomposition (SVD). The projection onto B is essentially a soft-thresholding step, applied elementwise:

proj_B(x)_w = x_w − λ,  if x_w > λ,
              x_w + λ,  if x_w < −λ,
              0,        otherwise.  (18)
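A minimal sketch of the two projection steps (17) and (18) at a single node follows. The problem sizes, the threshold value, and the alternation schedule are illustrative assumptions, not settings from the paper.

```python
import numpy as np

def proj_hyperplane(x, Ai, vi, Ai_pinv):
    # Projection onto H_i = {p : A_i p = v_i} via the pseudoinverse, as in (17)
    return x + Ai_pinv @ (vi - Ai @ x)

def proj_l1(x, lam):
    # Soft-thresholding step used as the projection associated with B, as in (18)
    return np.sign(x) * np.maximum(np.abs(x) - lam, 0.0)

rng = np.random.default_rng(2)
n, k = 50, 15                        # pixels, measurements at this node (assumed)
p_true = np.zeros(n)
p_true[[3, 17, 41]] = 1.0
Ai = rng.standard_normal((k, n)) / np.sqrt(k)   # one node's sensing matrix
vi = Ai @ p_true
Ai_pinv = np.linalg.pinv(Ai)         # computed once; internally uses the SVD

p = np.zeros(n)
for _ in range(200):                 # alternate the two projections
    p = proj_l1(proj_hyperplane(p, Ai, vi, Ai_pinv), lam=0.01)

print("residual after alternation:", np.linalg.norm(Ai @ p - vi))
```

Computing the pseudoinverse once per node and reusing it keeps the per-iteration cost down to matrix-vector products.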

5.3 Centralized and Decentralized Tradeoffs

The formulation of our decentralized approach in (11) has the attractive property that it can be run in parallel among the nodes. Since the objective function is expressed as a sum of separate components, each node can independently work on one component. However, each node must have an up-to-date value of p at each iteration. A decentralized implementation therefore has a node perform an update on p, using incremental


subgradient or POCS techniques, and then broadcast the new value to all other nodes. Note that gathering RSS measurements and broadcasting p can be done at the same time, saving battery power. No other communication is required, since each node acquires its own measurements in v and has its own fixed entries in the matrix A, so the communication overhead is acceptable for a wireless sensor network. In a network of N sensor nodes, a single iteration of a centralized scheme is equivalent to N iterations (or an average of N iterations, in a randomized setting) of a decentralized scheme. The centralized approach involves transmitting O(k) values, for k RSS measurements in the network; compressed sensing theory indicates that k = O(m log n), where m is the number of nonzero elements in p, so centralized communication involves transmitting O(m log n) values per iteration. In a decentralized setting, a single iteration involves each node sending an updated version of p: at most O(Nn) values are transmitted, where n is the dimension of p (generally N < n). But since p is a sparse vector, basic data compression methods can decrease packet sizes to O(Nm). Also, since RF tomography already requires all nodes to communicate with each other, no extra routing cost is incurred in broadcasting p throughout the network. Comparing O(m log n) with O(Nm), and observing that generally log n < N, shows that more communication is required in the decentralized approach. Interestingly, if n is large enough (a large number of pixels), decentralized processing would require less communication than centralized processing. Nevertheless, nodes still spend more battery power performing iterative updates on p; these local computations consist of the simple matrix operations described earlier.
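To make the O(·) comparison concrete, the two per-iteration message counts can be evaluated for a hypothetical deployment; the constants hidden by the O-notation are ignored, and all three parameter values are assumed for illustration.

```python
import math

n = 10_000   # pixels in the image p (assumed)
m = 4        # nonzero pixels, i.e., obstructions (assumed)
N = 20       # sensor nodes (assumed)

centralized = m * math.log(n)    # O(m log n) values sent to the fusion center
decentralized = N * m            # O(N m) broadcast values per iteration

print(f"centralized:   ~{centralized:.0f} values per iteration")
print(f"decentralized: ~{decentralized:.0f} values per iteration")
# The comparison flips in favor of decentralization only when log n > N,
# i.e., for n larger than e^N.
```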
From an energy point of view, a centralized approach generally provides lower communication overhead, longer network lifetime, and faster processing, since all information is gathered at the start of the first iteration. In practice, however, a decentralized scheme provides more robustness to server and link failures. An optimal approach would combine centralized and decentralized techniques in a hybrid architecture, to exploit the advantages of each simultaneously.

6 Simulations and Discussion: The Decentralized Approach

Using the same environment as in Figure 2, we simulate our distributed algorithms for compressed RF tomographic imaging. Since there is no prior information about the monitored environment, the algorithms are initialized to zero. A hidden advantage of the proposed algorithms is that they can be run in warm-start mode, continuing from the results of previous iterations; if there is no significant motion in the environment, one can therefore expect faster convergence. The incremental subgradient and POCS methods are tested in both deterministic and randomized settings. In the randomized setting, on each iteration a randomly chosen node updates its results and broadcasts them to the other nodes in the network. We simulate the environment with a noise level of 0.0025 dB² and 30 available links for 200 iterations, using a step size of 0.3 for the subgradient approach. Figure 6 demonstrates that within 2 cycles (40 iterations) the reconstructed data becomes close to its optimal value. Notice that the POCS method performs considerably


(a) Cost function versus number of iterations

(b) MSE versus number of iterations Fig. 6. Comparing our decentralized approaches by varying the number of iterations

better than the incremental approach. This is mainly due to the constant step size assumed; ideally, an adaptive step size would be employed. The deterministic variants perform better than the randomized ones, especially during the first iterations. This is expected, since a deterministic approach guarantees that nodes perform updates in a fixed order, whereas in a randomized approach some nodes may update the solution more frequently than others.
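The deterministic cycle can be sketched end-to-end using the update (12) with the subgradient (13). The network size, step size, and λ below are deliberately small illustrative assumptions, not the paper's simulation settings; in particular, with this unnormalized synthetic data the step size must be much smaller than 0.3 for stability.

```python
import numpy as np

rng = np.random.default_rng(3)
n, N, per_node = 100, 10, 3          # pixels, nodes, measurements per node
lam, mu, cycles = 1e-3, 0.005, 200   # regularization, step size, cycles (assumed)

p_true = np.zeros(n)
p_true[[5, 40, 77]] = 1.0            # three obstructions
A = [rng.standard_normal((per_node, n)) / np.sqrt(per_node) for _ in range(N)]
v = [Ai @ p_true for Ai in A]        # noiseless RSS-style measurements

def subgradient(p, Ai, vi):
    # Subgradient of f_i(p) = sum_j ((A_i p)_j - v_j)^2 + (lam/N)||p||_1, per (13)
    grad = 2.0 * Ai.T @ (Ai @ p - vi)
    g = grad + (lam / N) * np.sign(p)                 # entries with p_w != 0
    z = (p == 0)                                      # entries with p_w == 0
    g[z] = np.sign(grad[z]) * np.maximum(np.abs(grad[z]) - lam / N, 0.0)
    return g

def objective(p):
    return sum(np.sum((Ai @ p - vi) ** 2) for Ai, vi in zip(A, v)) \
        + lam * np.abs(p).sum()

p = np.zeros(n)                      # no prior information: start from zero
for _ in range(cycles):              # deterministic round-robin cycles
    for i in range(N):               # node i updates p and broadcasts it, per (12)
        p = p - mu * subgradient(p, A[i], v[i])

print("objective:", objective(p))
print("MSE:", np.mean((p - p_true) ** 2))
```

Each inner-loop pass corresponds to one node updating p and broadcasting it; a randomized variant would replace the inner loop index with a uniformly drawn node.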

7 Conclusions and Future Work

In this paper we have introduced the idea of compressed sensing into RF tomographic imaging and have proposed models for centralized and decentralized processing. We have explored the benefits of our approach, along with an overview of the underlying theory and simulations. The combination of compressed sensing and RF tomography yields an energy-efficient approach to environmental monitoring. RF tomography by itself is an inexpensive monitoring approach, since it relies on simple RSS measurements and basic data analysis. Extending the lifetime of a wireless sensor network while keeping


reliable performance is a challenge in itself [19]. Network lifetime can be especially important in cases of unexpected power outages. Compressed RF Tomography targets efficiency and energy savings by minimizing the number of measurements and the number of active nodes in the network. Moreover, since few measurements can be as informative as many, the network gains a degree of fault tolerance. Finally, the decentralized scheme allows nodes to cooperatively analyze data without the bottleneck of a fusion center. Simulations have supported the validity of the design and provided a comparison between greedy iterative and ℓ1-minimization techniques on one hand, and centralized and decentralized techniques on the other, illustrating the tradeoff between performance and simplicity of implementation. Performance was examined by investigating the effects of noise and of the number of available measurement links, and the incremental subgradient and POCS methods demonstrated their validity and tradeoffs through simulations. Our future work involves investigating the benefits of exploiting prior information about the environment to choose an optimal set of measurements, exploring other optimization techniques that can be applied in a distributed fashion, and generalizing our design to more complicated environments and sensor node deployments, for which an optimal positioning scheme is to be found.

Acknowledgements We thank N. Patwari and J. Wilson from the University of Utah for sharing their sensor network measurements. We also gratefully acknowledge support from NSERC Discovery grant 341596-2007 and FQRNT Nouveaux Chercheurs grant NC-126057.

References

1. Patwari, N., Agrawal, P.: Effects of correlated shadowing: Connectivity, localization, and RF tomography, pp. 82–93 (April 2008)
2. Patwari, N., Agrawal, P.: NeSh: A joint shadowing model for links in a multi-hop network. In: Proc. IEEE International Conference on Acoustics, Speech and Signal Processing, pp. 2873–2876 (2008)
3. Donoho, D.: Compressed sensing. IEEE Trans. on Information Theory 52(4), 1289–1306 (2006)
4. Candes, E., Romberg, J., Tao, T.: Robust uncertainty principles: Exact signal reconstruction from highly incomplete frequency information. IEEE Trans. on Information Theory 52(2), 489–509 (2006)
5. Candes, E., Tao, T.: Decoding by linear programming. IEEE Trans. on Information Theory 51(12), 4203–4215 (2005)
6. Candes, E.J., Tao, T.: Near-optimal signal recovery from random projections: Universal encoding strategies? IEEE Trans. on Information Theory 52(12), 5406–5425 (2006)
7. Figueiredo, M., Nowak, R.D., Wright, S.: Gradient projection for sparse reconstruction: Application to compressed sensing and other inverse problems. IEEE Journal of Selected Topics in Signal Processing 1(4), 586–597 (2007)
8. Tibshirani, R.: Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society, Series B 58, 267–288 (1996)


9. Efron, B., Hastie, T., Johnstone, I., Tibshirani, R.: Least angle regression. Annals of Statistics 32(2), 407–499 (2004)
10. Tropp, J., Gilbert, A.: Signal recovery from random measurements via orthogonal matching pursuit. IEEE Trans. on Information Theory 53(12), 4655–4666 (2007)
11. Wilson, J., Patwari, N.: Radio tomographic imaging with wireless networks. Technical report, University of Utah (2008)
12. Duarte, M., Sarvotham, S., Baron, D., Wakin, M., Baraniuk, R.: Distributed compressed sensing of jointly sparse signals. In: Thirty-Ninth Asilomar Conference on Signals, Systems and Computers (November 2005)
13. Haupt, J., Bajwa, W., Rabbat, M., Nowak, R.: Compressed sensing for networked data. IEEE Signal Processing Magazine 25(2), 92–101 (2008)
14. Rabbat, M., Nowak, R.: Distributed optimization in sensor networks. In: Third International Symposium on Information Processing in Sensor Networks (IPSN), pp. 20–27 (April 2004)
15. Johansson, B.: On distributed optimization in networked systems. PhD thesis, Royal Institute of Technology (KTH) (2008)
16. Kibardin, V.M.: Decomposition into functions in the minimization problem. Automation and Remote Control 40(1), 109–138 (1980)
17. Nedić, A., Bertsekas, D.: Convergence rate of incremental subgradient algorithms. In: Stochastic Optimization: Algorithms and Applications. Kluwer Academic Publishers, Dordrecht (2000)
18. Gubin, L.G., Polyak, B.T., Raik, E.V.: The method of projections for finding the common point of convex sets. USSR Computational Mathematics and Mathematical Physics 7(6) (1967)
19. Akyildiz, I., Su, W., Sankarasubramaniam, Y., Cayirci, E.: A survey on sensor networks. IEEE Communications Magazine 40(8), 102–114 (2002)

Energy Adaptive Sensor Scheduling for Noisy Sensor Measurements

Suhinthan Maheswararajah¹, Siddeswara Mayura Guru², Yanfeng Shu², and Saman Halgamuge¹

¹ University of Melbourne, Parkville, Vic 3010, Australia
  [email protected], [email protected]
² CSIRO Tasmanian ICT Centre, GPO Box 1538, Hobart 7001, Australia
  [email protected], [email protected]

Abstract. In wireless sensor network applications, sensor measurements are corrupted by noise resulting from harsh environmental conditions, hardware, and transmission errors. Minimising the impact of noise in an energy-constrained sensor network is a challenging task. We study the problem of estimating environmental phenomena (e.g., temperature, humidity, pressure) from noisy sensor measurements so as to minimise the estimation error. An environmental phenomenon is modeled using linear Gaussian dynamics, and the Kalman filtering technique is used for the estimation. At each time step, a group of sensors is scheduled to transmit data to the base station so as to minimise the total estimation error for a given energy budget. The sensor scheduling problem is solved by dynamic programming and one-step-look-ahead methods, and simulation results are presented to evaluate the performance of both. The dynamic programming method produces better results than the one-step-look-ahead method, at a higher computational cost.

1 Introduction

In real-world applications, it is impossible to capture error-free measurements from the environment. Environmental conditions such as strong temporal and spatial variation of temperature, pressure, electromagnetic noise, and radiation interfere with sensor measurements, resulting in imprecise observations [1]. Low-cost sensors, which are typically deployed in wireless sensor networks, further worsen the situation, with more errors added by the error-prone hardware used for sensing and transmitting [2]. These sensors are also subject to greater drift than higher-quality, more expensive sensors as a result of internal stabilisation [3]. Errors associated with sensor measurements can be classified as systematic and random [4]. Systematic errors are predictable, are caused by environmental conditions, and can be corrected by calibration [5]. Random errors, however, are inherent and unavoidable; causes include hardware noise and inaccurate measurement techniques. Filtering techniques are used to alleviate random error. We maintain that any error associated with an observation cannot be ignored if a critical decision needs to be made.

B. Krishnamachari et al. (Eds.): DCOSS 2009, LNCS 5516, pp. 187–200, 2009. © Springer-Verlag Berlin Heidelberg 2009

188

S. Maheswararajah et al.

This work is motivated by our experience developing a wireless sensor network application for water-use efficiency in irrigation [6]. The aim of this application is to develop sensor technology that irrigates the field only when necessary. Because irrigation decisions are made mainly on the basis of sensor measurements gathered from the field, errors in measurements will lead to inappropriate actuation of the irrigation system; we should therefore minimise the impact of measurement errors on decision making as much as possible. There are several instances in which an environmental phenomenon has been represented as a linear Gaussian stochastic model. The daily average surface temperature was modeled using a spatio-temporal linear Gaussian stochastic approach [7]; this model was validated using real data captured from different geographical areas and found to fit well. In [2], a physical phenomenon of an indoor environment and the associated sensor measurements were again modeled using a linear Gaussian stochastic approach and validated using the publicly available Intel Lab dataset [8]; Kalman filtering was found to give better state estimation results than linear regression. The same model and technique were also used in a target tracking application [9], where the target was modeled as a linear Gaussian system and estimation was performed with a Kalman filter. In this paper, we likewise assume that environmental phenomena vary linearly over time with white Gaussian noise. Moreover, we assume that sensor nodes are free from drift but that sensor measurements are corrupted by white Gaussian noise and linearly related to the phenomena.

In this paper, we study the problem of selecting an optimal sequence of sensor groups that minimises the total estimation error for a given energy budget (the initial energy available in the network) in an energy-constrained wireless sensor network. Each sensor group consists of a certain number of sensor nodes, and each node can have a different error covariance. Selecting all sensor groups may produce a better estimate but dissipates more energy owing to the large volume of data transmitted. It is therefore necessary to select an optimal sensor group at each time step so as to minimise the estimation error within the energy budget. The Kalman filtering technique allows the estimation error to be calculated before any measurements are obtained [10], so the total estimated error associated with a given sequence of sensor groups can be computed off-line; this lets us find the best sensor-group schedule in advance. We treat this as a combinatorial optimisation problem and solve it using dynamic programming. We also use a sub-optimal one-step-look-ahead method to produce a result in a short time. The remainder of this paper is organised as follows: Section 2 describes the problem formulation. In Section 3, the phenomena of an area and the sensor measurements are modeled as linear Gaussian dynamics, and the Kalman filtering technique is used to estimate the state of the phenomena. Section 4 presents dynamic programming and one-step-look-ahead methods for finding the sequence of sensor groups that minimises the estimated error whilst satisfying the energy budget. Section 5 gives simulation results, and the paper is concluded in Section 6.

2 Problem Formulation

Consider a scenario in which a sensor network of battery-powered sensor nodes is deployed to observe physical phenomena of an environment. Sensor nodes produce measurements and transmit them to the Base Station (BS) over single-hop communication. As the measurements inherently contain noise, we need to minimise the effect of this noise on the state estimation. The BS can estimate the state (i.e., the true value) of the phenomena from the noisy measurements collected from the sensor nodes. In practice, the noise in the measurements varies across phenomena even for measurements from the same sensor node; for example, a node may produce a temperature measurement with low noise but a humidity measurement with high noise. Although a better estimate could be obtained using all measurements from all sensor nodes at a given time, this is not feasible because of the resulting high energy consumption. There must therefore be a trade-off between estimation error and energy consumption in an energy-constrained wireless sensor network. Thus, the problem we address in this paper is to select, at each time step, only a subset of sensor nodes that minimises the total estimation error while satisfying the energy budget. The number of possible sets of sensor nodes at each time step depends on the size of the network. If a network has m sensor nodes and a maximum of u nodes may be selected to form a set at a given time, a sensor set can have 1, 2, ..., or u sensor nodes, and a total of C(m,1) + C(m,2) + ... + C(m,u) sensor sets can be formed at each time step. Identifying the sensor sets and the combinations of sensor nodes within them is a time-consuming and complex task. Therefore, in this paper, for simplicity, we assume that there is a pre-defined set of sensor groups, and we select one sensor group from that set at each time step.
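For concreteness, the count of candidate sensor sets per time step can be computed directly; the values of m and u below are hypothetical.

```python
import math

m, u = 12, 3   # nodes in the network, max nodes per set (hypothetical values)

# Candidate sensor sets per time step: C(m,1) + C(m,2) + ... + C(m,u)
num_sets = sum(math.comb(m, i) for i in range(1, u + 1))
print(num_sets)   # 12 + 66 + 220 = 298
```

Even for this small network, nearly three hundred candidate sets exist at every time step, which motivates the pre-defined groups assumption.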

3 System Models

In this section, we present the models used to represent the environmental phenomena, the sensor measurements, and the energy dissipation. The phenomena observed in a network can be temperature, humidity, pressure, moisture, etc.; in this study we consider only temperature and humidity. The state of a phenomenon varies with time (but not with space) according to the stochastic equation

x_k = f_{k−1} x_{k−1} + w_{k−1},  (1)

where the column vector x_k = [t_k h_k]^T represents the state of the phenomena, and t_k and h_k denote the true temperature and humidity of an area at time step k. (The notation used in this section is summarized in Table 1; "state" and "true temperature and humidity" are used interchangeably.) The term w_k represents white Gaussian process noise with covariance matrix q_k, such that w_k ∼ N(0, q_k). According to (1), x_k (i.e., the true temperature and humidity

Table 1. Notations used in the system model

Notation : Description
k : time step
x_k : true state of the phenomena
f_k : transition matrix
w_k : process noise
q_k : error covariance of the process noise
n : total number of sensor groups in the network
g_i : i-th sensor group
s_ij : j-th sensor node in the i-th sensor group
n_i : total number of sensor nodes in the i-th sensor group
z_k^{i,j} : observed measurement of x_k from the j-th sensor node in the i-th sensor group at the k-th time step
h_k^{i,j} : observation matrix of the j-th sensor node in the i-th sensor group at the k-th time step
v_k^{i,j} : measurement noise of the j-th sensor node in the i-th sensor group at the k-th time step
r_k^{i,j} : error covariance of the j-th sensor node in the i-th sensor group at the k-th time step
ψ_j^i : energy consumption of the j-th sensor node in the i-th sensor group
e_{g_i} : total energy consumption of the i-th sensor group
g_min : sensor group which consumes minimum energy
e_t : total energy budget for t time steps

at time step k) is transited from xk−1 by the transition matrix or system matrix fk−1 with a white Gaussian noise determined by qk−1 . We assume that fk and qk are known a priori. Let n denotes the number of pre-defined sensor groups in a network and gi = si1 , si2 , ..., sini denotes the i-th sensor group. Thus the i-th sensor group has ni sensor nodes such that |gi | = ni . We assume that the sensor measurements are linearly related to xk and corrupted by white Gaussian noise. At time step k, the measurement from the j-th sensor node of the i-th sensor group is a column vector given by: i i i zkj = hkj xk + vkj . (2) i

The noise v^i_kj for each sensor node is assumed to be independent across nodes and white Gaussian, v^i_kj ∼ N(0, r^i_kj). According to (2), z^i_kj (the observed, i.e., true-plus-noise, temperature and humidity at time step k) is mapped from x_k by the observation matrix h^i_kj, with the white Gaussian noise determined by the error covariance r^i_kj. We assume that r^i_kj and h^i_kj are known for all sensor nodes. Furthermore, the communication between a sensor node and the BS is single-hop, and the power consumption ψ^i_j of a sensor node s_ij during the active and sleep modes is given by:

ψ^i_j = ψ_1 + ψ^i_tx,j   if active,
        ψ_2              if asleep,        (3)

where ψ_1 represents the power consumption due to sensing and data processing, ψ^i_tx,j represents the power consumption due to transmitting data to the BS, and ψ_2 denotes the power required for the node's own timer. The transmission power between sensor node s_ij and the BS is calculated as ψ^i_tx,j = (α_1 + α_2 d²_ij) r, where r denotes the data rate, α_1, α_2 > 0 are constants related to the radio energy, and d_ij is the Euclidean distance between the BS and sensor node s_ij. Therefore, the energy consumption due to activating the i-th sensor group is given by:

e_gi = Σ_{j=1}^{n_i} ψ^i_j.        (4)

Energy Adaptive Sensor Scheduling for Noisy Sensor Measurements
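As a concrete illustration of the power model (3)–(4), the sketch below computes the activation power of a sensor group for single-hop transmission to the BS. The function names and all numeric values are illustrative, not taken from the paper.

```python
import math

# Transmission power of (3): (alpha1 + alpha2 * d^2) * r for a single hop.
def tx_power(alpha1, alpha2, dist, rate):
    return (alpha1 + alpha2 * dist**2) * rate

def group_activation_power(positions, bs, psi1, alpha1, alpha2, rate):
    """Sum psi_j^i over the active nodes of one group, as in eq. (4)."""
    total = 0.0
    for (x, y) in positions:
        d = math.hypot(x - bs[0], y - bs[1])  # Euclidean distance to the BS
        total += psi1 + tx_power(alpha1, alpha2, d, rate)
    return total

# Placeholder parameters (illustrative only): BS at the origin, two nodes.
bs = (0.0, 0.0)
group = [(100.0, 200.0), (150.0, 180.0)]
print(group_activation_power(group, bs, psi1=1e-3,
                             alpha1=100e-9, alpha2=1e-12, rate=8e6))
```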

3.1 State Estimation

In linear Gaussian dynamics, the state x_k can be estimated recursively using the Kalman filtering technique, which produces the minimum Mean Square Error (MSE) estimate [11]. Let ξ_k|k and p_k|k denote the estimated state and its error covariance, respectively, defined as:

ξ_k|k = E{x_k},   p_k|k = E{(x_k − ξ_k|k)(x_k − ξ_k|k)'}.

If the BS selects the i-th sensor group at time step k, then the stacked measurement equation is given by:

z_k = h_k x_k + v_k,        (5)

where z_k = [z^i_k1, z^i_k2, ..., z^i_kn_i]', h_k = [h^i_k1, h^i_k2, ..., h^i_kn_i]' and v_k ∼ N(0, r_k). For the given stacked measurements z_k from the i-th sensor group, the centralised Kalman filter estimates the state as:

ξ_k|k = ξ_k|k−1 + k_k (z_k − h_k ξ_k|k−1),        (6)

where k_k is the Kalman gain, k_k = p_k|k h_k' r_k^−1, and the estimated error covariance p_k|k is given by:

p_k|k^−1 = p_k|k−1^−1 + h_k' r_k^−1 h_k,        (7)

where p_k|k−1 = f_k−1 p_k−1|k−1 f_k−1' + q_k−1 and ξ_k|k−1 = f_k−1 ξ_k−1|k−1. Since we assume that the sensor node noises are cross-independent, the measurement term of (7) can be rewritten as (8), based on [12]:

h_k' r_k^−1 h_k = Σ_{j=1}^{n_i} h^i_kj' (r^i_kj)^−1 h^i_kj.        (8)

We can rewrite the error covariance of the estimated state given in (7) by substituting (8):

p_k|k^−1 = p_k|k−1^−1 + Σ_{j=1}^{n_i} h^i_kj' (r^i_kj)^−1 h^i_kj,        (9)

where the summation term is denoted f(g_i).


Using (9), the accuracy of the estimated state can be calculated for known h^i_kj and r^i_kj of the i-th sensor group. It can also be inferred that p_k|k is a function of r^i_kj and independent of z^i_kj. Since h^i_kj and r^i_kj are known for each sensor node, p_k|k can be calculated independently of the observed measurements. Therefore the computation of the error covariance can be performed off-line, without the observed measurements.

Assume that the sensor nodes are identical, so that r^i_kj = r*_k and h^i_kj = h*_k. The p_k|k for sensor group g_i is then given by:

p_k|k^−1 = p_k|k−1^−1 + f*(g_i),        (10)

where f*(g_i) = n_i h*_k' (r*_k)^−1 h*_k.
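Since (7)–(10) involve only quantities known a priori, the error-covariance schedule can be computed offline, with no measurements at all. The NumPy sketch below iterates the covariance recursion for a group of identical nodes; the helper names and parameter values are ours, chosen for illustration only.

```python
import numpy as np

def predicted_cov(p_prev, f, q):
    # p_{k|k-1} = f p_{k-1|k-1} f' + q
    return f @ p_prev @ f.T + q

def updated_cov(p_pred, h_list, r_list):
    # eq. (9): p_{k|k}^{-1} = p_{k|k-1}^{-1} + sum_j h' r^{-1} h
    info = np.linalg.inv(p_pred)
    for h, r in zip(h_list, r_list):
        info = info + h.T @ np.linalg.inv(r) @ h
    return np.linalg.inv(info)

# Illustrative 2-state (temperature, humidity) example with identity dynamics.
f = np.eye(2); q = np.diag([1.0, 5.0])
h = np.eye(2); r = np.diag([1.2, 3.5])
p = np.diag([0.4, 0.45])
for k in range(3):                                         # no measurements needed
    p = updated_cov(predicted_cov(p, f, q), [h] * 2, [r] * 2)  # group of 2 nodes
print(np.trace(p))
```

Adding more (h, r) pairs to `updated_cov` shrinks the resulting trace, which is the effect behind (10): larger groups yield a smaller error covariance.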

f*(g_i) → ∞ as n_i → ∞ and, consequently, p_k|k → 0. Hence a sensor group with a large number of sensor nodes produces a small error covariance. However, the energy dissipation due to data transmission is then high, leading to a significant reduction in network lifetime. Therefore, it is not practical to transmit data from all the sensor nodes at each time step to minimise the error covariance. In the next subsection, we define our objective function with a battery energy constraint.

3.2 Error Cost Function

We consider an energy-constrained, battery-operated sensor network. At each time step, the BS identifies a sensor group to transmit data. The sequence of sensor groups selected for the time period t is μ(t) = {u_1, u_2, ..., u_t}, where u_k is the sensor group selected at time step k. Let e_t denote the total energy budget of the network. The energy constraint is given by:

Σ_{k=1}^{t} e_uk ≤ e_t.        (11)

Since the BS selects a sensor group at each time step, e_t should satisfy the following condition:

e_t ≥ t e_gmin,   where   e_gmin = min_{i=1,2,...,n} {e_gi}.        (12)

The sensor group that consumes the minimum energy among the n sensor groups is denoted g_min, with energy consumption e_gmin. If the energy budget does not satisfy the condition defined in (12), then the BS cannot select a sensor group at every time step; we therefore assume that the energy budget always satisfies (12). We define the error cost function for the time period {1, 2, ..., t} as:

J(μ(t)) = Σ_{k=1}^{t} trace(p_uk),        (13)


where p_uk represents the error covariance of the estimated state associated with the selected sensor group u_k at time step k. Based on (9), p_uk is given by:

p_uk^−1 = p_k|k−1^−1 + f(u_k).        (14)

Our objective is to find the sequence of sensor groups with the minimum total error cost J(μ(t)) over the entire time period. The optimal sequence, denoted μ*(t) = {u*_1, u*_2, ..., u*_t}, can be computed as:

μ*(t) = arg min_{∀u_k} Σ_{k=1}^{t} trace(p_uk)   subject to   Σ_{k=1}^{t} e_uk ≤ e_t.        (15)

The BS can select a sequence of sensor groups out of a maximum of n^t possible sequences. This is a combinatorial optimisation problem with up to n^t feasible solutions. If t and n are small, the problem can be solved by an exhaustive search method. Since the number of possible sequences grows exponentially with t and n, exhaustive search is impractical for larger instances within an acceptable time period. Hence, we propose methods to solve this combinatorial problem within a feasible time period.
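For small n and t, the exhaustive search mentioned above can be written directly. The sketch below enumerates all n^t sequences, discards those violating the budget constraint (11), and scores the rest with the trace recursion of (9). A scalar (1-D) state and all parameter values are illustrative simplifications, not the paper's setup.

```python
from itertools import product

def error_cost(seq, p0, f, q, fi):
    """Sum of traces of p_{k|k} along one sequence; scalar state for brevity.
    fi[g] plays the role of f(g) in eq. (9)."""
    p, cost = p0, 0.0
    for g in seq:
        p_pred = f * p * f + q              # p_{k|k-1}
        p = 1.0 / (1.0 / p_pred + fi[g])    # eq. (9), scalar form
        cost += p                           # trace of a 1x1 matrix
    return cost

def exhaustive(groups, energy, fi, budget, t, p0=1.0, f=1.0, q=1.0):
    best, best_seq = float("inf"), None
    for seq in product(groups, repeat=t):              # n^t candidates
        if sum(energy[g] for g in seq) > budget:       # constraint (11)
            continue
        c = error_cost(seq, p0, f, q, fi)
        if c < best:
            best, best_seq = c, seq
    return best_seq, best

groups = ["ga", "gb"]
energy = {"ga": 3.0, "gb": 1.0}
fi = {"ga": 2.0, "gb": 0.5}   # bigger group -> larger f(g) -> smaller error
print(exhaustive(groups, energy, fi, budget=8.0, t=3))
```

Even this toy instance touches 2³ sequences; with the paper's n = 6 and t = 48, 6⁴⁸ candidates make the approach hopeless, which motivates the DP and OSLA methods of Section 4.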

4 Sensor Group Scheduling

The dynamic programming technique and the One-Step-Look-Ahead (OSLA) method are used to find optimal and sub-optimal sequences, respectively, of sensor groups to communicate with the BS for a given energy budget. The two algorithms are described below.

4.1 Dynamic Programming

Dynamic programming (DP) is a recursive technique for finding an optimal solution to a problem that can be broken into several subproblems [13]. Each subproblem is solved optimally and the results are stored for future use; these stored results are used to construct an optimal solution to the original problem. Scheduling energy-constrained sensor nodes to minimise the tracking error was studied in [14], where the energy constraint was relaxed using Lagrangian multipliers and the problem was solved using an approximate dynamic programming technique. In this paper, we use DP to optimally schedule the sensor groups to minimise the estimation error for a given energy budget.

The original problem is broken into several stages and each stage is divided into many states. In DP, a state stores all information required to go from one stage to the next. Here the stage corresponds to the time step, so the total number of stages equals the total number of time steps. The state of the DP consists of two parts: the error covariance of the estimated state, denoted d_k, and the available energy budget, denoted l_k. The decision at each state is the selected sensor group, denoted μ_k(d_k, l_k). Since the BS must select a sensor group at each time step, it should be aware of the available energy budget after selecting a sensor group. Therefore, the BS cannot choose a sensor group arbitrarily: μ_k(d_k, l_k) ∈ n_lk, where n_lk ⊆ {g_1, g_2, ..., g_n} can be calculated based on e_k^max and e_k^min, the maximum and minimum available energy budgets at the k-th stage, defined as:

e_k^max = e_t                    if k = 1,
          e_t − (k − 1) e_gmin   if t ≥ k > 1,        (16)

e_k^min = e_t                    if k = 1,
          (t − k + 1) e_gmin     if t ≥ k > 1.        (17)

The i-th sensor group is in n_lk if the remaining energy budget l_k − e_gi at the k-th stage is at least the minimum available energy budget e_{k+1}^min at stage k + 1:

g_i ∈ n_lk   if   (l_k − e_gi) ≥ e_{k+1}^min.        (18)
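The feasibility test (17)–(18) is a one-liner once the group energies are known. The sketch below uses the group energies of Table 2; the function names, l_k and k are illustrative. Note that for k + 1 > 1, (17) gives e_{k+1}^min = (t − k) e_gmin.

```python
def feasible_groups(l_k, k, t, energy):
    """n_{l_k}: groups whose activation leaves at least e_{k+1}^min, eqs. (17)-(18).
    For stage k+1 > 1, eq. (17) gives e_{k+1}^min = (t - k) * e_gmin."""
    e_gmin = min(energy.values())
    need = (t - k) * e_gmin                 # e_{k+1}^min (0 at the last stage)
    return [g for g, e in energy.items() if l_k - e >= need]

# Group energies from Table 2 (J); l_k and k are illustrative.
energy = {"ga": 63, "gb": 47, "gc": 48, "gd": 38, "ge": 26, "gf": 25}
print(feasible_groups(l_k=80, k=46, t=48, energy=energy))   # → ['ge', 'gf']
```

With only 80 J left and two stages to go after stage 46, every choice must keep at least 2 × 25 = 50 J in reserve, so only the two cheapest groups remain selectable.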

The available energy budget l_k at each stage can take values between e_k^min and e_k^max, whereas d_k takes the values of all possible p_k−1|k−1. If the decision at the k-th stage is μ_k(d_k, l_k) = u_k (i.e., to activate sensor group u_k ∈ n_lk at time step k) for a given d_k and l_k, then the cost per stage is defined by:

g(d_k, l_k, u_k) = trace(p_uk(d_k)),        (19)

where p_uk^−1(d_k) = p_k|k−1^−1(d_k) + f(u_k) and p_k|k−1(d_k) = f_k−1 d_k f_k−1' + q_k−1.

Backward DP is used to solve our problem; the recursion proceeds backward from stage t to stage 1. The cost function of the DP at the last stage t is defined as:

J_t(d_t, l_t) = min_{u_t ∈ n_lt} g(d_t, l_t, u_t),        (20)

and the cost function of the DP at the k-th stage is defined as:

J_k(d_k, l_k) = min_{u_k ∈ n_lk} { g(d_k, l_k, u_k) + J_{k+1}(p_uk(d_k), l_k − e_uk) }.        (21)

DP solves the sub-problems from k = t down to 1 and stores the optimal values J_k(d_k, l_k) and μ*_k(d_k, l_k). For a given initial error covariance p_0|0 of the state and a total energy budget e_t, the optimal value of our objective function in (13) is equal to the value at the initial stage of the DP:

J_1(d_1, l_1) = min_{u_k ∈ n_lk} Σ_{k=1}^{t} trace(p_uk),   where d_1 = p_0|0, l_1 = e_t.        (22)


We only know the exact value of the DP state at the first stage (d_1 = p_0|0 and l_1 = e_t), but not at the remaining stages. Therefore, DP considers all possible values of the state at each stage. Since the state space is infinite, each element in the state space is discretised, and we use interpolation to approximate the value of J_{k+1}(p_uk(d_k), l_k − e_uk) whenever p_uk(d_k) or l_k − e_uk is not available at the (k+1)-th stage. A finer discretisation of the state space reduces the approximation error introduced by the interpolation, resulting in a solution closer to the optimum; however, it also leads to a higher number of states, which in turn increases the computational cost. In DP, the computational cost depends on the number of states and the number of decision variables |n_lk| available at each stage: there are |n_lk| comparisons at each state. For example, if the number of states at the k-th stage is a_k, then DP requires a total of |n_l1| + Σ_{k=2}^{t} a_k |n_lk| comparisons.

DP produces an optimal sequence of sensor groups {u*_1, u*_2, ..., u*_t}, where u*_k is given by:

u*_k = μ*_1(p_0|0, e_t)                                      if k = 1,
       μ*_k(p_k−1|k−1, e_t − Σ_{τ=1}^{k−1} e_u*_τ)           if t ≥ k > 1.        (23)
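The backward recursion (19)–(23) can be sketched for a scalar state as follows. The paper discretises the continuous DP state (d_k, l_k) and interpolates; the sketch approximates that crudely by rounding the memoisation keys, and all parameter values are illustrative, not the paper's.

```python
from functools import lru_cache

# Backward DP of (19)-(21) for a scalar state; illustrative parameters.
F, Q = 1.0, 1.0
GROUPS = {"ga": (2.0, 3.0), "gb": (0.5, 1.0)}   # g -> (f(g), energy e_g)
T = 4
E_GMIN = min(e for _, e in GROUPS.values())

def step_cov(d, fi):
    p_pred = F * d * F + Q                       # p_{k|k-1}
    return 1.0 / (1.0 / p_pred + fi)             # eq. (9), scalar form

@lru_cache(maxsize=None)
def J(k, d, l):
    """Optimal cost-to-go from stage k with covariance d and budget l."""
    best = float("inf")
    for g, (fi, e) in GROUPS.items():
        # feasibility (18): enough budget must remain for the stages after k
        if l - e < (T - k) * E_GMIN:
            continue
        p = step_cov(d, fi)                      # cost per stage, eq. (19)
        rest = 0.0 if k == T else J(k + 1, round(p, 3), round(l - e, 3))
        best = min(best, p + rest)               # eqs. (20)-(21)
    return best

print(J(1, 1.0, 8.0))   # J_1(p_{0|0}, e_t), eq. (22)
```

Rounding to three decimals stands in for the paper's state-space discretisation plus interpolation; a finer rounding plays the same role as a finer grid, trading accuracy against the number of cached states.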

4.2 One-Step-Look-Ahead Method

We also present a sub-optimal method based on OSLA for the sensor scheduling problem. The main reason for presenting a sub-optimal method is that DP may not produce a solution within a short time period for large problems; OSLA produces feasible solutions at a lower computational cost than DP. OSLA optimises only the current time step, and its results are therefore sub-optimal. At each time step, OSLA finds the sensor group that minimises the current error covariance of the estimated state, producing a sub-optimal sequence of sensor groups {ũ*_1, ũ*_2, ..., ũ*_t}, where ũ*_k is given by:

ũ*_k = arg min_{u_k ∈ n_lk} trace(p_uk),        (24)

where l_k = e_t − Σ_{τ=1}^{k−1} e_ũ*_τ is the available energy budget at time step k. At each time step, OSLA calculates l_k and updates the set n_lk. The computational cost of OSLA depends on the number of sensor groups in the set n_lk: OSLA needs a total of Σ_{k=1}^{t} |n_lk| comparisons, since it makes |n_lk| comparisons at each time step.
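The OSLA rule (24) reduces to a greedy per-step minimisation over the feasible set. A scalar-state sketch, with illustrative names and values mirroring the feasibility test of (18):

```python
def osla_schedule(groups, t, budget, p0, f=1.0, q=1.0):
    """groups: g -> (f(g), e_g). Greedy per-step choice of eq. (24).
    Assumes budget >= t * min energy, so a feasible group always exists."""
    e_min = min(e for _, e in groups.values())
    p, l, seq = p0, budget, []
    for k in range(1, t + 1):
        best_g, best_p = None, float("inf")
        for g, (fi, e) in groups.items():
            # keep enough budget for the remaining t-k steps (feasible set)
            if l - e < (t - k) * e_min:
                continue
            p_pred = f * p * f + q
            p_new = 1.0 / (1.0 / p_pred + fi)    # eq. (9), scalar form
            if p_new < best_p:
                best_g, best_p = g, p_new
        seq.append(best_g)
        l -= groups[best_g][1]
        p = best_p
    return seq

groups = {"ga": (2.0, 3.0), "gb": (0.5, 1.0)}
print(osla_schedule(groups, t=4, budget=8.0, p0=1.0))   # → ['ga', 'ga', 'gb', 'gb']
```

The toy run shows the myopic behaviour discussed in Section 5: the expensive, accurate group "ga" is picked early until the budget forces the cheap group for the remaining steps.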

5 Simulation Results

In this section, we evaluate our proposed methods via simulation. We consider a sensor network of 40 nodes deployed in a field of 1000 m × 1000 m. The BS is located at the corner ([0, 0]) of the network.

Fig. 1. Sensor groups and positions of sensor nodes in the network

The sensor nodes are divided into 6 pre-defined groups, as shown in Fig. 1, and the properties of the sensor groups are given in Table 2. For the purpose of simulation, the error covariances of all sensor nodes in a sensor group are assumed to be the same and time-invariant (i.e., r^i_kj = r^i). However, the models and the proposed methods can handle different error covariances for each sensor in a network. The error covariances of all sensor groups are given below; for the purpose of simulation, we assume the humidity measurement has higher noise than the temperature measurement:

r^a = [1.2 0.0; 0.0 3.5],   r^b = [1.8 0.0; 0.0 4.0],   r^c = [1.4 0.0; 0.0 3.8],
r^d = [1.8 0.0; 0.0 4.0],   r^e = [2.0 0.0; 0.0 4.3],   r^f = [4.2 0.0; 0.0 6.5].

The sensor model used in the simulation is the same as in (2), and the observation matrix h_k is assumed to be a 2 × 2 unit matrix. The parameters α_1 and α_2 in the energy model of (3) are set to 100 nJ/b and 1 pJ/(b·m²), respectively. Since ψ_1, ψ_2 << ψ_tx, we consider only ψ_tx in our simulation. We use a measurement data size of 1 MB and a data rate of 8 Mbps. The sensor groups transmit measurements to the BS every 0.5 hour for a total time period of 24 hours; therefore the BS has to schedule the sensor groups for k = 1, 2, ..., 48 time steps, such that t = 48. The energy consumption given in Table 2 is for transmitting 1 MB of data to the BS for the respective sensor group. The temperature and humidity of the environment vary as in (1), and the system matrix f_k is assumed to be a 2 × 2 unit matrix. For this simulation, the values of the process noise covariance q_k and the initial error covariance p_0|0 are given in (25) and (26), respectively. Also for the purpose of simulation, the noise in temperature and the noise in humidity are considered independent of each other; this is reflected in (25) and (26).


Table 2. Properties of the sensor groups

Sensor group (g_i) | Number of sensor nodes (n_i) | Energy consumption (e_gi) [J] | Error covariance (r^i)
g_a                | 10                           | 63                            | r^a
g_b                | 8                            | 47                            | r^b
g_c                | 7                            | 48                            | r^c
g_d                | 6                            | 38                            | r^d
g_e                | 5                            | 26                            | r^e
g_f                | 4                            | 25                            | r^f

q_k = [1.0 0; 0 5.0]    if 30 ≥ k ≥ 18,
      [4.0 0; 0 12.0]   otherwise,        (25)

p_0|0 = [0.4 0; 0 0.45].        (26)

We present the results for different energy budgets in Table 3. We also consider a random selection method to schedule the sensor groups, in which a sensor group is selected randomly from the set n_lk at each time step. The results in Table 3 give the mean and standard deviation (SD) of the error cost over 30 independent runs for the three methods. Since DP and OSLA are deterministic optimisation methods, their standard deviations are zero in all cases.

Table 3. Cumulative error cost J(μ(48)) of the estimated state obtained by the DP, OSLA and random methods

Energy budget (e_48) J | DP (mean±SD)   | OSLA (mean±SD) | random (mean±SD)
1300                   | 50.187 ± 0.000 | 66.844 ± 0.000 | 63.770 ± 1.894
1400                   | 49.060 ± 0.000 | 62.320 ± 0.000 | 61.325 ± 1.935
1500                   | 47.897 ± 0.000 | 59.553 ± 0.000 | 57.644 ± 1.912
1600                   | 46.778 ± 0.000 | 59.292 ± 0.000 | 54.616 ± 1.621
1700                   | 45.624 ± 0.000 | 57.940 ± 0.000 | 51.592 ± 1.741
1800                   | 44.548 ± 0.000 | 55.173 ± 0.000 | 48.543 ± 1.732
1900                   | 43.728 ± 0.000 | 54.912 ± 0.000 | 46.583 ± 0.750
2000                   | 42.728 ± 0.000 | 53.560 ± 0.000 | 45.506 ± 1.223
2100                   | 41.618 ± 0.000 | 50.671 ± 0.000 | 44.863 ± 1.014
2200                   | 40.793 ± 0.000 | 45.420 ± 0.000 | 44.842 ± 1.345

Fig. 2. Variation of the cumulative error of the estimated state and the energy consumption of the sensor network for the energy budget e_48 = 2200 J. (a) Cumulative error of the estimated state. (b) Energy consumption of the sensor network.

Fig. 2(a) illustrates the variation of the cumulative error cost of the estimated state for the energy budget e_48 = 2200 J in a single simulation run, and Fig. 2(b) shows the cumulative energy consumption of all the methods at each time step. Fig. 3 illustrates the variation of the Root Mean Square Error (RMSE) of the estimated temperature and humidity for the sensor group sequence obtained in the case e_48 = 2200 J.

It can be seen in Table 3 that DP performs best in all cases, as it is the optimal method. We observed that DP produces slightly better results with finer discretisation of the state space; however, the computational cost is then higher. In the current simulation, DP uses 340 states at each stage and its computation time is around 15 minutes per run, whereas less than a minute is needed for the OSLA and random methods. OSLA is a myopic scheduling technique: at the initial stages of the simulation it selects the sensor groups that provide the least error, as shown in Fig. 2. However, it is unable to use the best sensor groups at later time steps due to insufficient remaining energy in the network. It can also be seen in Fig. 3 that OSLA produces a lower RMSE of the estimated state than DP and the random method until k = 27, but not for the remaining time steps. Since DP optimally solves and stores sensor groups for all possible energy budgets at each time step, it spreads its energy usage wisely over time and produces better results over the entire time period. On average, the random selection method produces better results than OSLA; however, since the selection of sensor groups is purely random, it may produce worse results than OSLA in some cases, as shown in Fig. 2(a). The standard deviation of its results indicates the instability of the method. It can be seen in Fig. 2(b) that DP and OSLA used nearly the entire energy budget, whereas the random technique used less energy over the same time steps, because the objective of the problem is to minimise the total error cost for a given energy budget, not to minimise the energy consumption.

(a) RMSE of the estimated temperature (Deg C).  (b) RMSE of the estimated humidity (%).

Fig. 3. Variation of the RMSE of the estimated state of the phenomena obtained by DP, OSLA and random techniques for the energy budget e48 = 2200 J

6 Conclusion and Future Work

In this paper, we studied the problem of scheduling sensor groups to estimate environmental phenomena with minimal error in an energy-constrained sensor network. We considered a state of the phenomena consisting of temperature and humidity, varying linearly with Gaussian noise. The sensor measurements are assumed to be linearly related to the state of the phenomena and corrupted by white Gaussian noise. The objective of the base station is to select a sequence of sensor groups that minimises the total estimation error for a given energy budget. We solved the problem using a dynamic programming technique and the OSLA method. Even though dynamic programming gives better results than OSLA, OSLA may be useful when a solution is needed within a short time period. In the future, we plan to apply the proposed methods to a real wireless sensor network. Estimating the measurements from the network deployed in [6] is a challenging task due to the high unpredictability of the environment and the frequent variation of vegetation.

Acknowledgements This work was conducted when Suhinthan was an intern student at the CSIRO Tasmanian ICT Centre. The authors would like to thank Stephen Guigni, Greg Timms and Paulo De Souza for providing comments on the manuscript. The CSIRO Tasmanian ICT Centre is jointly funded by the Australian Government through the Intelligent Island Program and Australia’s Commonwealth Scientific and Industrial Research Organisation (CSIRO). The Intelligent Island Program is administered by the Tasmanian Department of Economic Development and Tourism.


References

1. Nakamura, E.F., Loureiro, A.A.F., Frery, A.C.: Information fusion for wireless sensor networks: methods, models, and classifications. ACM Comput. Surv. 39(3), 9 (2007)
2. Yee Lin, T., Sehgal, V., Hamid, H.S.: Sensoclean: handling noisy and incomplete data in sensor networks using modeling. Technical report, University of Maryland (2005)
3. Takruri, M., Rajasegarar, S., Challa, S., Leckie, C., Palaniswami, M.: Online drift correction in wireless sensor networks using spatio-temporal modeling. In: 11th International Conference on Information Fusion, pp. 1–8 (2008)
4. Elnahrawy, E., Nath, B.: Cleaning and querying noisy sensors. In: WSNA 2003: Proceedings of the 2nd ACM International Conference on Wireless Sensor Networks and Applications, pp. 78–87. ACM, New York (2003)
5. Bychkovskiy, V., Megerian, S., Estrin, D., Potkonjak, M.: A collaborative approach to in-place sensor calibration. In: Proceedings of the Second International Workshop on Information Processing in Sensor Networks (IPSN), pp. 301–316 (2003)
6. McCulloch, J., Guru, S.M., McCarthy, P., Hugo, D., Peng, W., Terhorst, A.: Wireless sensor network deployment for water use efficiency in irrigation. In: REALWSN 2008: Proceedings of the Workshop on Real-World Wireless Sensor Networks, pp. 46–50. ACM, New York (2008)
7. Benth, J.S., Benth, F.E., Jalinskas, P.: A spatial-temporal model for temperature with seasonal variance. Journal of Applied Statistics 34(7), 823–841 (2007)
8. Intel Lab Dataset, http://db.csail.mit.edu/labdata/labdata.html/
9. Maheswararajah, M., Halgamuge, S., Premaratne, M.: Sensor scheduling for target tracking by sub-optimal algorithms. IEEE Transactions on Vehicular Technology (2009) (accepted for publication)
10. Evans, J., Krishnamurthy, V.: Optimal sensor scheduling for hidden Markov models. In: Proceedings of the 1998 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 1998), vol. 4, pp. 2161–2164 (1998)
11. Kalman, R.E.: A new approach to linear filtering and prediction problems. Transactions of the ASME, Journal of Basic Engineering, 35–45 (March 1960)
12. Song, E., Zhu, Y., Zhou, J.: The optimality of Kalman filtering fusion with cross-correlated sensor noises. In: 43rd IEEE Conference on Decision and Control (CDC), vol. 5 (December 2004)
13. Bertsekas, D.P.: Dynamic Programming and Optimal Control, 3rd edn., vol. 1. Athena Scientific (2005)
14. Williams, J.L., Fisher, J.W., Willsky, A.S.: Approximate dynamic programming for communication-constrained sensor network management. IEEE Transactions on Signal Processing 55(8), 4300–4311

Route in Mobile WSN and Get Self-deployment for Free

Kévin Huguenin¹, Anne-Marie Kermarrec², and Eric Fleury³

¹ IRISA / ENS Cachan
² INRIA Rennes – Bretagne Atlantique
³ INRIA Lyon – Rhône-Alpes / ENS Lyon

Abstract. We consider a system consisting of a set of mobile sensors disseminated in a region of interest, whose mobility is controlled (as opposed to mobility imposed by the entity on which they are embedded). A routing protocol in this context enables any point of the region to be reached starting from any node, regardless of the initial sensor deployment; this operation involves message forwarding and/or sensor motion. In this paper we present Grasp, a GReedy stAteless Routing Protocol for mobile wireless sensor networks (WSN). Grasp is simple and independent of the underlying communication model, yet provides results close to optimal with respect to the self-deployment of sensors over a given region. It ensures that (i) routing is always possible in a mobile WSN irrespective of the number of sensors, and (ii) above a given number of sensors in the considered zone, routing eventually no longer requires sensors to move, which yields self-deployment. With Grasp, sensors autonomously reach a stable full coverage following geometrical patterns, requiring only 1.5 times the optimal number of sensors to cover a region. A theoretical analysis of convergence proves these properties. Simulation results matching the analysis are also presented.

1 Introduction and Background

Consider a troop of human agents deployed in a region to accomplish a mission, assisted by a set of networked mobile devices whose mobility is controlled (as opposed to mobility imposed by the entity on which they are embedded). An agent may order a mobile device close to him to perform an action at a given location in the region; note that the action can eventually be performed by any node of the system. This is achieved by routing to this specific point: to perform routing, a node can either (i) perform the action itself if possible, (ii) forward the request to another node, or (iii) move. Such systems have numerous applications, a typical one being a situation where a military unit secures a sensitive zone with the help of mobile sensors able to move, detect enemies and raise the alarm. Another possible application is a brigade of firefighters with mobile, air-pressurized-water, autonomous fire extinguishers. It enables robots to be deployed in a zone where humans cannot yet go.

B. Krishnamachari et al. (Eds.): DCOSS 2009, LNCS 5516, pp. 201–215, 2009.
© Springer-Verlag Berlin Heidelberg 2009

Once the robots have secured this zone (say, extinguished the fire), they can then self-deploy to monitor it. Similarly to traditional static sensor networks, the coverage of the region and the connectivity of the system reflect the quality of the deployment. Even though any point can be reached irrespective of the deployment by having nodes move, the movements incur delays. A routing algorithm must therefore be evaluated against two metrics: reactivity in a dynamic scenario (when the troop progresses in the region, referred to as the transient state) and self-deployment in a stable scenario (when the troop secures the zone inside which it is deployed, referred to as the steady state). The aforementioned applications require a routing algorithm capable of leveraging the nodes' communication and sensing capacities as well as their mobility to fulfill a request from a user at any position.

On one hand, a large amount of research has been devoted to finding deployments that ensure the connected coverage of a region of interest (ROI), leading to the design of optimal configurations [1]. Although self-deployment techniques allow such configurations to be reached autonomously by means of robotic sensors [2], they require a priori knowledge of the ROI at each node and are therefore unable to deal with evolving regions and dynamic sensor relocation. Moreover, a traditional static routing algorithm (i.e., one that does not leverage node mobility) may not be able to cope with a disconnected network or with sensing locations outside the network's scope. In [3], Butler and Rus considered a similar application. Assuming that sensors are notified of events occurring in their environment, they proposed an efficient decentralized self-deployment algorithm making the nodes converge reactively to the most interesting portions of the region while still ensuring full coverage. Yet powerful, this algorithm may not match our applicative context since (i) it requires an underlying protocol to advertise the location of relevant events to the nodes and (ii) it does not address the problem of routing in this context.

On the other hand, a lot of research has been conducted on routing in disconnected networks and Delay Tolerant Networking (DTN) [4] in Mobile Ad-Hoc Networks (MANET) [5]. This led to the design of efficient routing algorithms and powerful mobile computing systems, such as CarTel [6]. Yet, except for message-ferrying approaches – which exploit mobile nodes with uncontrolled but predictable mobility [7,8] – and MORA [9] – a motion-planning-based routing algorithm which exploits the controlled mobility of a small set of autonomous agents – these algorithms cope with uncontrolled node mobility rather than leveraging it for routing purposes. To the best of our knowledge, the two research topics of mobile routing and self-deployment have mostly been studied independently. Our claim is that a routing algorithm leveraging node mobility makes it possible to cope with a dynamic ROI, an inefficient deployment of the nodes and under-dimensioning (i.e., a number of sensors insufficient to cover the ROI in a connected way). In addition, it provides self-deployment for free in steady state.


Our contribution is two-fold. First we present Grasp, a simple routing algorithm acting only with local knowledge of the network and no knowledge of the ROI (i.e., the nodes do not know its size, its shape or its borders), and prove that beyond its simplicity Grasp ensures both sensing-request fulfilment with probability one regardless of the network configuration, and convergence toward connected coverage of the ROI with a required number of nodes close to the optimal in steady state. Then we investigate the properties required for a generic mobile routing algorithm to provide self-deployment. We demonstrate, by considering practical matters, that Grasp may be used in a real environment. We focus on 2D ROIs; the analysis of the 1D case shows that Grasp is optimal, and details are provided in [10].

The rest of the paper is organized as follows: Section 2 presents the design rationale behind Grasp along with the detailed algorithm. Section 3 provides a theoretical analysis of Grasp with respect to its self-deployment properties. Section 4 presents experimental results obtained through computer simulations that match the theoretical analysis; this section also gives a performance analysis of Grasp with respect to (i) its efficiency in terms of routing delays and energy, and (ii) its impact on the network topology in terms of self-deployment. Section 5 tackles the practicality of Grasp by suggesting an algorithm to handle concurrent message routing and a sleep-wakeup scheme leveraging the network geometry resulting from Grasp to increase the system lifetime. Finally, we provide in Section 6 a list of perspectives and ongoing work to increase Grasp's performance.

2 GRASP: A Routing Algorithm for Mobile WSNs

As stated in the introduction, neither traditional routing algorithms nor self-deployment techniques can be used for the targeted applications. In this section we present Grasp, a routing algorithm leveraging node mobility to cope with dynamic ROIs and sparse or non-homogeneous deployments.

2.1 System Model

We consider a network of mobile entities with wireless communication capabilities deployed in an obstacle-free region. We assume a disc model for sensing (i.e., a node is able to sample its environment up to a distance rs from its current position) and symmetric reliable communication links. In the sequel we denote by routing the action of making any node fulfill a sampling request, i.e., sense at a given location in the ROI, emitted at a given node in the network. We further assume that nodes are able to orientate and localize themselves inside the ROI by the use of a compass and a localization system such as a GPS or a distributed location algorithm [11]. In addition, we assume that they also know their neighbors’ coordinates, using for instance periodic beacons.

K. Huguenin, A.-M. Kermarrec, and E. Fleury

2.2 Design Rationale

The design rationale behind Grasp can be explained through an analogy with a soccer game. In a game, a set of mobile intelligent entities, namely the players, are deployed on a rectangular area, the pitch, and collaborate in order to deliver a packet, the ball, at a given position, the goal. To succeed, the players can either run or pass the ball to a team-mate, provided that the distance between them is not too large. Obviously, passing the ball to a team-mate is less tiring than running to put the ball in the goal. In this context, the energy constraint is that the players must keep the ability to move until the end of the game. On the one hand, the players must pass the ball as often as possible so as to save their energy; on the other hand, at some point, a player's reachable team-mates may all be in a worse position than himself to reach the goal. Two questions arise, assuming a limited view of the game and limited passing capabilities: (i) "to which of its reachable team-mates should a player pass the ball?", and (ii) "when should a player move instead of passing?"

Fig. 1. Illustration of a routing hole in a WSN. (a) Node x0 is closer to the destination xdest than its neighbors making greedy routing fail. (b) Face-based routing algorithms route along the edges bordering the void while (c) Grasp leverages the node mobility to move toward the destination until greedy forwarding is possible. (d) Sample path using Grasp. Dashed lines denote greedy forwarding and solid lines and arrows denote respectively hole circumventing and moving.

Following this analogy, we propose a simple geographic routing algorithm leveraging node mobility to transparently cross holes in the topology. Nodes act greedily for both forwarding and moving: the distance between the current position and the destination should always be decreasing. If the destination lies in the sensing disc of the node in charge of the request (i.e., its distance to the destination is lower than rs), the node fulfills the request itself. If not, it can either forward the message or move, as follows:

Forward: using local information on its neighbors' positions, updated by means of periodic beacons, the node in charge of the message picks the neighbor closest to the destination among those closer than itself (if any) and forwards the message to it.

Move: otherwise, the node starts moving in a straight line toward the destination until it can either forward the message to a node closer than itself to the
destination or sense itself at the destination. The node then stops and performs the adequate action.

Algorithm 1 gives a detailed pseudo-code version of Grasp. Figures 1(a)-1(c) illustrate the way Grasp deals with routing holes as compared to traditional static geographic routing algorithms [12] (such as GPSR [13]), and Figure 1(d) shows a sample path, combining both forwarding and moving, used by Grasp to deliver a message in a sparse 2D mobile WSN.

Algorithm 1. Grasp: a routing algorithm for mobile WSNs

  Input: a sampling request for position xdest received at node x0
  while d(x0, xdest) > rs and no x ∈ neighborhood(x0) satisfies d(x, xdest) < d(x0, xdest) do
    move toward xdest
  end while
  if d(x0, xdest) ≤ rs then
    sample at xdest
  else {there exists a node in x0's neighborhood closer to xdest than x0}
    forward to argmin_{x ∈ neighborhood(x0)} d(x, xdest)
  end if
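As an illustration, the decision logic of Algorithm 1 can be rendered in executable form. The sketch below is ours (function and parameter names such as `grasp_step` are not from the paper) and returns one decision at a time; a real node would re-evaluate it after each small movement step:

```python
from math import dist

def grasp_step(pos, neighbors, dest, r_s, step=0.01):
    """One forward-or-move decision of (a sketch of) Grasp.

    pos       -- current position of the node holding the request
    neighbors -- positions of its current neighbors (from beacons)
    dest      -- sampling location x_dest
    Returns ('sample', pos), ('forward', neighbor_pos) or ('move', new_pos).
    """
    if dist(pos, dest) <= r_s:
        return ('sample', pos)                  # fulfill the request locally
    closer = [n for n in neighbors if dist(n, dest) < dist(pos, dest)]
    if closer:                                  # greedy forwarding
        return ('forward', min(closer, key=lambda n: dist(n, dest)))
    # routing hole: move in a straight line toward the destination
    dx, dy = dest[0] - pos[0], dest[1] - pos[1]
    norm = dist(pos, dest)
    return ('move', (pos[0] + step * dx / norm, pos[1] + step * dy / norm))
```

Note that only the distance comparison uses neighbor positions; no communication range appears explicitly, matching the paper's claim that the unit disc model is not assumed.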

The strengths of the algorithm described above are its simplicity and the resulting characteristics. First, no assumption is made on the radio communication model. More specifically, the traditional unit disc model is not assumed and the communication range is not explicitly used by the algorithm. Actions are based only on purely localized and distributed information built from periodic beacons advertising the identifier and the position of a node. The second important characteristic relates to the deployment: Grasp acts with zero knowledge of the ROI. Grasp is purely distributed and decentralized. Its ultimate goal is to allow an action to be performed at any location of the ROI. The deployment of the nodes is not explicitly controlled by the geographical shape of the ROI but by the application requirements. If an agent needs an intervention at specific positions, it sends requests toward these positions. Nodes route requests and thus potentially move in order to satisfy them. The deployment of the nodes is dynamic and self-adapts to the shape, and to any potential evolution, of the ROI.

Assuming no packet loss and no node failure, Grasp provides sampling request fulfillment with probability one, regardless of the number of sensors: at each step, the distance between the node in charge of the message and the destination is reduced, either by moving toward the destination or by forwarding the message to a closer node. Therefore, Grasp outperforms traditional geographic routing as it takes advantage of node mobility to overcome dead-end routes. In addition, Grasp transparently increases the coverage by automatically filling routing holes. Intuitively, a node is required to move when the area between itself and the destination does not contain any other node; therefore, moving toward the destination fills the routing hole. More concretely, Grasp offers a spreading property in the sense that, in addition to filling routing holes, when making a node move, the node may not get closer to any other node than a given distance called the repulsion radius, close to the communication radius (a closed-form expression of this radius is given in Section 3.2). Moreover, the deployment of nodes is on demand. Such a reactive behavior does not require any specific hole detection or a pre-deployment computation phase. In addition, all properties of Grasp hold with regions of interest evolving in shape and size, without explicitly requiring awareness of such changes.

Note that we do not establish one rigid path from one node to another: routing a packet from the same source node to the same destination may again require some nodes to move. This is not a burden, as most of the targeted applications require sending an order without expecting an answer right away. For instance, the ultimate goal of our example application is to have some node check at a given location and raise an alarm if required, but not to send back any specific piece of information.

3

Theoretical Analysis

In this section, we present a theoretical analysis of our routing protocol for mobile WSNs with respect to self-deployment. We denote by a stable deployment a network configuration in which any point of the ROI can be reached (by forwarding and then sensing), without requiring any node to move, while using the greedy routing algorithm presented in Section 2. Due to the greedy nature of the forwarding algorithm, such a configuration should provide full greedy-connectivity. Our analysis considers the case where the sensing radius rs is equal to the communication range rc. We adopt the disc model assumption only to derive analytical results. Note that relaxing the assumption rs = rc = R impacts the optimal configuration, but the general sketch of the proofs still holds. The purpose of this paper is not to study exhaustively all the possible ratio values between rs and rc but to present a formal framework for Grasp. We prove that Grasp converges to a sub-optimal stable configuration (i.e., a stable configuration using 1.5 times the minimum number of nodes) provided that the number of nodes is sufficient.

3.1 Background

A recent study by Iyengar et al. [14] explores the problem of the optimal deployment of a static WSN ensuring a connected-coverage of a region (i.e., full coverage with full connectivity), focusing on the case rs = rc = R. Using geometric considerations, they derive a lower bound on the optimal node density to cover a zone in a connected way, and they propose a strip-based configuration which tightly approaches the bound; in addition, the strip-based configuration is asymptotically optimal. This configuration is composed of horizontal connected strips spaced by (1 + √3/2)R. This way, any two nodes on the same line can communicate. In addition, a vertical strip of nodes connects the horizontal strips together, ensuring the full connectivity of the network. However, a greedy geographic algorithm may not be able to reach any point of the ROI. Figure 2(a) depicts the strip-based deployment. In [1], Bai et al. extended those results by proving the asymptotic optimality of the strip-based deployment pattern for any rc/rs < √3 (not only rc = rs).

Fig. 2. Node deployments ensuring (a) asymptotically optimal connected-coverage and (b) optimal greedy connected-coverage

In their work, Iyengar et al. considered full connectivity, which characterizes a configuration where there exists a path between any two nodes. Our work, in contrast, focuses on network deployments where a greedy geographic algorithm can find a path between any two nodes. Full greedy-connectivity can be formalized as follows: for any destination point of the ROI, any node x0 in the network can communicate with a node x1 closer to the destination than x0. Based on this definition, it can be proved [10] that the hexagonal lattice (see Figure 2(b)) is the optimal deployment ensuring full greedy-connectivity, and the required density of sensors to cover the area using this mesh is 4√3/(9R²). In addition, this configuration ensures full coverage of the ROI provided that the sensing radius is greater than or equal to the communication radius.

3.2 Proof of Convergence

A first observation is that nodes cannot get closer to each other than a given characteristic distance δ called the repulsion radius. Consider a node x0 moving toward a point xdest of the ROI. Node x0 crosses the communication disc of a node x1 if its distance to the destination when it enters x1's communication disc is smaller than the distance between x1 and xdest. That is, using the notations of Figure 3(a):

  y ≥ √(R² − δ²) + √(y² − δ²).

This inequality holds for any y ≤ R / (2√(1 − (δ/R)²)). The condition y > R implies δ > (√3/2)·R. Therefore, even if a node x0 may cross the R-disc centered on a node x1 when traveling toward its destination xdest, the distance between x0 and x1 always stays larger than (√3/2)·R. We denote this minimum distance the repulsion radius. Note that the repulsion radius is smaller than the communication radius. One may thus model all nodes by physical soft balls of radius ranging from (√3/4)·R to R/2.

Based on this physical model analogy, running Grasp on a mobile WSN can be thought of as packing a set of balls inside a given frame [15]. As one may recall, the intrinsic action of Grasp on mobile nodes is rather to push them than to pack them. Each pair (x1, x2) of adjacent balls (d(x1, x2) ≤ R) presents two attraction sites (see Figure 3(b)). An attraction site is a place where the probability to move, in order to sample a location behind the line (x1x2), is null. Note that such a position also minimizes the size of the associated Voronoi region (the set of destinations making a node located at this position move to fulfill a sampling request). If a ball x0 is not located at an attraction site, in other words, if it is not adjacent to both x1 and x2 but only to one of them (x2, without loss of generality), then for each sampling location behind the line (x1x2) it has a strictly positive probability to move. More precisely, if the sampling location is located on the left of the perpendicular bisector of [x0x2] and behind the line (x1x2), the probability to move is non-null (related to the hatched area in Figure 3(c)). Moreover, moving toward a sampling location inside this area pushes the node closer to the attraction site, and the ball x0 then touches x1. During the process, additional nodes may also arrive and stick to one another. Eventually, all nodes converge to attraction sites with no more possibility to escape. Such a final configuration is the well-known stable triangular lattice. Note that we do not claim that only triangular tiles can be formed, but rather that they are the most likely to be formed. More specifically, hexagonal tiles allow greedy routing (as shown in the previous paragraph) and require fewer nodes, but they are not stable.
Indeed, small deviations in the nodes' positions in a hexagonal tile may lead to a reconfiguration of the tile into a triangular-like tile: a node routing toward the opposite node in the hexagon would move and stop near the center of the hexagon, forming equilateral triangles.

Fig. 3. Repulsion radius and attraction sites: (a) two nodes running Grasp cannot get closer than δ = (√3/2)·R; (b) the two points which form, together with x1 and x2, an equilateral triangle are attraction sites, where a node does not need to move to route a message behind the line (x1x2); (c) routing toward a destination in the hatched area forces x0 to move. Plain lines delimit the Voronoi regions associated to the nodes.
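The repulsion-radius bound δ > (√3/2)·R derived above can be checked numerically. The helper below is our sketch (the name `crossing_bound` is not from the paper): it evaluates the largest y = d(x1, xdest) for which the crossing inequality still holds, obtained by solving the equality for y:

```python
from math import sqrt

def crossing_bound(delta, R=1.0):
    # Largest y = d(x1, x_dest) satisfying
    #   y >= sqrt(R^2 - delta^2) + sqrt(y^2 - delta^2),
    # obtained by solving the equality for y.
    return R / (2 * sqrt(1 - (delta / R) ** 2))

# At delta = (sqrt(3)/2) R the bound equals R: a trajectory can cross the
# disc while y > R only if it keeps a distance delta > (sqrt(3)/2) R.
print(crossing_bound(sqrt(3) / 2))  # ≈ 1.0
```

The bound grows past R exactly when δ exceeds (√3/2)·R, matching the derivation in the text.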

4

Simulation Results

In this section we present the results of computer simulations and show that they closely match the theoretical analysis conducted above.

4.1 Experimental Setup

In order to test the self-deployment capabilities of Grasp, sensor nodes are deployed uniformly at random in a restricted area of the ROI. The considered region is a square folded into a torus (note that we use a toric ROI for the sake of simplicity; using a square ROI only causes minor changes in the geometric deployment pattern near the borders). With respect to the expression of the optimal density given in Section 3, it requires 500 nodes to be connected-covered. In order to assess the threshold beyond which Grasp converges, we consider several values of ρ = N/Nopt close to 1.5. This enables us to study the behavior of Grasp under slight under- or over-dimensioning of the number of nodes; the results can also be used to evaluate the impact of failures on the deployment.

The system dynamics are controlled, via Grasp, by message emission. We use a constant message emission rate λ, uniform amongst the nodes. Thus, during one time unit, λ·N messages are routed from a node drawn uniformly at random from the full set of nodes to a destination drawn from a uniform distribution over the entire ROI. For the sake of simplicity, we consider the routing of one message at a time; the case of concurrent routing is tackled in Section 5.1. In this context, we evaluate Grasp's behavior and convergence properties along the following metrics:

Average distance covered per node to deliver a message (d˜): this metric reflects the uniformity of the node deployment and thus the distance between the current configuration and the optimal deployment. An optimal deployment being a network configuration where any point of the ROI can be reached without moving, d˜ should decrease to zero as time tends to infinity (assuming a constant emission rate λ) provided that the number of nodes is sufficient. In addition, assuming that a move is much more time-consuming than a wireless transmission, this metric reflects (i) the global mechanical (i.e., movement) energy consumption of the system and (ii) the average delivery delay.

Distribution of the distance covered by a node to deliver a message (pD): this metric brings additional information on the network load. More specifically, it reflects how the mechanical energy consumption is distributed amongst the nodes. The probability density function of d in a stable configuration is an impulse located at zero.

Probability that at least one node moves to deliver a message (pm): the order of magnitude of the delivery delay depends on whether messages are delivered using only wireless communication or by moving to cross routing holes, regardless of the distance covered. Thus, pm is a good indicator of the quality of service provided by the system, dual to d˜.
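As an illustrative cross-check of this dimensioning, one can invert the optimal density expression of Section 3.1 to recover the region size that makes 500 nodes the optimal number. The side length is not stated in this excerpt, so the value below is our back-of-the-envelope deduction, not a figure from the paper:

```python
from math import sqrt

R = 1.0        # communication/sensing radius (rs = rc = R)
N_opt = 500    # nodes needed for optimal greedy connected-coverage

# Optimal density from Section 3.1: 4*sqrt(3)/(9 R^2) nodes per unit area.
density = 4 * sqrt(3) / (9 * R ** 2)
side = sqrt(N_opt / density)   # side of the (toric) square ROI
print(round(side / R, 1))      # ≈ 25.5: a square roughly 25.5 R on a side
```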

4.2 Evaluation

We present simulation results obtained by averaging over 25 runs of Monte-Carlo simulations. In each configuration, the metrics are computed by running 20 × N independent simulations. For instance, d˜ is evaluated by averaging the distance covered by a node to deliver 20 × N random messages.

Fig. 4. Evolution of the network topology over time, at (a) t = 1, (b) t = 5, (c) t = 50, and (d) t = 1000. A hundred nodes are initially uniformly deployed on a small area (1/100th) of a 2D toric ROI. The message emission rate is set to λ = 10. The plain lines represent the communication graph. For the sake of simplicity, edges between border vertices are not represented. Note the existence of a hexagonal tile and a few square tiles in the communication graph.

Evolution of the network topology. Figure 4 depicts the evolution of the network topology over time. To match our theoretical analysis, we consider in this experiment 1.5 times the minimum number of nodes required for an optimal deployment. We observe that, starting from a localized initial deployment, Grasp converges to a stable deployment of the ROI (i.e., connected-coverage). Figure 4(a) shows that the first stages of the deployment spread the nodes inside the entire ROI. Figures 4(b)-4(d) illustrate the aggregation process and the final configuration described in Section 3.2.

Average distance covered. Figure 5(a) depicts the evolution of the average distance covered per node to deliver a message, d˜. We consider a network with a number of nodes equal to ρ = 1, 1.25, 1.5 and 1.75 times the optimal. As expected from Section 3, d˜ tends to a non-null limit for ρ < 1.5 and to zero otherwise. This matches the theoretical results, since the triangular lattice formed by Grasp requires 1.5 times the minimum number of nodes to ensure greedy connected-coverage. Yet after 20 iterations, d˜ ≈ 4×10⁻⁴·R (when ρ = 1.5, and 2×10⁻⁴·R when ρ = 1.75), which is negligible: for a radius of 100 m, each node would be expected to move on average 4 cm. Note that using only 17% (1.75/1.5) more nodes than needed reduces the average distance covered by a node by 50%.

Fig. 5. Evaluation of Grasp in a 2D toric ROI: (a) average distance covered per node (relative to R) as a function of time (λ = 1); (b) load distribution in the system with respect to the average distance to be covered by a node to deliver a message (relative to R); and (c) the probability that Grasp uses mobility to deliver a message as a function of time (λ = 10). The curves correspond to ρ = 1, 1.25, 1.5, 1.75 in (a) and (b), and to ρ = 1.5, 1.75, 2 in (c).

Network load. Figure 5(b) depicts the distribution of the distances that nodes need to cover to deliver a packet after 20 time units. When the number of nodes is sufficient for Grasp to converge (i.e., ρ ≥ 1.5), the curves are decreasing and show that convergence is in progress. The high fraction of nodes which never move (up to 30% for ρ = 1.5), together with the shape of the histogram, reflects the existence of large greedy-connected components with low redundancy inside each of them: the nodes inside the greedy-connected components never move, and the ones on their borders move to ensure connectivity between the components. The fact that the distance covered by a node is very low (at most 1% of the communication radius for ρ = 1.5) implies that the components are spatially extended, reflecting a low redundancy (with respect to the initial deployment, where most of the nodes never move but the ones that do cover on average one fourth of the ROI). When the number of nodes is not sufficient for Grasp to converge (i.e., ρ < 1.5), the network load has a bell shape with a maximum at d = 25×10⁻⁵·R (for ρ = 1), a heavy tail for large values and a non-negligible fraction of still nodes (15% for ρ = 1). Therefore, the network load in terms of distance covered is well balanced between the nodes; the energy needed for routing is evenly shared between a large fraction of the nodes, resulting in an extended lifetime of the system.


Probability to move. Figure 5(c) presents the probability that at least one node moves to deliver a message, for a number of nodes sufficient for Grasp to converge. With the minimum number of nodes (ρ = 1.5), pm decreases quickly to 20% and then converges slowly to zero. For slightly higher values of ρ, the first decreasing stage is drastically sped up: pm ≈ 2% after fewer than 20 iterations.

Summary. Simulations confirm the theoretical results presented in Section 3. Grasp requires only 1.5 times the minimum number of nodes to converge, resulting in a triangular lattice deployment. Not only do we believe that this is a reasonable bound, but this increased number of nodes is leveraged in several ways. First, the triangular lattice provides, over the other regular lattices (including the hexagonal optimal deployment), an increased resilience to failures, as each node is provided with two potential neighbors in any direction. Second, as we explain in Section 5.2, the fact that the optimal hexagonal lattice is a subgraph of the triangular one can be exploited by a clever sleep-wakeup scheme to increase the lifetime of the system by 50%, fully justifying the factor of 1.5 over the minimum number of nodes.

5

Considering Practical Matters

Most of the results presented above are of a theoretical nature, yet we believe that Grasp can be efficient in practice. So far, we have assumed that only one routing operation was processed at a time; Section 5.1 provides an algorithm to handle concurrent routing operations, as will happen in practice. Then, Section 5.2 provides a sleep-wakeup scheme which improves the system lifetime.

5.1 Handling Concurrent Routing

So far, only one sensing request at a time was considered. This simplifying assumption allowed us to derive a formal proof of convergence of the network topology, but might be restrictive. In this section, we consider concurrent routing operations. Grasp should keep ensuring that (i) routing operations eventually succeed and (ii) two nodes never get closer to each other than a fixed repulsion radius. We propose a set of modifications to Grasp so that these properties are still ensured, and discuss their efficiency with respect to delays and distance covered.

Priority queue. We assume that each node maintains a queue of messages ordered by priority. To each message is associated a Time From Emission (TFE), updated every time a message is inserted into or extracted from the priority queue; the message with the highest TFE is the head. The node movement is driven by the message being processed, namely the queue's head. Note that the number of messages older than a given message is finite and decreasing. This ensures that the delivery of messages is eventually guaranteed, regardless of the heuristic chosen to forward the message.
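A minimal sketch of such a TFE-ordered queue is shown below. This is our illustration, not the paper's implementation: instead of re-computing the TFE on every queue operation, we keep each message's emission timestamp and pop the smallest one, since the oldest emission time is exactly the highest TFE; a counter breaks ties:

```python
import heapq
import itertools

class MessageQueue:
    """Per-node queue whose head is the oldest message (highest TFE)."""

    def __init__(self):
        self._heap = []
        self._tie = itertools.count()   # FIFO tie-break for equal timestamps

    def push(self, emission_time, dest):
        heapq.heappush(self._heap, (emission_time, next(self._tie), dest))

    def head(self):
        """Destination of the oldest pending message (drives movement)."""
        return self._heap[0][2] if self._heap else None

    def pop(self):
        return heapq.heappop(self._heap)[2]

q = MessageQueue()
q.push(5.0, 'B')
q.push(2.0, 'A')   # emitted earlier, so larger TFE: becomes the head
print(q.head())    # 'A'
```

This assumes timestamps comparable across nodes (e.g., from the GPS clock already assumed by the system model).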


Opportunistic forwarding. The simplest forwarding heuristic is to forward only to static nodes. A less conservative heuristic is to forward a message to a node which is closer to the destination and, if moving, getting closer to the destination. To this end, nodes exchange their speed vectors, piggybacked in beacons and fully characterized by the node's position and the destination of the message currently being processed. A third heuristic, ensuring the repulsion radius between any pair of nodes, considers the case of two moving nodes running into each other. Under this heuristic, two nodes getting in contact should be repulsed from one another, in analogy to the billiard model [15]. To this end, the two nodes exchange their positions and their current heads (destination and age). The node closest to the destination of the oldest head takes over all the messages and starts processing them; the other node merely stops. This ensures that the two nodes are moving away from one another.

5.2 Sleep-Wakeup

As proved in Section 3 and demonstrated in Section 4, a network of more than 1.5 times the optimal number of nodes running Grasp converges to a triangular-like lattice. Interestingly enough, the optimal hexagonal lattice is a subgraph of the triangular one. Based on this remark, we propose a simple yet powerful sleep-wakeup scheme leveraging the network topology to increase the system lifetime. The triangular lattice (Figure 6(a)) is the union of three hexagonal lattices (Figures 6(b)-(d)), each node of the triangular lattice belonging to exactly two of them. Our sleep-wakeup scheme can be described as follows:

Clustering: the first step is to detect the triangular-lattice-based components, the sleep-wakeup algorithm being executed independently in each one of them. We assume that each node maintains the identifier of the component it belongs to. A node is able to determine whether it is located at the center of a hexagonal tile using its neighbors' coordinates. A node at the center of a hexagonal tile sets its component identifier to its own identifier and forwards it to the neighbors forming the hexagonal tile; otherwise, the component identifier remains unset. Upon reception of such a message, a node updates its component identifier if it is still unset or lower than the one received. Only nodes located at the center of a hexagonal tile forward the identifier further, to the vertices of their hexagonal tile; otherwise, the message is ignored. Eventually, the nodes inside a triangular-lattice-based component share the same component identifier, and the node whose identifier is that of the component is its natural leader. This clustering task is periodically executed.

Sleep-wakeup periods: each triangular component of the network alternates between three hexagonal configurations in a round-robin manner. The leader (node L in Figure 6) is in charge of spreading the sleep messages to the centers of the hexagonal tiles of the current configuration. The effect of a sleep message is to put the receiving node in sleep mode for a given period of time. Leveraging the geometry of the network, the spreading of sleep messages can be done optimally with respect to the number of packets sent.
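The hexagonal-tile-center test used by the clustering step can be sketched as a purely geometric check on the neighbors' coordinates. The rule below is our heuristic rendering (the paper does not give an exact criterion): a node is considered a center if six neighbors sit at distance approximately R, evenly spaced by approximately 60 degrees:

```python
from math import atan2, cos, dist, pi, sin

def is_hex_center(pos, neighbor_positions, R, tol=0.05):
    """Heuristic test (our sketch): is `pos` the center of a hexagonal tile?"""
    # Keep neighbors lying on the ring of radius ~R around the node.
    ring = [p for p in neighbor_positions if abs(dist(pos, p) - R) <= tol * R]
    if len(ring) != 6:
        return False
    # Check that the six ring neighbors are spaced by ~60 degrees.
    angles = sorted(atan2(p[1] - pos[1], p[0] - pos[0]) for p in ring)
    gaps = [angles[i + 1] - angles[i] for i in range(5)]
    gaps.append(2 * pi + angles[0] - angles[5])   # wrap-around gap
    return all(abs(g - pi / 3) <= tol * 2 * pi for g in gaps)

# A node surrounded by a regular hexagon of radius R is detected as a center:
hexagon = [(cos(k * pi / 3), sin(k * pi / 3)) for k in range(6)]
print(is_hex_center((0.0, 0.0), hexagon, R=1.0))  # True
```

The tolerance `tol` absorbs the small positional deviations expected after convergence; its value here is an arbitrary illustration.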


Fig. 6. Sleep-wakeup scheme: (a) nodes inside a triangular lattice adopt in a cyclic manner the three sleep-wakeup policies (b), (c) and (d). For the sake of readability, the leader L is reported in the three figures as a reference point. In each configuration, one third of the nodes are asleep and the communication graph is a hexagonal lattice.

Consider a system with the minimum required number of nodes deployed on a hexagonal lattice and assume that each node has a lifetime of one time unit; the system lifetime is then also one time unit. Now, consider a system of 1.5·Nopt nodes using the sleep-wakeup scheme presented above. Setting its working period (the length of time during which the system stays in each of the three configurations) to 0.25 time unit, the global system lifetime is extended to 1.5 time units (i.e., six periods). Indeed, each node is asleep two periods out of six, yielding a total of four awake periods, that is, one time unit. In other words, the additional factor of 1.5 over the optimal number of nodes is fully leveraged by extending the lifetime of the system to 1.5 time units.
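This lifetime arithmetic can be replayed mechanically. The sketch below is our simplification: one representative node per hexagonal sub-lattice, each with an awake-time budget of 1.0 time unit, sleeping in exactly one of the three round-robin configurations:

```python
period = 0.25           # working period (one configuration)
battery = [1.0] * 3     # awake-time budget of a representative of each lattice
t, config = 0.0, 0      # elapsed time; active configuration index

while True:
    awake = [i for i in range(3) if i != config]   # lattice `config` sleeps
    if any(battery[i] < period - 1e-9 for i in awake):
        break                       # an awake node can no longer last a period
    for i in awake:
        battery[i] -= period        # awake nodes drain their budget
    t += period
    config = (config + 1) % 3       # round-robin over the configurations

print(t)   # 1.5 (six periods of 0.25)
```

Each representative is awake four of the six periods, spending exactly its 1.0-unit budget, so the whole schedule runs for 1.5 time units before any awake-scheduled node is exhausted.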

6

Conclusions and Future Work

In this paper, we considered a network of mobile wireless sensors. Their mobility being controlled, we proposed Grasp, a novel and simple stateless algorithm which leverages node mobility to route sensing requests. Grasp transparently adapts to evolving regions of interest, with respect to size or shape, without requiring explicit awareness of such changes. Our algorithm is independent of the communication medium and uses very simple forwarding and motion-planning techniques; thus it is directly applicable to any low-capability wireless network. Assuming a disc model for communications and a random choice of the sensing locations inside the region of interest, we proved that a network running Grasp converges to a configuration ensuring greedy connected-coverage of the region. In that sense, the simplest routing algorithm leveraging node mobility provides the network with self-deployment properties for free. The number of nodes required to ensure convergence is 1.5 times the optimal. Our simulation results match our theoretical analysis. Finally, we provided Grasp with an ad-hoc sleep-wakeup scheme extending the system lifetime by up to 50% without jeopardizing the greedy connected-coverage; this fully justifies the overhead factor over the optimal number of nodes. We also tackled concurrent request routing. We plan to investigate these two tracks further and to evaluate Grasp's behavior using a more realistic model for wireless communications.


References

1. Bai, X., et al.: Deploying wireless sensors to achieve both coverage and connectivity. In: MobiHoc 2006, pp. 131–142 (2006)
2. Tan, G., Jarvis, S.A., Kermarrec, A.M.: Connectivity-guaranteed and obstacle-adaptive deployment schemes for mobile sensor networks. In: ICDCS 2008 (2008)
3. Butler, Z., Rus, D.: Controlling mobile sensors for monitoring events with coverage constraints. In: ICRA 2004, vol. 2, pp. 1568–1573 (2004)
4. Jain, S., Fall, K., Patra, R.: Routing in a delay tolerant network. In: SIGCOMM 2004, pp. 145–158 (2004)
5. Mauve, M., Widmer, J., Hartenstein, H.: A Survey on Position-Based Routing in Mobile Ad-Hoc Networks. IEEE Network Magazine 15(6), 30–39 (2001)
6. Hull, B., et al.: CarTel: a distributed mobile sensor computing system. In: SenSys 2006, pp. 125–138 (2006)
7. Zhao, W., Ammar, M.H., Zegura, E.W.: A message ferrying approach for data delivery in sparse mobile ad hoc networks. In: MobiHoc 2004, pp. 187–198 (2004)
8. Zhao, W., Ammar, M., Zegura, E.: Controlling the mobility of multiple data transport ferries in a delay-tolerant network. In: INFOCOM 2005, vol. 2, pp. 1407–1418 (2005)
9. Burns, B., Brock, O., Levine, B.N.: MORA routing and capacity building in disruption-tolerant networks. Ad Hoc Networks 6(4), 600–620 (2008)
10. Huguenin, K., Kermarrec, A.M., Fleury, E.: Route in Mobile WSN and Get Self-Deployment for Free. Research Report 6819, INRIA (January 2009), http://hal.inria.fr/inria-00357240/en/
11. Kwon et al.: Resilient localization for sensor networks in outdoor environments. In: ICDCS 2005, pp. 643–652 (2005)
12. Fang, Q., Gao, J., Guibas, L.: Locating and bypassing routing holes in sensor networks. In: INFOCOM 2004, vol. 4, pp. 2458–2468 (2004)
13. Karp, B., Kung, H.T.: GPSR: greedy perimeter stateless routing for wireless networks. In: MobiCom 2000, pp. 243–254 (2000)
14. Iyengar, R., Kar, K., Banerjee, S.: Low-coordination topologies for redundancy in sensor networks. In: MobiHoc 2005, pp. 332–342 (2005)
15. Lubachevsky, et al.: Spontaneous patterns in disk packings. In: Bridges 1998: Conf. on Mathematical Connections in Art, Music, and Science (1998)

A Sensor Network System for Measuring Traffic in Short-Term Construction Work Zones

Manohar Bathula¹, Mehrdad Ramezanali¹, Ishu Pradhan¹, Nilesh Patel², Joe Gotschall², and Nigamanth Sridhar¹

¹ Electrical & Computer Engineering, Cleveland State University
² Civil & Environmental Engineering, Cleveland State University

Abstract. In this paper, we present the design and implementation of a sensor network system for monitoring the flow of traffic through temporary construction work zones. As opposed to long-term work zones, which are common on highways, short-term or temporary work zones remain active for a few hours, or a few days at most. As such, instrumenting temporary work zones with monitoring equipment similar to that used in long-term work zones is not practical. Yet, these temporary work zones present an important problem in terms of crashes occurring in and around them. Our design for a sensornet-based system for monitoring traffic is (a) inexpensive, (b) rapidly deployable, and (c) nearly maintenance-free. We report on our experiences in building this system and testing it in live work zones in the Cleveland area.

1 Introduction

Construction work zones on roadways are hazardous areas. Motorists are exposed to unfamiliar situations in a normally familiar setting, and such unexpected unfamiliarity could lead drivers to behave in unforeseen ways. Highway work zones are typically long-term installations that are put in place for several weeks, if not months. Such installations are instrumented with a wide variety of sensing and monitoring equipment for observing traffic behavior and for recording unexpected situations. In contrast, utility work commonly takes a few hours, or at most a few days, to complete. It is not economically feasible to instrument such work zones with the same kinds of equipment used in highway work zones. In fact, in almost every such work zone we see in our neighborhoods, there is no way of tracking and monitoring traffic.

Consider the following scenario. The local electric utility company needs to perform maintenance on some street, for which they need to encroach into a portion of the street. The utility workers bring their equipment in a utility truck, and before beginning work, they deploy construction cones to demarcate the work area and to warn drivers. While this level of visual warning works well enough for motorists who are already driving on the street, a motorist who is a mile away, or even just around the corner, typically has no indication of the potential hazard.

This work was supported in part by grants from the NSF (CNS-0746632, CAREER and REU), a grant from the US DOT by way of the CSU UTC, and an undergraduate research Engaged Learning grant from CSU. Thanks are also due to Area Wide Protective for allowing us to use their live work zones as testbeds.

B. Krishnamachari et al. (Eds.): DCOSS 2009, LNCS 5516, pp. 216–230, 2009. © Springer-Verlag Berlin Heidelberg 2009


Further, in the scenario above, if a crash were to occur, the authorities are notified. From interviews of the people present at the time, the causes of the crash may be reconstructed. Such a reconstruction may be flawed: the driver in question may not divulge key errors on their part, witnesses may not have been paying complete attention, and so on. Nevertheless, there is at least a record of an incident. There are cases, however, where a motorist may come close to crashing, but is able to recover at the last instant. Such near-crashes are never recorded. These near-incidents are important: the reason that the motorist was put in that situation may have had something to do with the design of the work zone. If such instances were recorded and correlated with work zone design, transportation safety engineers could work on avoiding similar cases in the future.

Our work is motivated by collaborations with the CSU University Transportation Center [8]. This UTC is specifically focused on improving safety in work zones. Through the UTC, we worked with a local flagging company, Area Wide Protective (AWP), to define the problem space and to identify requirements (Section 2). We have designed a complete sensornet system for monitoring work zones. Our system collects data in work zones and presents it for two kinds of uses: first, we provide summary information of traffic activity around the work zone for post facto analysis, for correlating near-crash instances with work zone design; second, we publish traffic statistics to the internet. Where wireless internet connectivity is available in the work zone, our system publishes real-time statistics to MSR SensorMap [21]. We make the following contributions in this paper:

1. The design and prototype implementation of a sensornet system to monitor traffic in short-term work zones.
2. A software architecture (implemented in TinyOS/nesC) for collecting a variety of traffic statistics, such as flow, density, and vehicle trajectories.
3. Reports of real deployment experiences with temporary work zones.

The rest of the paper is organized as follows. We outline the design requirements that distinguish sensornet deployments in short-term work zones in Section 2. Following this, we describe our system architecture, and our hardware and software design, in Section 3. We present results from our evaluation and deployment experiences in Section 4. After discussing some of the lessons we learned during this research in Section 5 and related work in Section 6, we conclude in Section 7.

2 Design Requirements

Deployment Requirements. Short-term work zones present some design requirements that are distinct from typical sensornet deployments:

1. Rapid deployment. These work zones are only active for a few hours, and the network must be ready in a few minutes.
2. Inexpensive. The cost of sensornet hardware must be kept to a minimum.
3. No skilled maintenance. The nodes in these systems must not require any skilled maintenance from sensornet experts.
4. Self-organization. While sensornet deployments in short-term work zones are not completely ad hoc, they are not guaranteed to be "highly-engineered": the placement of nodes in the network cannot be pre-determined.



Fig. 1. Deployment architecture. Data collected from the work zone is uploaded to a server for archival and analysis and to SensorMap.

Fig. 2. Our sensornet system deployed in a work zone. The motes are mounted on safety cones, which are placed as per the MUTCD.

Data Requirements. Based on our discussions with the researchers at the CSU UTC, the most important kinds of data to be collected were:

– Traffic statistics such as flow (vehicles per hour), density (average vehicles per mile), and average speed of vehicles traveling through the work zone.
– Trajectories of vehicles as they travel through the work zone. When cars deviate from a uniform straight line, there is potential for crash incidents, since they may come close to construction equipment or workers.
– Aberrant behavior of vehicles. A work zone is designed in such a way that vehicles should still be able to maintain uniform speed. Cases where vehicles suddenly brake, for example, may be indicators of unsafe situations.
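The three summary statistics above can be computed directly from per-vehicle detection timestamps at the first and last nodes of the sensor array. The sketch below is illustrative only: the input format and function names are our assumptions, and density is derived via the standard flow = density × speed relation, not from the paper's implementation.

```python
# Illustrative sketch (not the paper's code): computing flow, density, and
# average speed from per-vehicle timestamps. `detections` holds, for each
# vehicle, its detection times (seconds) at the first and last node of the
# array; `zone_length_ft` is the distance between those two nodes.

FT_PER_MILE = 5280.0

def traffic_stats(detections, zone_length_ft, capture_hours):
    speeds_mph = [zone_length_ft / (t_last - t_first) * 3600.0 / FT_PER_MILE
                  for t_first, t_last in detections]
    flow = len(detections) / capture_hours         # vehicles per hour
    avg_speed = sum(speeds_mph) / len(speeds_mph)  # mph
    density = flow / avg_speed                     # average vehicles per mile
    return flow, density, avg_speed
```

For example, two vehicles each crossing an 80 ft zone in 4 s during a one-hour capture yield a flow of 2 vehicles/hour at roughly 13.6 mph.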

3 System Architecture and Design

Given the design requirements for this problem, we wanted to come up with the simplest design of a sensornet that would still be able to provide the appropriate kinds of data. To gather traffic statistics such as flow, density, and average speed, a simple array of proximity sensors can be used to count vehicles that move past the array. To compute vehicle trajectories, however, proximity sensors by themselves are not sufficient, since the distance from the sensor to the vehicle is also needed. Accordingly, we use an array of ranging sensors (Section 3.1). Along the roadway of interest, an array of nodes with ranging sensors is deployed (Figure 1). Each sensor node is also capable of transmitting the sensed samples to a local base-station. The base-station is connected to a centralized server that is responsible for data archival and analysis. Most construction and utility trucks are equipped with a GPS receiver and a broadband internet connection, and our base-station can use this connection to access the Internet.


Sensor Placement. While the sensor node placement in the network is not highly engineered, the nodes are placed in a predictable manner. They are placed along the side of the roadway being monitored such that the following assumptions are met:

– The entire width of the roadway falls inside the sensitivity region of the sensors.
– Separation between nodes in the network is uniform.

This deployment architecture, and the assumptions it makes, is quite well suited to the target application. For one, the sensing hardware can be integrated easily into the work zone: it can be mounted on the safety cones. Further, road construction personnel in work zones already have specific parameters that they need to meet in order to put together a safe work zone. There are guidelines on the distance between cones, and on the placement of cones. The Manual of Uniform Traffic Control Devices (MUTCD, chapter 6) describes the rules of how to place traffic control devices (safety cones, in this case) in short-term work zones [9]. These guidelines and practices can be easily exploited in the design of our deployment. Figure 2 shows a photograph of our sensor nodes deployed in a live work zone run by our collaborators, Area Wide Protective. We did not modify the work zone or the placement of cones in any way. The black box (shown in inset in Figure 2) contains our sensor node hardware. The box itself is fastened to a plastic cup. This cup is placed on top of the safety cone. When mounted on the cone, the box is stable, while still being extremely simple to mount.

3.1 Hardware Design

Figure 3 shows the internals of the sensor node unit used in our sensornet deployments. Each node has the following elements, all of them off-the-shelf components:

1. Processing and communication unit: We use a TelosB mote [24] in the box. The mote's USB connector is exposed outside the box for programming and charging.
2. Sensors: We use an infra-red ranging sensor to detect obstructions through holes drilled in the side of the box. There is also a magnetometer to classify obstructions.
3. Battery source: We use an Ultralife rechargeable lithium battery as the power source.

Detecting Vehicles. To detect vehicles, we use a ranging sensor that can not only detect an intrusion, but also provide the distance of the intrusion from the sensor. We use a Sharp infra-red ranging sensor (GP2Y0A700K0F) [26]. This sensor has a sensing range of 5.5 m, and provides an analog voltage signal (0.5–3 V) based on the distance of the reflective obstruction. Figure 4 shows the output voltage profile that the sensor provides as a function of the distance of the obstruction. The sensor needs an input operating voltage of 4.5–5.5 V. We use a Maxim MAX756 step-up converter to convert the voltage from the battery to the required input voltage for the sensor.

Battery Maintenance. The battery in the node is the only component that needs regular maintenance. In order to simplify maintenance, and to avoid replacing batteries in the box often, we use a rechargeable battery. Further, we connect a Telos charger board [23] to the TelosB mote, and connect the rechargeable Ultralife lithium battery to the board. Whenever the TelosB mote is plugged into a USB port, the battery is charged; when the mote is not connected, the battery powers the mote.
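Sharp triangulation rangers of this family produce an output voltage that falls off roughly inversely with distance, which is why the profile in Figure 4 is steep up close and flat far away. The sketch below shows one common way such a sensor can be linearized in software; the model form v = A/(d + B) and the constants A and B are hypothetical calibration values for illustration, not figures from the GP2Y0A700K0F datasheet.

```python
# Illustrative voltage-to-distance conversion for an inverse-law ranging
# sensor. A and B are HYPOTHETICAL calibration constants (a real deployment
# would fit them to measured voltage/distance pairs like those in Fig. 4).

A = 300.0   # volt-centimeters, assumed fit constant
B = 10.0    # centimeters, assumed offset

def voltage_to_cm(v):
    """Invert the assumed model v = A / (d + B), valid for 0.5V <= v <= 3.0V."""
    if not (0.5 <= v <= 3.0):
        return None            # outside the sensor's usable output range
    return A / v - B

def cm_to_voltage(d_cm):
    """Forward model, useful for checking the fit against calibration data."""
    return A / (d_cm + B)
```

With these assumed constants, the 0.5–3 V output span maps to roughly 0.9–5.9 m, close to the sensor's stated 5.5 m range; a real fit would be made against bench measurements.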


Fig. 3. Our sensor node includes a TelosB mote with a ranging sensor and a magnetometer

Fig. 4. Output voltage produced by the Sharp GP2Y0A700K0F infra-red ranging sensor as a function of obstruction distance (0–12 feet)

3.2 Software Services and Design

Collecting Sensor Data. Each node in the network periodically samples its two sensors to detect vehicles moving past the node. The collected samples are sent to the base-station for processing. A point to note here is that this network runs at a duty cycle of 100%. The goal of this system is to capture complete information about vehicle traffic over a short period of time. Short-term work zones are only active for a few hours at a time. Between deployments, the batteries in the sensor nodes can be recharged.

Time Synchronization. The work zones that our sensornet targets are short-term work zones deployed on urban streets. The maximum speed on these roads is 35 mph. Given this speed, and the sampling duration of the ranging and magnetometer sensors, we sample our sensors at 10 Hz. The typical distance between nodes in short-term work zones is about 10 feet. Based on this inter-node distance, a car moving at 35 mph will take about 195 ms to travel from one node to the next (35 mph is about 51 feet per second, so the time to travel 10 feet is 10/51 s ≈ 195 ms). We use the stabilizing clock synchronization protocol [12], which achieves an accuracy of 300 µs.

Network Self-Organization. Our first attempt at network organization was to use a neighbor localization algorithm using RSSI between nodes, based on [13]. In our setup, neighboring nodes are roughly 10 feet apart. So each node needs to identify its two nearest neighbors (the two nodes that are 10' on either side), and distinguish them from nodes that are further away. The key requirement, therefore, is that a node p should be able to distinguish between a node q that is 10' away and a node r that is 20' away. However, our own observations were not as consistent as [13]. In fact, we were not able to distinguish between RSSI readings at all between nodes 10 and 40 feet apart (Figure 5). What we observed was consistent with [28]: RSSI is a good indicator of link quality at some levels, but it is not a good indicator of distance (at least at the granularity we were interested in). In [1], the authors discuss using statistical methods or neural networks to estimate distance. We abandoned this approach since these algorithms made our system too complex, and opted for a simpler, centralized approach to network organization: using the time-stamp information contained in the sensor messages to order the nodes at the base-station (described below).
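The timestamp-based ordering can be sketched as follows. Since the nodes are time-synchronized, every vehicle visits the nodes in spatial order, so sorting each training vehicle's detection timestamps ranks the nodes, and averaging the ranks over many vehicles absorbs dropped messages and missed samples. The input format and the rank-averaging scheme here are our assumptions, a sketch of the idea rather than the authors' implementation.

```python
# Sketch: ordering the node array from synchronized detection timestamps.
# `training_vehicles` uses a hypothetical format: one dict per training
# vehicle, mapping node id -> detection timestamp (seconds).

from collections import defaultdict

def infer_node_order(training_vehicles):
    rank_sum = defaultdict(float)
    rank_cnt = defaultdict(int)
    for stamps in training_vehicles:
        # nodes that saw this vehicle, in order of detection time
        for rank, node in enumerate(sorted(stamps, key=stamps.get)):
            rank_sum[node] += rank
            rank_cnt[node] += 1
    # order nodes by their average rank over the whole training set
    return sorted(rank_sum, key=lambda n: rank_sum[n] / rank_cnt[n])
```

Even if a node's report for one vehicle is lost, its average rank over the rest of the training set still places it correctly in the array.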


Fig. 5. RSSI (dBm) as a function of distance (feet), at transmit power levels 31, 15, and 4

Fig. 6. Spatial reuse and goodput: per-node throughput (percentage) achieved with and without spatial multiplexing

We use a simple minimum spanning tree as the routing structure to transfer data from the network to the base-station. Once the routing tree is formed, the base-station disseminates two pieces of information to the network: (i) the depth of the routing tree, and (ii) the inter-node distance. The base-station is provided with the inter-node distance at the time of deployment. This is the only parameter that the system needs. We keep this a deployment-time parameter because the exact physical separation between the safety cones is only known at the time of commissioning the work zone. Once the routing tree is formed, and the nodes are synchronized, they begin sampling their sensors to detect vehicle traffic. The samples are reported via multi-hop routing to the base-station, which can reconstruct vehicle paths using the time-stamp information available in the messages. Further, using the time-stamp information, the base-station can discover the topology of the network and the ordering of the nodes in the array: the time-stamps from different nodes tracking the same car will be in the order that the nodes are placed, since all nodes are synchronized. This simple sorting based on time-stamps, using the tracked target itself for localization, turned out to be more accurate than other distributed localization schemes. We use the first 50 samples from each node as a training set for the base-station to converge on the topology of the network (the training set is this large to absorb errors caused by dropped messages, missed samples, etc.). Once the training period is complete, the base-station disseminates the topology information to the network.

Sensor Sampling. The function of each node is to detect an obstruction that is crossing its "line of vision," classify it as a vehicle, and estimate the distance (from the sensor) at which the vehicle crossed. The ranging sensor described earlier (Section 3.1) provides the distance information. For classifying the obstruction as a vehicle of interest, we use the magnetometer. The magnetometer sample is only used for classification and is not reported. Each data sample consists of three fields: Timestamp (4 bytes), Distance (1 byte), and Vehicle count (2 bytes).

Every sensed sample is logged to the external flash on each node. This is done so that post facto analysis can recover data samples missed due to lost network packets. In addition, each node keeps a growing buffer of the recent samples that have not yet been uploaded to the base-station. These samples are uploaded to the base-station in batches.

Data Reporting. Each cycle of data acquisition needs to sample the sensors, and then report the sampled data to the base-station if a vehicle is detected. If all the nodes were


transmitting their sensed data to the base-station at the end of every sensing cycle, the amount of wireless traffic in the network would be too high, leading to poor goodput. In fact, in our initial experiments, this is exactly what we observed: the yield of the network, even a small one with 9 nodes, was only around 73%. Instead, we use a delayed reporting scheme with the goal of improving goodput. The reporting scheme we use makes use of spatial multiplexing similar to the Flush protocol [15], which is designed for bulk data transfer over large numbers of hops. Our networks are simpler, in that the number of hops to the base-station is about 4–6. We used a simplification of the spatial reuse scheme by scheduling exactly one node to upload data in each slot. Using spatial reuse, we were able to increase the goodput of messages to nearly 100% (Figure 6).

We design our batch uploads such that a node p1 can transfer all the data it has to upload to its parent p2 in the routing tree within the duration of time that a vehicle will take to travel from p1 to p2. In this manner, if p1 begins the transfer immediately upon seeing a vehicle, then the transfer can be completed before p2 sees the same vehicle. The default TinyOS active message payload size used on the TelosB mote is 28 bytes [17]. In addition to the three fields above, each node will also need to include its node id (2 bytes) in each message. So each message can carry up to three vehicle samples (21 bytes), and the size of each message is 41 bytes including header and footer. If the nodes in the network are placed d_node apart, the minimum time a vehicle takes to travel this distance is T_node, the time taken for a message to travel from sender to receiver is t_m, and h_max is the maximum hop count of any node in the network to the base-station, then the size of the buffer on each node is at most b = 3 × T_node / (t_m × h_max). In most urban work zones, the safety cones (and consequently, the sensor nodes) are placed 10 feet apart (d_node), and the typical speed limit is 35 mph, so T_node is about 195 ms. The message delay (t_m) is about 8 ms for the 41-byte message [4], and in most of our test networks, the height of the routing tree (h_max) is 3. So the size of the buffer on each node is 24: each node can cache 24 vehicle samples for each reporting cycle.

We employ a mutual exclusion scheme to schedule data transfers from each node. Only the node that has the mutex token transfers data; the other nodes in the network are either idle, or are participating in multi-hop routing. The first node in the network assumes the token to begin with. When this node has accumulated b samples, it begins the transfer process. The message transfer process is started immediately upon completing a sample; this way, the sender node knows that its immediate neighbor in the array will not see the same vehicle for T_node, by which time all the data will have been transferred. After sending all the messages (b/3), it sends the mutex token to the next node in the array. The next node in the array can now begin its own data transfer process. This process continues until the last node has had a chance to upload its data. Notice that all nodes in the network can transfer their cached data in the time it takes for a single car to move through the network. Once the last node in the network has transferred all of its data in that round of transfers, the base-station disseminates a completion signal. This completion signal serves to hand the mutex token back to the first node in the network, and the reporting cycle repeats approximately every b vehicles.

Computing Vehicle Trajectories. The basic idea behind our trajectory tracking system is quite simple: whenever a vehicle crosses the sensing region of a sensor, the mote


takes a sample and sends a "sensor-to-target distance" measurement to the base-station. This message is packaged along with the node's ID and its local timestamp. For now, let us consider the simplest formulation of this problem: that there is only one vehicle moving through the array. We will discuss multiple targets later. The base-station learns the network topology during the training period, and can locate a node with ID k at a particular coordinate location (x_k, y_k). This coordinate location, in addition to the target distance, can be used to compute the target's coordinate location at time t_k: (x_target(t_k), y_target(t_k)). With sensor readings from all the nodes in the array, the base-station can assemble an ordered list of points through which the target traveled. A simple curve passing through these points will give us an approximation of the actual path the vehicle took. However, the sensor, based on our calibration, has an error margin of about a foot. This is nearly 7% of the entire sensing range! To improve the accuracy of the trajectory mapping algorithm, we implemented a particle filter [7] algorithm based on the one in [27]. At the base-station, the particle filter generates a set of random points for each sensor sample. Based on a cost function (described below), the particle filter then prioritizes these points. The point from each set with the least cost is picked for the candidate trajectory. In [27], the authors use a particle filter to detect multiple targets in a 1-dimensional space using binary proximity sensors. Ours is a 2-dimensional space, and we modified the setting accordingly. In our 2-dimensional space, the sensors are arranged along the x-axis, and we consider the sensor's range to be a straight line along the y-axis. We use a combination of two cost functions, one for each of the x and y dimensions. In the x dimension, we use an approach that uses the speed information of the target to bias the cost function.
Based on the speed of the target calculated by pairs of sensors, the particle filter can find the most probable location of the target in the particle space. The trajectory computation is done at the base-station post facto, and the speed information is already available by then. In the y dimension, the cost function serves to eliminate changes in the trajectory of the target that are unrealistic. Most passenger cars are about 12' to 16' long. Given that the distance between our sensors is about a car length or less, the amount of variance in the sensor-to-target distance reported by the sensor is limited. The cost function we use in the y dimension, therefore, is weighted to limit such unrealistic variations. At the same time, we do not want the filtering to miss actual variances in target trajectories (which is the whole point of this exercise).

Detecting Aberrant Behavior. Sudden changes in the speed of vehicles typically indicate potentially unsafe physical situations on the roadway. By calculating speed between every pair of sensor nodes in the array, we can get the speed of the moving vehicle in different regions of the work zone. Normally, one would observe a uniform speed, or a gradual increase or decrease of speed. Sudden fluctuations (e.g., a 10% change within 20 feet) are triggers to flag a vehicle as moving in an aberrant fashion. The number of such instances is recorded, along with where in the work zone they occurred. By examining this data, deductions can be made about potential safety hazards in work zone design.

Publishing Data for Wide Access. On a typical day, there are tens, even hundreds, of short-term work zones that are active. Our partner, AWP, alone deploys a number


Fig. 7. Comparing the actual path of a target across the sensor array and the trajectories we computed, for (a) Path 1 and (b) Path 2. For each path we tested the three versions of trajectory mapping (raw sensor readings, 1D particle filter, and 2D particle filter) to approximate the target's path.

of active work zones in the Cleveland area. One of the biggest problems with short-term work zones is that there is typically no record of their existence. In fact, except for motorists that are driving along the street on which the work zone is commissioned, no one even knows about it. While traffic information on major highways in metro areas is already available in mapping services such as Google Maps [10] and Microsoft Live Maps [18], traffic delays caused by short-term work zones are not reported. Our base-station uploads synthesized traffic data to the internet (Microsoft's SensorMap [21]).
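Putting the pieces of Section 3.2 together, the base-station's trajectory reconstruction can be sketched as follows. For simplicity, this sketch replaces the 2-d particle filter with a plain variance-limiting clamp on lateral movement (the role the y-dimension cost function plays); the node spacing, input format, and clamp bound are assumptions for illustration, not the authors' implementation.

```python
# Sketch: reconstructing a single vehicle's trajectory at the base-station.
# `readings` uses a hypothetical format: (node_rank, distance_ft) pairs in
# detection order, where node_rank comes from the training-phase ordering
# and distance_ft is the ranging sensor's sensor-to-target measurement.

NODE_SPACING_FT = 10.0  # typical cone spacing in these deployments
MAX_STEP_FT = 1.0       # assumed bound on lateral change per node spacing

def trajectory(readings):
    # nodes lie along the x-axis; the range reading is the lateral (y) offset
    points = [(rank * NODE_SPACING_FT, d) for rank, d in readings]
    smoothed = [points[0]]
    for x, y in points[1:]:
        _, y_prev = smoothed[-1]
        # clamp physically unrealistic lateral jumps (the y cost function's job)
        y = max(y_prev - MAX_STEP_FT, min(y_prev + MAX_STEP_FT, y))
        smoothed.append((x, y))
    return smoothed
```

A single outlier reading (e.g., a ranging glitch of several feet) is pulled back toward the neighboring estimates, much as the y-dimension cost function penalizes variations larger than a car can physically produce between adjacent cones.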

4 Evaluation and Results

4.1 Estimating Vehicle Trajectories

We tested our trajectory mapping algorithms in a testbed of eight nodes deployed in a parking lot. The nodes were placed ten feet apart from each other, in a straight line. One of the motes acted as the root of the collection routing structure, and communicated with a PC acting as the base-station. We drove a car in a pre-determined path as our target moving through the sensor array. Each sensor took 10 samples/sec. This sampling rate was sufficient to capture the target moving through the array, based upon the speed of the target moving across the array and the inter-node distance. Figure 7 shows the results of our experiments with two paths. For each of the paths, four curves are shown. One of these is the actual path traveled by the target vehicle. The first calculated curve simply takes the sensor readings directly. These readings, based on our sensor calibration tests, may be off by up to one foot from the actual path. The second calculated curve uses a 1-d particle filter (along the y-axis) to better approximate the reading, and to compensate for sensor calibration error margins. The final curve is the one calculated using the 2-d particle filter. As one can see for both paths we tested, the accuracy of the computed path improves as we move from plain sensor calibration, to 1-d particle filtering, and finally to 2-d particle filtering.

4.2 Deployment Experiences

We deployed our sensornet system on work zones commissioned by Area Wide Protective (AWP), a flagging company in Northeast Ohio. The company provides road work zone services to a number of utility companies in the area. When a utility company

Fig. 8. Some sample statistics we are able to collect using our sensornet: (a) average vehicle speeds in different work zone segments; (b) cumulative time of traversal for vehicles driving across the work zone; (c) number of cars that changed speed by over 5 mph in adjacent 10 ft segments; (d) flow and density of traffic during the deployment

(gas, electric, cable) has to perform maintenance work that may cause traffic restrictions, AWP sets up a work zone for them to ensure safe operation. In this section, we report data we collected from one of these work zones in the Greater Cleveland area3 . The location of the work zone was on Lorain Road near the intersection with Clague Road in North Olmsted. This is a pretty busy road, and in one hour during our deployment, we observed 614 cars pass through the work zone. This work zone was about one hundred feet long, and occupied one lane of the street. The work area was in one of the drive lanes, and was about 20 feet long. The street had two drive lanes in either direction, and a turn lane in the middle. The work zone guided the traffic to merge from two lanes into one. We deployed our sensors to monitor traffic in the lane that carried the merging traffic. We videotaped the traffic during the deployment to compare with the data produced by the network to establish ground truth using traditional methods of traffic analysis. Average speed of vehicles. The speed limit on Lorain Road is 35 mph, and there was no reduction in speed limit caused by the work zone. There was a traffic light about 500 feet downstream from our work zone, and this caused some slowdowns and some stopped traffic as well. The average speed of vehicles driving through our work zone was about 12 mph. Further, we measured speeds between every pair of nodes, i.e., average speed in every 10-foot segment in the work zone. These speeds are shown in Figure 8a. Notice how the average speed of vehicles is lower immediately upon entry into the work zone, and just before exiting the work zone. Near the middle of the work zone, motorists generally tend to be “more confident,” and hence tend to speed up a little. In spite of the speed limit being 35 mph, we didn’t actually observe any vehicles traveling as fast. 
This was mostly because of the density of traffic, which was “bumper-to-bumper” for most of the time the work zone was active. As another measure of how fast vehicles are moving through the work zone, we show cumulative time-location plots of vehicles in Figure 8b. Looking at this figure, we can see that most cars spend about 10–20 seconds in the work zone, while a small number of them spend longer. Changes in speed. Sudden changes in vehicle speeds is another point of interest for work zone designers. If a number of vehicles suddenly changed speed particular spot in 3

³ The data collected from two other test deployments are similar in kind. The complete collection of datasets is available at http://selab.csuohio.edu/dsnrg

226

M. Bathula et al.

[Fig. 9: Trajectories of vehicles in the work zone. (a) The trajectories of 50 cars out of our total set of 614; note that these are traces of the side of the cars facing the sensors. (b) The white region in the middle denotes the average trajectory; the shaded regions denote the extent of deviation from the average. (c) Number of cars that went off the average trajectory, as detected by each node (Node IDs 0–9) in the work zone.]

the work zone, that spot merits special consideration. Figure 8c shows the number of cars that changed speed by more than 5 mph between adjacent 10 ft segments. Note the correlation between this graph and the graph in Figure 8a: a number of cars speed up in the second segment, resulting in a higher average speed in the middle of the work zone, and near the end of the work zone a number of cars reduce speed just before exiting.

Rate of flow and density of traffic. Figure 8d shows the rate of flow of traffic and the traffic density during the hour of data capture. As we can see, for most of the time the work zone had a fair number of cars driving through it; there are only very short intervals when the flow rate was less than 5 cars per minute. This is a good way to validate our sensor sampling rate: even in dense traffic, our sensornet is able to produce good data. As noted earlier, we videotaped the traffic during this time and compared it with the data collected from the sensornet; the two were very well correlated.

Vehicle trajectories. Using the trajectory mapping scheme described in Section 4.1, we calculate trajectories of the vehicles driving past our sensor array. During our deployment case study, we observed a majority of vehicles maintaining a steady path through the work zone. The average trajectory was about 4 feet from the side of the lane (Figure 9). The lane is 12 feet wide and the average car is about 6 feet wide, so a car driving perfectly in the middle of the lane would be 3 feet from either edge. The tendency of most drivers, when they see safety cones or other construction equipment, is to steer away from them and favor the opposite edge of the lane. This anecdotal tendency is confirmed in our case study, where the average trajectory is about a foot farther from the safety cones than the centerline of the lane.
However, a number of cars did veer off the average trajectory: some came too close to the safety cones, while others drove too close to the opposite curb. Figure 9c shows the number of cars that veered too close to the safety cones, as measured at each sensor node. These instances are of interest to work zone designers: an inordinate number of vehicles leaving the preferred trajectory at a single spot may indicate a potentially unsafe situation. In our case, there was no such unusual observation, indicating that the traffic in this work zone was mostly compliant.
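The per-node deviation counts of Figure 9c can be sketched as a simple threshold check against the average trajectory. This is an illustrative assumption, not the paper's algorithm: the 2-ft threshold, data layout, and names are hypothetical.

```python
# Hypothetical sketch: count, per node, how many vehicles strayed from the
# average trajectory by more than a threshold (as in Fig. 9c).

def deviation_counts(trajectories, average, threshold_ft=2.0):
    """trajectories: per-vehicle lists of lateral offsets (ft), one value per node;
    average: the average trajectory (ft) at each node.
    Returns, per node, how many vehicles strayed beyond the threshold there."""
    counts = [0] * len(average)
    for traj in trajectories:
        for i, (x, avg) in enumerate(zip(traj, average)):
            if abs(x - avg) > threshold_ft:
                counts[i] += 1
    return counts
```

A spike in one node's count would flag the kind of localized unsafe spot discussed above.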

Sensornets for Measuring Traffic in Short-Term Work Zones

227

5 Discussion

Throughout this research, during design and development, we worked closely with transportation engineers from the CSU UTC to make sure that we met our design guidelines. Comparing with the list in Section 2, our system meets them quite well:

1. The nodes are programmed to collect traffic statistics, and the base station is programmed to synthesize this data for publication on the Internet. Deploying the network simply entails mounting the nodes onto safety cones and turning them on.
2. Our prototype node is built from off-the-shelf parts, and the cost of all parts in the node adds up to about $220. We expect to cut this cost roughly in half with mass fabrication. In comparison, most sensor equipment deployed in long-term work zones runs thousands of dollars per node.
3. The only regular maintenance our sensor nodes need is keeping the battery charged. We have conveniently exposed the USB connector of the TelosB mote for this purpose: simply plugging in the mote charges the battery.
4. Our network does not require a pre-determined deployment. Instead, we built our self-organization logic around the practices that work zones follow. Accordingly, we know that the nodes will be placed at uniform distances from each other, and this distance is provided to the network as a parameter.

As an alternative to the infrared sensor, we are currently looking into other ranging sensors that are similarly inexpensive and easy to use. As of this writing, we are experimenting with the SRF02 ultrasonic range finder [6]. This sensor has a range (6 m) similar to that of our Sharp infrared range finder. It connects to the TelosB mote through the I²C interface and directly provides a distance reading (in cm) based on the obstruction in front of the sensor.
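The SRF02 reports its result as a 16-bit value split across two I²C result registers. A minimal decoding helper is sketched below; the register layout (high byte in register 2, low byte in register 3, command 0x51 for a ranging in cm) follows the vendor datasheet, but treat it as an assumption to verify against your sensor revision.

```python
# Sketch of decoding an SRF02 reading. Register layout per the vendor
# datasheet (assumed here): write command 0x51 to register 0 to start a
# ranging burst in centimeters, then read the result high/low bytes.

RANGE_CM_CMD = 0x51  # "real ranging, result in cm" command byte

def decode_range_cm(high_byte, low_byte):
    """Combine the two big-endian result registers into a distance in cm."""
    return (high_byte << 8) | low_byte
```

For example, result bytes 0x01 and 0x2C decode to a 300 cm reading, i.e., an obstruction 3 m in front of the sensor.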
The sampling time of this sensor, however, is twice that of the IR sensor, which may cause timing issues when capturing traffic: the sensor may miss some vehicles because of the reduced sampling rate.

While writing this paper, we came across advice for successful sensornet deployments [2]. We were pleasantly surprised to find that we had already followed a number of the good practices listed there. For example, from the beginning we worked with domain specialists from transportation engineering to define the problem and to explore the solution space; we always trusted experimentation with real hardware over simulation (we used a smaller version of the IR sensor for early lab tests with toy cars); and our protocols are as simple as they can be, making system behavior very predictable.

6 Related Work

In the past, most traffic monitoring systems were built with high-cost equipment such as measuring poles, inductive loops, etc. [25]. Moreover, these deployments are invasive, requiring labor-intensive activities such as digging up the road. In our system, sensor nodes can be deployed with minimal engineering effort (at most, placing them on either side of the road or along a straight line on one side), and compared to deployments involving inductive loops, sensornets come at a substantially lower cost.


Indeed, others have used sensornets in the traffic monitoring context. In [5] and [16], the authors deployed sensors at freeway intersections and in parking lots. [11] describes wireless magnetic sensors that can be used for traffic classification and surveillance; these sensors are designed to identify vehicles, vehicle speeds, road conditions, traffic density, etc. Our problem differs in that, beyond monitoring traffic, we are interested in the trajectories of vehicles, especially in work zones, so that traffic authorities can learn about "near-crashes" that are otherwise nearly impossible to discover. There are also mechanisms that deal directly with the driver rather than the vehicle [20]; these systems simulate traffic and may be subject to errors. In [29], Yoon et al. show how to estimate street traffic using GPS traces: cars are equipped with GPS receivers, and the traces from these receivers are used to analyze traffic patterns. The Nericell [19] system similarly uses sensors in moving vehicles. As opposed to [29] and [14], however, its focus is on sensors that people carry with them anyway: smart cell phones (which have a number of sensors such as GPS, microphone, and accelerometer) are used to derive vehicle traces. With this heterogeneous sample of traces, they can identify potholes, distinguish traffic stopped at a red light from traffic stopped in a jam, etc. All these systems are complementary to our work, since they involve embedding sensors in moving vehicles. The California Department of Transportation [3] maintains a website with live feeds from a number of sensors across the state, along with a wealth of studies and reports focused on monitoring traffic.
The Ohio Department of Transportation (ODOT) maintains a similarly rich web-accessible system called Buckeye Traffic [22], which provides current information about road closures and restrictions on major highways due to construction projects, and reports road activity from a variety of permanent sensors all over Ohio. However, no short-term work zones are captured; it is not feasible to deploy permanent sensors on every street.

7 Conclusion

Construction work zones on roadways are hazardous areas. Motorists driving through a roadway under construction may face unexpected scenarios. Long-term work zones on major highways may present such unfamiliarity in the beginning, but once a motorist has driven on the modified road a few times, she can get used to the changes (which will last a few weeks, if not months). By contrast, short-term work zones —the kind we see on our local city streets for utility work— are only active for a few hours at a time. This transient nature leaves them largely untracked; in fact, there is very little empirical data available about traffic in short-term work zones. We have presented the design and prototype implementation of a sensornet system specifically targeted at collecting data about traffic in and around short-term work zones. Our system is rapidly deployable, easily maintainable, and capable of capturing a variety of statistics about vehicle traffic in work zones. The data collected can be used by transportation engineers to inform design parameters for future work zone configurations. We have tested our system in live work zones in the Cleveland area, and are now working on expanding to wider deployment.


References

1. Awad, A., Frunzke, T., Dressler, F.: Adaptive distance estimation and localization in WSN using RSSI measures. In: DSD 2007, pp. 471–478. IEEE, Los Alamitos (2007)
2. Barrenetxea, G., Ingelrest, F., Schaefer, G., Vetterli, M.: The hitchhiker's guide to successful wireless sensor network deployments. In: SenSys 2008 (November 2008)
3. California Center for Innovative Transportation: Traffic surveillance, http://www.calccit.org/itsdecision/serv_and_tech/Traffic_Surveillance/surveillance_overview.html
4. Chebrolu, K., Raman, B., Mishra, N., Valiveti, P.K., Kumar, R.: BriMon: a sensor network system for railway bridge monitoring. In: MobiSys 2008, pp. 2–14. ACM, New York (2008)
5. Cheung, S.-Y., Varaiya, P.: Traffic surveillance by wireless sensor networks: Final report. Technical Report UCB-ITS-PRR-2007-4, University of California, Berkeley (2007)
6. Devantech: SRF02 (2008), http://www.robot-electronics.co.uk/htm/srf02techI2C.htm
7. Doucet, A., Godsill, S., Andrieu, C.: On sequential Monte Carlo sampling methods for Bayesian filtering. Statistics and Computing 10(3), 197–208 (2000)
8. Duffy, S.F.: CSU Transportation Center (2007), http://www.csuohio.edu/utc
9. Federal Highway Administration: Manual on Uniform Traffic Control Devices. U.S. Dept. of Transportation, Washington D.C., 2003 edition with revisions 1 and 2 (December 2007)
10. Google: Google Maps (2008), http://maps.google.com
11. Haoui, A., Kavaler, R., Varaiya, P.: Wireless magnetic sensors for traffic surveillance. Transportation Research Part C: Emerging Technologies 16(3), 294–306 (2008)
12. Herman, T., Zhang, C.: Stabilizing clock synchronization for wireless sensor networks. In: Datta, A.K., Gradinariu, M. (eds.) SSS 2006. LNCS, vol. 4280, pp. 335–349. Springer, Heidelberg (2006)
13. Holland, M.M., Aures, R.G., Heinzelman, W.B.: Experimental investigation of radio performance in wireless sensor networks. In: SECON 2006 (2006)
14. Hull, B., et al.: CarTel: a distributed mobile sensor computing system. In: SenSys 2006, pp. 125–138. ACM Press, New York (2006)
15. Kim, S., et al.: Flush: a reliable bulk transport protocol for multihop wireless networks. In: SenSys 2007, pp. 351–365. ACM Press, New York (2007)
16. Knaian, A.: A wireless sensor network for smart roadbeds and intelligent transportation systems. Technical report, MIT EECS and the MIT Media Laboratory (May 2000)
17. Levis, P.: TinyOS TEP 111 (2008), http://www.tinyos.net/tinyos-2.1.0/doc/html/tep111.html
18. Microsoft: Microsoft Live Maps (2008), http://maps.live.com
19. Mohan, P., Padmanabhan, V., Ramjee, R.: Nericell: Rich monitoring of road and traffic conditions using mobile smartphones. In: SenSys 2008 (November 2008)
20. Nadeem, T., Dashtinezhad, S., Liao, C.: TrafficView: A scalable traffic monitoring system. In: 2004 IEEE International Conference on Mobile Data Management, pp. 13–26 (2004)
21. Nath, S., Liu, J., Miller, J., Zhao, F., Santanche, A.: SensorMap: a web site for sensors worldwide. In: SenSys 2006, pp. 373–374. ACM Press, New York (2006)
22. Ohio Dept. of Transportation: Buckeye Traffic (2008), http://www.buckeye-traffic.org
23. Polastre, J.: Telos charger board (2005), http://www.tinyos.net/hardware/telos/sensorboards/UCB_TSB+TCB_schematic-2005-07-31.pdf


24. Polastre, J., Szewczyk, R., Culler, D.: Telos: enabling ultra-low power wireless research. In: IPSN 2005, Piscataway, NJ, USA. IEEE Press, Los Alamitos (2005)
25. Santel, G.: Traffic flow and accident occurrence in construction zones on freeways. In: 6th Swiss Transport Research Conference (March 2006)
26. Sharp: GP2Y0A700K0F, http://www.acroname.com/robotics/parts/R302-GP2Y0A700K0F.pdf
27. Singh, J., Madhow, U., Kumar, R., Suri, S., Cagley, R.: Tracking multiple targets using binary proximity sensors. In: IPSN 2007, pp. 529–538. ACM Press, New York (2007)
28. Srinivasan, K., Levis, P.: RSSI is under-appreciated. In: EmNets 2006 (2006)
29. Yoon, J., Noble, B., Liu, M.: Surface street traffic estimation. In: MobiSys 2007, pp. 220–232. ACM Press, New York (2007)

Empirical Evaluation of Wireless Underground-to-Underground Communication in Wireless Underground Sensor Networks

Agnelo R. Silva and Mehmet C. Vuran

Department of Computer Science and Engineering, University of Nebraska-Lincoln, Lincoln, NE 68588, USA
Tel.: (402) 472-5019; Fax: (402) 472-7767
{asilva,mcvuran}@cse.unl.edu

Abstract. Many applications for irrigation management and environment monitoring exploit buried sensors that are wired to the soil surface for information retrieval. Wireless Underground Sensor Networks (WUSNs) are an emerging area of research that promises to provide communication capabilities to these sensors. To accomplish this, a reliable wireless underground communication channel is necessary, allowing direct communication between buried sensors without the help of an aboveground device. However, the significantly high attenuation caused by soil is the main challenge for the feasibility of WUSNs. Recent theoretical results highlight the potential for lower attenuation rates at lower radio frequencies. In this work, experimental measurements at 433MHz are presented, which show good agreement with the theoretical studies. We observe that (a) decreasing the frequency of the wireless signal lowers the soil attenuation rate, (b) the wireless underground communication channel presents a high level of temporal stability, and (c) the volumetric water content (VWC) of the soil is the most important factor adversely affecting communication. The results show the potential feasibility of WUSNs with the use of powerful RF transceivers at lower frequencies (e.g., the 300-500MHz band). We also propose a classification for wireless underground communication, defining and showing the differences between Subsoil and Topsoil WUSNs. To the best of our knowledge, this is the first work that reports experimental results for underground-to-underground communication using commodity sensor motes.

1 Introduction

Wireless Underground Sensor Networks (WUSNs), which consist of wireless sensors buried underground, are a natural extension of the wireless sensor network phenomenon and have been considered as a potential field that will enable a wide 

This work is supported by the UNL Research Council Maude Hammond Fling Faculty Research Fellowship. The authors would like to thank Emily Casper and the UNL Landscaping Services staff for their valuable help during the experiments.

B. Krishnamachari et al. (Eds.): DCOSS 2009, LNCS 5516, pp. 231–244, 2009. © Springer-Verlag Berlin Heidelberg 2009

232

A.R. Silva and M.C. Vuran

variety of novel applications that were not possible before [1]. The realization of wireless underground communication and networking techniques will lead to potential applications in the fields of intelligent irrigation, border patrol, assisted navigation, sports field maintenance, intruder detection, and infrastructure monitoring. This is possible by exploiting real-time soil condition information from a network of underground sensors and enabling localized interaction with the soil. In this paper, we focus on a promising application, where WUSNs provide real-time soil condition information for intelligent irrigation and help maintain fields more efficiently according to soil quality. As a result, the cost of maintaining a crop field can be significantly reduced through autonomously operating underground sensors. Irrigation management is an underground application that has been deployed for more than 20 years [5]. In general, soil moisture sensors are buried at a 30-120cm depth [5], [6] and wired to an extension above the surface, which can be used for (1) manual collection of data, e.g., a person with a datalogger moves from sensor to sensor to download the data, or (2) connection to a micro-controller that sends the readings to a datalogger node via a wireless channel. The collected data is then used to assess the irrigation requirements of the field. The existing techniques, however, lack real-time information retrieval capabilities and are obtrusive to agricultural tasks on the field. A wireless underground sensor network has the potential to reduce water application on agricultural fields by measuring soil moisture status to enable better informed irrigation timing decisions without obstructing field operations. Despite its potential advantages, the realization of WUSNs is challenging, and several open research problems exist.
The main challenge is the realization of efficient and reliable underground wireless communication between buried sensors. Underground communication is one of the few fields where the environment has a significant and direct impact on communication performance. More specifically, changes in temperature, weather, soil moisture, soil composition, and depth directly impact connectivity and communication success in underground settings. Hence, characterization of the wireless underground channel is essential for the development of communication protocols for WUSNs. In this paper, the results of field experiments for underground communication at 433MHz using commodity sensor nodes are presented, and lessons learned from these experiments for the design of efficient communication protocols for WUSNs are discussed. The results of the field experiments show good agreement with the theoretical results of [9] and confirm that the wireless underground channel (a) exhibits two-path behavior at low burial depths, (b) presents a high degree of temporal stability compared to its through-the-air counterpart, and (c) is adversely affected by the volumetric water content (VWC) of the soil. Finally, the results show the potential feasibility of WUSNs, especially with the use of more powerful RF transceivers at lower frequencies, e.g., the 300-500MHz band.
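Theoretical soil channel models of the kind cited above typically express the added loss through the attenuation constant α and phase constant β of the soil medium, which in turn derive from the soil's complex dielectric constant (and hence its VWC). The sketch below is a generic illustration of that structure; the constants and functional form are assumptions for exposition, not the model of [9].

```python
import math

# Illustrative sketch of a soil path-loss expression: loss grows with distance
# both logarithmically (spreading) and linearly (absorption via alpha).
# The functional form and constants are generic assumptions, not the model of [9].

def soil_path_loss_db(d_m, alpha, beta):
    """d_m: path length through soil (m); alpha: attenuation constant (1/m);
    beta: phase constant (rad/m). Returns the added soil loss in dB."""
    return 6.4 + 20 * math.log10(d_m) + 20 * math.log10(beta) + 8.69 * alpha * d_m
```

The linear α·d term dominates at the distances of interest, which is why a wetter soil (higher α through the dielectric constant) or a higher frequency so sharply shortens the feasible inter-node distance.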

Evaluation of Wireless Underground-to-Underground Communication

233

Fig. 1. Classification of wireless underground communication networks (WUCNs)

The rest of this paper is organized as follows: In Section 2, an overview of wireless underground communication networks (WUCNs) is provided, along with a classification of these networks and the related work. In Section 3, the testbed architecture for the experiments and the experimental methodology are described. The experiment results for underground-to-underground communication are presented in Section 4. Finally, the lessons from the experiments and future work are discussed in Section 5.

2 Background and Related Work

Wireless Underground Communication Networks (WUCNs) have been investigated in many contexts recently. Although this is a novel area, a detailed classification of these networks is necessary, since several different scenarios with very specific issues are presented under the title wireless underground communication or, sometimes, WUSNs. In [1], two possible topologies for WUSNs are presented: the underground topology, where the majority of the nodes are buried, and the hybrid topology, where buried nodes coexist with some nodes deployed above ground. Based on this classification, we provide a detailed classification of WUCNs and present related work in this area.

2.1 Classification of Wireless Underground Communication Networks

As shown in Fig. 1, WUCNs can be classified into two main types: wireless communication networks for mines and tunnels, and wireless underground sensor networks (WUSNs). Given this initial classification, it is important to note that several existing solutions focus on underground communication in mines and/or tunnels [4], [7], [10], [13]. In these works, although the network is located underground, the communication takes place through the air, i.e., through underground voids. In this paper, however, we consider WUSNs, where sensor nodes are buried underground and communicate through soil.

234

A.R. Silva and M.C. Vuran

Although the sensors may be buried at different regions of the soil, WUSNs can also be classified into two types based on the burial depth of the sensors. Recent research on agriculture, environment monitoring, and security mainly focuses on the soil subsurface, which is defined as the top few meters of the soil. The soil subsurface is divided into two regions [8]: (a) the topsoil region, which refers to the first 30cm of soil or the root growth layer, whichever is shallower, and (b) the subsoil region, which refers to the region below the topsoil, i.e., usually the 30-100cm region. Accordingly, as shown in Fig. 1, soil subsurface WUSNs can be classified as a function of the deployment region: a Topsoil WUSN, if the WUSN is deployed in the topsoil region, or a Subsoil WUSN, if deployed in the subsoil region. These networks are further classified as Hybrid WUSNs if they include nodes deployed above the ground and the communication depends heavily on those aboveground nodes. For Topsoil and Subsoil WUSNs, the majority of the communication flows in the underground-to-aboveground direction.

2.2 Related Work

The concept of WUSNs and the challenges related to the underground wireless channel were introduced in [1], which also highlights the extreme path loss caused by soil attenuation and water content. However, that analysis is limited to the 1.4-3GHz RF range. In [9], [2], we developed a theoretical model for wireless underground communication and provided simulation results for the 300-900MHz RF range; experimental results, however, were not provided. To date, very few WUSN experiments are found in the literature. Experiment results in the 2.4GHz band are reported in [12], where underground-to-underground communication is shown to be infeasible at this range; instead, results for underground-to-aboveground communication and vice versa are provided. The burial depths for the presented experiments are 6cm and 13cm, with a transmit power of 0dBm. Even at these small burial depths, the absence of results for underground-to-underground experiments points to the challenge of soil attenuation in the 2.4GHz band. In [16], experiment results in the 2.4GHz band are also reported, using a burial depth of 9cm, a transmit power of +19dBm, and a directional antenna with a gain of 10dB. The experiments presented results related only to underground-to-aboveground and aboveground-to-underground communication. Experiments in the 869MHz band are described in [14], where underground-to-aboveground communication is considered using a buried transmitter and an aboveground directional antenna. Inter-node distances of more than 30m are reported at various depths (10-40cm), and different VWC levels are tested for different soil textures. The results of this work highlight that using a lower frequency (869MHz) compared to 2.4GHz can yield lower soil attenuation and longer inter-node distances.
The study focuses mainly on the metrics related to the efficiency of the customized directional antenna and the transmitter.


Fig. 2. (a) Outdoor environment of the experiments. (b) Symbols used for distances in this document.

It can be observed that the experiments in [12,16,14,19] focus on the Topsoil WUSN scenario, where underground-to-aboveground communication is considered. Some recent commercial products for golf course irrigation management [19] use a similar approach [16,14], where the node is buried very close to the surface, i.e., 5-15cm, a transmit power of ≥10dBm is used, and underground-to-aboveground communication is employed. Despite the potential applications of the existing work, underground-to-aboveground communication is not suitable for irrigation management. First, large crop fields prevent the use of direct communication, which has limited range. Moreover, frequent activities such as plowing require non-obstructive approaches, where aboveground relays are not feasible. Furthermore, plowing and similar mechanical activities occur exactly in the topsoil region, i.e., 0-30cm, where the soil composition is continuously disturbed. This requires burial at greater depths, in the root range of crops in the subsoil region, i.e., 40-100cm. These constraints call for Subsoil WUSNs, where multi-hop communication is performed under the ground. To the best of our knowledge, however, underground-to-underground communication has not previously been evaluated through experiments. In this work, we present the first experimental results that focus mainly on Subsoil WUSNs, and present guidelines for the design of communication protocols for underground-to-underground communication. Certain results related to Topsoil WUSNs are also presented.

3 Experiment Setup

The underground experiments were carried out on the University of Nebraska-Lincoln City Campus, on a field provided by the UNL Landscaping Services, during the August-November 2008 period. The soil texture of the experiment site, determined by laboratory analysis [20], is shown in Table 1. For the experiments, MICA2 nodes that operate at 433MHz are used [18]. This frequency


Table 1. Soil Analysis Report

Sample Depth  Organic Matter  Texture    %Sand  %Silt  %Clay
0-15cm        6.4             Loam       27     45     28
15-30cm       2.6             Clay Loam  31     40     29
30-45cm       1.5             Clay Loam  35     35     30

range has been theoretically shown to exhibit better propagation characteristics in [9]. The underground experiments were performed by digging 10 holes of 8cm diameter, with depths varying from 70 to 100cm, using an auger. A paper pipe with an attached MICA2 node is inserted into each hole at different depths. The experiment site is shown in Fig. 2(a). For the experiments, a software suite was developed to perform long-duration experiments without frequent access to the underground motes. The software suite enables carrying out several experiments with various parameters without reprogramming the motes, which is a major challenge in underground settings. A Java/TinyOS 1.1x application, called S-GriT (Small Grid Testbed for WSN Experiments), was developed to allow any number of nodes to act as receivers. S-GriT allows configuration of multiple experiments with the following parameters: transmit power level, number of messages for the experiment, number of bytes per message, and delay between the transmission of each message. The nodes assume one of three roles in the S-GriT application: (1) the Manager is used by the operator to configure and start the experiments and to retrieve the results from the receivers; (2) the Sender, which is buried underground, receives configuration information from the Manager via the wireless channel; and (3) the Receiver receives the test messages from the sender and prepares a summary containing the sequence number of each received message and the associated Received Signal Strength Indication (RSSI) level, a measurement reported by the node's transceiver that expresses the received signal strength (RSS). Consequently, the testbed experiments also stand as a proof-of-concept for underground data retrieval using commodity sensor nodes. The experiment setup and the terminology used in representing the results are illustrated in Fig. 2(b), where dbg is the burial depth of the node, dh is the horizontal inter-node distance, and da is the actual inter-node distance. The superscripts s and r are used to indicate sender and receiver. These values, as well as the transmit power, are varied to investigate the PER and RSS values of underground communication. The experiments are conducted for four values of transmit power: -3dBm, 0dBm, +5dBm, and +10dBm. 30-byte packets are used, with 100ms between each packet. Each experiment in this work is based on a set of 3 experiments with 350 messages or 2 experiments with 500 messages, resulting in a total of about 1000 packets. The number of packets correctly received by one or more receiver nodes is recorded along with the signal strength for each packet. Accordingly, the packet error rate (PER) and the RSS level from each receiver are


collected. To prevent the effects of hardware failures of individual MICA2 nodes, qualification tests were performed before each experiment: through-the-air tests, consisting of 200 packets of 30 bytes, are performed to (1) determine compliant nodes and (2) confirm that the battery level of a node is above a safe limit. A node is labeled compliant with a given set of nodes if (1) its PER is within 10% of the average PER calculated for the set and (2) its average RSS is within +/- 1dBm of the average RSS for the set. The safe limit for the battery level was determined to be 2.5V. We observed that, in general, only about 50% of the 11 nodes used qualified for each experiment.
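The PER bookkeeping and compliance rules above can be sketched as follows. This is a hypothetical illustration: the paper does not give code, and the "within 10%" PER band is interpreted here as an absolute band of 0.10, which is an assumption.

```python
# Sketch of the described bookkeeping: PER from received sequence numbers,
# and the node-qualification rules (PER within 10% of the set average,
# mean RSS within +/-1 dBm of the set average). Illustrative assumptions only.

def per(received_seqnos, sent_count):
    """Packet error rate from the set of distinct sequence numbers received."""
    return 1.0 - len(set(received_seqnos)) / sent_count

def qualified(node_per, node_rss, set_pers, set_rsss):
    """Compliance check against the averages of the node set.
    The 10% PER band is treated as absolute (an assumption)."""
    avg_per = sum(set_pers) / len(set_pers)
    avg_rss = sum(set_rsss) / len(set_rsss)
    return abs(node_per - avg_per) <= 0.10 and abs(node_rss - avg_rss) <= 1.0
```

Deduplicating by sequence number means retransmitted or duplicated packets do not artificially lower the PER.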

4 Experiment Results

The results are presented in terms of how several important parameters affect wireless underground communication: the antenna orientation, the burial depth, the inter-node distance, and the soil moisture. Moreover, the temporal characteristics of the wireless underground communication channel are discussed.

4.1 Antenna Orientation

Fig. 3. The schema used to test the effects of the antenna orientation for the underground-to-underground communication: (a) relative angles for the antenna; (b) PER vs. relative angle of the receiver's antenna.

Antenna orientation experiments were performed by placing a sender and a receiver at different angles, as shown in Fig. 3(a), to provide guidelines for node deployment. The antenna of the MICA2 is a standard quarter-wavelength monopole antenna of 17cm length, whose radiation pattern is not a perfect sphere and matches the dipole antenna model presented in [11]. The experiments were performed at a depth of dbg = 40cm and at a distance of da = dh = 100cm between the sender and the receiver. In Fig. 3(b), the packet error rate (PER)

238

A.R. Silva and M.C. Vuran

Fig. 4. Effect of the reflected path from the ground surface; sender buried at dsbg = 40cm, horizontal inter-node distance dh = 50cm, receiver depth varying from 0 to 80cm: (a) received signal strength vs. depth of the receiver (drbg) and actual inter-node distance (da); (b) packet error rate vs. depth of the receiver (drbg) and actual inter-node distance (da). Each curve corresponds to one transmit power level (+10dBm, +5dBm, 0dBm, -3dBm).

is shown as a function of the node orientation. It can be observed that when the relative angle varies from 90° to 340°, the PER increases and the orientation of a node has a significant impact on the communication success. If the antenna orientation is between 120° and 300°, virtually no communication is possible. Hence, during the remaining experiments, only the 0° orientation is used to eliminate the effect of antenna orientation. This result shows that the antenna orientation is an additional constraint to be considered for deployment of WUSNs, compared to traditional WSNs, especially for multi-hop underground networks, where communication range varies based on the antenna orientation.

4.2 Effects of Burial Depth

In this section, we discuss the effects of burial depth on the signal strength and PER. Accordingly, the horizontal inter-node distance between the sender and the receiver is fixed (dh = 50cm), the burial depth of the sender is also fixed (dsbg = 40cm), and the depth of the receiver is varied from 10 to 100cm using different transmit power levels. In Fig. 4(a) and 4(b), the RSS and PER values are shown, respectively, as a function of the receiver depth. The actual distance, da, between the sender and the receiver is also indicated in parentheses on the x-axis. Each line in the figures shows the results for a different transmit power level. In Fig. 4(a), the variance of the RSS is also shown along with the average values for each point. As shown in Fig. 4(a), an increase in the actual inter-node distance, da, decreases the signal strength, as expected. The highest signal strength corresponds to a receiver depth of 30-40cm, and the signal strength gradually decreases if the receiver burial depth is shallower than 30cm or deeper than 40cm. One exception to this case is drbg = 0cm, where the signal rays from above the ground impact


the received signal strength positively and increase the RSS for each transmit power level. An important observation is the significant difference of RSS values at the same inter-node distance but at different burial depths. As an example, an additional attenuation of 20dB is observed for the same inter-node distance of da = 58cm when the receiver is buried at 70cm, compared to the burial depth of 10cm. This behavior occurs mainly due to the reflection of RF signals from the soil surface, which positively affects the RSS when nodes are buried closer to the surface. This result validates the two-path channel model for the wireless underground channel proposed in [9,2]. It can be observed in Fig. 4(b) that, for the receiver burial depth of 70cm, the PER increases

Fairness. If the goal is only to minimize the sum of disk radii, as a proxy for total sensor energy or cost, an optimal solution might assign a small radius to one sensor and a large radius to another. Equivalently, one disk could get wiggle radius 1 and another 0,¹

¹ Note that a wiggle radius of 1 is the extreme case in which the sensor can be placed anywhere.

Cheap or Flexible Sensor Coverage

247

Table 1. Summary of problem settings, many of which interrelate. In some settings, hardness and easiness depend on the strength of the general-position or λ-precision [10] assumptions used.

          Disc1D    Disc2D        Ctn1D     Ctn2D
SumRad    nm        in P [6]      FPTAS     FPTAS
SumWig    nm + n²   NP-hard/PTAS            NP-hard
FairRad   (nm)²     NP-hard       n log n   in P
FairWig   (nm)²     NP-hard       n log n   in P

which is clearly unacceptable given the motivation of wiggle room. (Similar motivations would apply if the radii in the sum-of-radii (SumRad) problems were interpreted as individual costs that must be contributed to an agreed-upon joint project.) One way to partially alleviate this problem would be to assume larger disks. We assume throughout the paper that, with zero wiggle room assigned, the unit disks together form a feasible problem solution. We might make the stronger assumption that even if all disks are assigned wiggle room of, say, .2, full coverage would still be provided. We could then seek to maximize the allotment of the remaining wiggle room, taking .8 as the radius bound. (A similar assumption is made for the continuous summation (CtnSum) setting in Section 3.3.)

We also consider fairness as a direct goal. A natural objective in this vein is maximizing the minimum wiggle radius. A generalization of this idea from networking is max-min fairness (MMF) [2]. For a maximization problem, an assignment of values is MMF if raising any value xi would necessarily decrease another value xj such that xj < xi, or equivalently, if the vector of assigned values (xi), ordered by increasing value, is lexicographically maximal. (In Fig. 2, the latter solution is MMF.)

Minimizing the sum of disk radii raised to a superlinear power, Σi ri^α for α > 1, rather than minimizing the simple sum of radii, goes some way towards encouraging fairness. A larger α will more strongly encourage fairness, since, e.g., it makes one large radius more expensive than two small ones, but it is easy to construct examples for any fixed α that result in an unfair assignment. The minimization problem was first shown to be NP-hard for α ≥ 2 [3], and later for α > 1 [1]. The discrete MMF problem, which can be seen here as the limiting case of this problem as α → ∞, i.e., the ℓ∞ norm, can be shown to be NP-hard by modification of the hardness argument in [1].

Explanation of Table 1 and Problem Names. 
Combining the choices of 1-d vs. 2-d, wiggle maximization vs. radius minimization, summation vs. fairness, and discrete vs. continuous, we obtain 16 problem settings (see Table 1). We refer to a given problem setting by an abbreviation such as Disc1DSumRad or Ctn2DFairWig. Many of the problem solutions are related or identical. An abbreviation such as 2DFair omitting some of the four properties is understood to indicate all compatible problem subcases, ranging over all choices of the omitted attributes. All entries in Table 1, apart from Disc2DSumRad, indicate our results. Note that there are several holes in the table: for some settings we know only hardness or an approximation scheme, but not both, and we give no results for Ctn1DSumWig.
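The MMF definition above (the value vector, sorted increasingly, is lexicographically maximal) can be illustrated with a small comparison utility. This is our illustration; the function names are not from the paper.

```python
def leximin_key(values):
    """Sort increasingly; an MMF assignment maximizes this key lexicographically."""
    return sorted(values)

def leximin_better(a, b):
    """True if assignment `a` is leximin-better than `b` (maximization view).
    Python compares lists lexicographically, element by element."""
    return leximin_key(a) > leximin_key(b)
```

For example, the balanced wiggle assignment [0.5, 0.5] is leximin-better than the extreme [1, 0] assignment criticized in the text, since the sorted vectors compare as [0.5, 0.5] > [0, 1].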

248

A. Bar-Noy et al.

Summary of results and techniques. We relate the placement problem for unit disks with fixed uniform wiggle radii for guaranteed coverage to coverage solutions for boolean disks. Given a set of disk sensors, we show for the 2-d setting how to obtain FPTASs for the discrete wiggle radius problem and the continuous sensor radius problem. We also show that the discrete wiggle radius problem is NP-hard. In the 1-d setting, we observe that the wiggle radius problem can be solved optimally by known algorithms, and we give a faster dynamic programming algorithm for the (linear) sensor radius problem. Finally, we consider max-min fairness in the assignment of sensor or wiggle radii in 1-d and 2-d. Although the discrete problem is NP-hard, we give an exact polynomial-time algorithm for the continuous problem, assuming the sensors are in general position. In some cases, the inexact placement problems reduce easily to known problems, while in others the nature of the problem changes significantly.

We study these problems in both 1-d (on the line) and 2-d (in the plane). In the continuous setting, the objective is to cover a continuous, convex region; in the discrete setting, there is a given finite set of points (clients) to cover. The problem instance consists of the locations of n sensors and (in the discrete case) m clients. Several algorithms for the discrete setting (both 1-d and 2-d) involve “pinned” disks. A pinned disk is centered on sensor i, with its radius chosen so that the disk just touches some client j, i.e., ri = d(i,j), where d(i,j) is the separating distance. It is well known that any optimal solution to the min-radius problem will consist of pinned disks. The same applies to the max-wiggle problem, though not all possible pinned disks will exist.
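The pinned-disk fact above can be checked by brute force on tiny instances: searching only radii of the form d(i, j) (or 0) already contains an optimal solution. The instance format and function names below are ours, for illustration; the search is exponential and only meant to demonstrate the idea.

```python
import itertools
import math

def dist(p, q):
    return math.hypot(p[0] - q[0], p[1] - q[1])

def min_sum_radii(sensors, clients):
    """Exhaustive search over pinned radii (r_i = d(i, j) for some client j,
    or 0) for the min-sum-of-radii coverage problem."""
    choices = [[0.0] + sorted(dist(s, c) for c in clients) for s in sensors]
    best = math.inf
    for radii in itertools.product(*choices):
        covered = all(
            any(dist(s, c) <= r for s, r in zip(sensors, radii))
            for c in clients
        )
        if covered:
            best = min(best, sum(radii))
    return best
```

On two sensors at (0,0) and (10,0) with clients at (1,0) and (9,0), the search finds the symmetric solution of two unit disks rather than one large disk.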

2 Related Work

There have been efforts in various communities, e.g., computational geometry, approximation algorithms, and sensor networks, to characterize, find approximate solutions to, and analyze the intrinsic properties of coverage problems involving binary sensors. We briefly refer to some of this work. Covering a large convex region (ignoring edges, or assuming the region of interest is the convex hull of the sensor locations) with a minimal number of disks of radius R is solved optimally by a hexagonal lattice arrangement of disks (see Figure 1.a) [12,16]. [11] discusses the more general concept of a sensor's wiggle region, i.e., the point set of locations for the disk center that are consistent with the global coverage guarantee (see the shaded areas of Figure 1.b), assuming other sensors are constrained by their own wiggle regions, without leaving any point uncovered. In lattice arrangements, wiggle regions will be approximately disk-shaped. Moreover, the motivation for wiggle room is bounded placement error, which is likely to be non-directional or isotropic. We therefore limit ourselves in this paper to wiggle disks.

For the problem of positioning binary unit-disk sensors to cover a large continuous field, it is well known that a hexagonal grid configuration is optimal. Recently there has been some work [7] on positioning sensors whose coverage is probabilistic, based on distance. A related but significantly different problem is to decide which, among a set of positioned sensors, to turn on to cover the region. Hochbaum & Maass [8] gave a well-known PTAS for several disk packing and covering problems in the plane, based on


“grid-shifting”. This technique was extended to Maximum Independent Set and Vertex Cover on non-unit disk graphs by Erlebach et al. [5], in a recursive dynamic programming procedure that uses multiple grids varying in granularity, corresponding to disks of different sizes. The Erlebach et al. technique was used by [14] and [3] for the discrete sensor radius minimization problem, in which the goal is to cover a discrete set of clients in the plane. Optimal dynamic programming algorithms for the 1-d setting have been given by [14,1,13]. [14] gives a dynamic programming (DP) algorithm that solves the problem (and its generalization to α ≥ 1) optimally in O((n + m)³) time, where m = #clients and n = #sensors. Lenchner [13] recently gave a faster O(m² + nm) DP. Several other related problems have been considered; see [1] and references therein. The NP-hard Facility Location problem [15] is similar, with the difference that one pays for the coverage of each client, based on the distance to the covering server, rather than only once for the sensor radius. More closely related are clustering problems such as k-Median, in which the radii of k clusters, together containing all points, are minimized. It was recently shown [6] that the 2-d discrete sensor radius assignment problem (and related clustering problems) can be solved in polynomial time.
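As an aside, the hexagonal-lattice coverage arrangement mentioned above is easy to check numerically: placing disk centers on a triangular lattice with nearest-neighbor spacing √3·R (rows 1.5·R apart, with alternate rows offset by half a column) leaves no point of the plane farther than R from a center, since R is exactly the circumradius of the lattice triangles. The script below is our illustration, not taken from the cited papers.

```python
import math

def hex_lattice(R, width, height):
    """Centers of radius-R disks covering a width x height region:
    triangular lattice, nearest-neighbor spacing sqrt(3)*R, with a
    margin of extra rows/columns around the region boundary."""
    dx, dy = math.sqrt(3) * R, 1.5 * R
    centers, row, y = [], 0, 0.0
    while y <= height + dy:
        offset = (dx / 2) if row % 2 else 0.0
        x = offset - dx
        while x <= width + dx:
            centers.append((x, y))
            x += dx
        y += dy
        row += 1
    return centers

def covered(point, centers, R):
    """True if the point lies within distance R of some center."""
    return any(math.hypot(point[0] - cx, point[1] - cy) <= R + 1e-9
               for cx, cy in centers)
```

Sampling a dense grid of points in the region and asserting coverage gives a quick sanity check of the arrangement.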

3 Optimizing the Sums of Wiggle and Sensor Radii (Sum)

3.1 Optimizing Radii Sums in Discrete 1-d (Disc1DSum)

In 1-d, sensors and clients lie on the line, with m = #clients and n = #sensors. The Lenchner algorithm [13], which runs in O(m² + nm), can be modified to observe the radius bound, by removing larger disks from the set of pinned disks considered, so it can be applied to Disc1DSumWig as well. For Disc1DSumRad, we next give an optimal DP algorithm that runs in O(nm) time.

We first define some notation: set S contains the indices of the sensor nodes and set C contains those of the client nodes; vi is the ith node (either sensor or client); di,j is the distance between nodes i and j; Cj,i is the pinned circle centered on sensor j with radius di,j; r(i) is the radius assigned to node i, which is necessarily 0 if vi is a client; for a client i and a sensor j with j < i, f(j,i) is one less than the index of the leftmost client contained in circle Cj,i.

The DP forward procedure (see Algorithm 1) fills two arrays, c and ĉ, of solution costs. (For simplicity, we explain only how to compute the optimal cost, omitting the computation of the optimal solution.) c(i) is the optimal solution cost for the problem instance consisting of the first i nodes; ĉ(i) is the optimal cost for the instance consisting of the first i−1 nodes and the ith node replaced with a sensor (which we call a pseudo-sensor). (ĉ(i) is redundant whenever node i is itself a sensor, but we include it here for notational convenience.) The DP is based on this observation: if the only remaining clients to cover lie to the left of a circle with radius r1 centered on v1, then increasing v1's radius from r1 to r1 + r2 is equivalent, in both cost and coverage benefits, to adding a new sensor with radius r2, positioned at a distance r1 to the left of v1. Intuitively, the DP works as follows. 
Given a new sensor vi, either it is ignored or it is given radius di,i−1, with node vi−1 replaced by a sensor if necessary. Given a new client, it will be covered by some prior sensor vj. Either that sensor will be given radius dj,i, or it will be given the larger radius dj,f(j,i), reaching back to node f(j,i), which is then treated as a pseudo-sensor.


Algorithm 1. DP Algorithm
1: for each pinned circle Cj,i do
2:   L(j,i) ← the index of the leftmost client contained in Cj,i
3:   f(j,i) ← L(j,i) − 1
4: end for
5: c(0) ← 0
6: ĉ(0) ← 0
7: for i = 1 to m + n do
8:   ĉ(i) = min(c(i−1), ĉ(i−1) + di,i−1)
9:   if vi ∈ S then
10:    c(i) = ĉ(i)
11:  else
12:    c(i) = min_{j<i} {dj,i + c(f(j,i)), dj,f(j,i) + ĉ(f(j,i))}

In the second option, sensor j's circle is expanded to reach node f(j,i), which then acts as a pseudo-sensor for the clients lying at distance greater than di,j to the left of sensor j (i.e., missed by the circle C). There is no benefit to expanding C to touch a previous sensor, and it could actually happen that the node prior to L(j,i) is a sensor inside C, but there is no harm in evaluating this choice. The important strategy we want to try is expanding the circle to reach the most recent node, since if it is a client, covering it with the new circle may yield a cheaper solution. In this case the solution cost is dk,j + ĉ(j).

Lemma 1. The DP forward procedure runs in O(nm) time.

Proof. It is known that the leftmost clients of all nm pinned circles can be found in time O(nm) [13]. To obtain the bound, we observe that in handling a new node, two subcases are considered if the new node is a sensor, and 2m subcases are considered if it is a client. From the discussion above, considering each such case can be done in O(1). □

We now prove correctness.
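As a runnable point of reference, the simpler prefix-based DP recalled in Section 4.1 (not the faster Algorithm 1 above) can be sketched in Python: for each prefix of the sorted clients, try every pinned disk that covers the prefix's last client and recurse on the clients left of that disk. The instance format and names are ours.

```python
import bisect
import math

def disc_1d_sum_rad(sensors, clients):
    """Min sum of radii so that every client on the line is covered.
    opt[j] = optimal cost for the first j clients (sorted). A pinned disk
    centered on sensor s with radius |s - c_k| that covers client j also
    covers every client between the disk's left edge and client j."""
    clients = sorted(clients)
    m = len(clients)
    opt = [0.0] + [math.inf] * m
    for j in range(1, m + 1):
        cj = clients[j - 1]
        for s in sensors:
            for ck in clients[:j]:
                r = abs(s - ck)
                if abs(s - cj) > r:   # the pinned disk must cover client j
                    continue
                # index of the first client at or right of the disk's left edge
                lo = bisect.bisect_left(clients, s - r)
                opt[j] = min(opt[j], r + opt[lo])
    return opt[m]
```

This runs in O(nm² log m), an O(log m) factor over the O(nm²) baseline described in the text, because the leftmost covered client is found by binary search rather than precomputed.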


Theorem 1. The DP forward procedure computes the optimal solution value.

Proof. (sketch) By induction. In the base case of zero nodes, the optimal solution (zero radius) is trivially obtained. Now assume the optimal solution is obtained for all problem instances of length at most k, and consider one of length k + 1. Suppose vk+1 is a sensor. In the optimal solution, r(k+1) must be either 0 or ≥ dk,k+1. In the latter case, the solution cost would be the same if an inverse-shift were performed, i.e., r(k) = r(k+1) − dk,k+1 and r(k+1) = dk,k+1. ĉ(k)'s optimality then implies c(k+1)'s. On the other hand, suppose vk+1 is a client. In an optimal solution, vk+1 is covered by some sensor vj, with radius either dj,k+1 or larger. In the latter case, the solution cost would again be unchanged by an inverse-shift, and so optimality again follows. □

Unfortunately, the DP algorithm above does not apply to Disc1DSumWig, because of the bound on allowable sensor radii. The algorithm could be modified so that, in handling the case of a new client, the previous client is only considered if it is near enough. The difficulty lies in the handling of the pseudo-sensors. The handling of a new client reduces to an indeterminate number of pairs of precomputed subcases, half of which involve a pseudo-sensor. When the subcase ending at such a pseudo-sensor is solved, there is no way to know which later sensor(s) it will be treated as an extension of, and therefore no way to know what bound on its radius to use.

3.2 Optimizing Sums in Discrete 2-d (Disc2DSum)

Since Disc2DSumRad is known to be solvable, we turn to the wiggle radii problem, of maximizing Σi (1 − ri), subject to ri ∈ [0, 1]. The superlinear version of the corresponding sensor radii problem is known to be NP-hard; a hardness result for the (linear) wiggle radii problem is implicit in the superlinear hardness proof of Alt et al. [1].

Proposition 1. 
Disc2DSumWig and bounded Disc2DSumRad are NP-hard.

Proof. Alt et al. [1] reduce from Planar3SAT to Disc2DSumRad for α > 1, by constructing a sensor/client graph with an alternating chain of disks for each 3SAT variable, and a gadget for each 3SAT clause. There are two important steps in the proof. First, the chains are drawn finely enough (based on the value α) that the optimal solution will choose every other disk in each chain (“even” disks or “odd” disks), rather than larger disks subsuming more than one of the “chain” disks. Second, for the same reason, one of three smaller circles will be chosen in each gadget, to cover a client at its center, rather than enlarging one of the three chain disks meeting at the gadget. We can obtain these two properties by setting the disk upper bound to be precisely the radius of the chain disks. □

A PTAS for maximizing wiggle room does not immediately follow from the existence of a PTAS for sensor radius minimization. A natural strategy is to run the sensor PTAS to obtain a set of sensor radii and then to assign the wiggle radii wi = 1 − ri. But nothing prevents the PTAS from assigning a radius greater than one, which absurdly implies a negative wiggle radius. Moreover, an approximation guarantee for one problem does not imply an approximation bound for the other. Suppose, e.g., ε = 1/2,


OPTsens = 1/2, and ALGsens = 1. Then OPTwig = 1/2 and ALGwig = 0. Nonetheless, by adapting the sensor PTAS, a wiggle PTAS can be obtained. The polynomial running time of the Lev-Tov & Peleg DP is obtained by bounding the number of “semi-disjoint” disks (neither containing the other's center) of roughly equivalent size that will fit in or overlap a grid cell. Intuitively, the algorithm can limit itself to semi-disjoint disks because any solution with two non-semi-disjoint disks could be improved by increasing the size of one disk and removing the other. Of course, with a bound on allowable disk size, this is not always possible. If we reasonably assume that the graph is a λ-precision graph [10], for some fixed λ > 0, then a PTAS is obtainable.

Proposition 2. For graphs of sensors and clients drawn at λ-precision for some fixed λ > 0, there exists a PTAS for Disc2DSumWig.

Proof. Due to λ-precision, the number of nodes within a fixed-size cell is bounded by a constant, and so the grid-shifting PTAS technique directly applies. □

3.3 Optimizing Sums in Continuous 1-d and 2-d (CtnSum)

In this section, we consider the problem setting in which sensors are discrete and must be assigned radii (without a bound) to cover a continuous region. Ignoring edges, this region is assumed to be the convex hull of the sensor points in 2-d. In 1-d, it is the interval from the leftmost sensor to the rightmost. In 1-d there exist very simple approximation algorithms. Let the length L of the interval to be covered be normalized to L = 1. Then the optimal solution cost must lie between .5 and 1. Therefore assigning radius 1 to the leftmost sensor immediately yields a constant-time 2-approximation. One ideal situation is that there exists a sensor exactly in the middle of the interval. 
Assuming sensor locations are given as a sorted list, this suggests an O(log n)-time algorithm (Centermost): find the sensor nearest to the interval center, and assign it a radius large enough to cover both ends.

Proposition 3. Centermost is a 1.5-approximation algorithm.

Proof. If there is a sensor in the range [.25,.75], then ALG pays at most .75 while OPT pays at least .5. If not, then ALG pays at most 1 while OPT pays at least .75. □

For discrete sensors, optimal solutions can be found in 1-d and near-optimal solutions can be found in 2-d. By reducing to the discrete setting, near-optimal solutions can be efficiently found for both the 1-d and 2-d continuous settings.

Proposition 4. For λ-precision instances, Ctn1DSumRad is solvable in polynomial time.

Proof. (sketch) We first claim that in any problem instance, there exists an optimal solution with no overlap between any two sensors' assigned disks. Given an arbitrary solution, overlaps can be eliminated one by one, by shrinking the second disk and then increasing the following disk, without changing the total cost of the solution. Now, suppose all sensors lie at locations specified at λ-bit precision (for constant λ), or equivalently at integral values in a scaled range. Since there is an optimal solution having no overlap, in such a solution all sensor radii are λ-bit values. Now we place


discrete clients at each multiple of 2^(−λ−1). Known algorithms for Disc1DSumRad, slightly modified to consider only (λ+1)-bit-precision disks but all of these clients, will now solve this problem optimally. □

Proposition 5. There exists an FPTAS for Ctn1DSumRad.

Proof. We want to find a solution with cost at most (1 + ε)·OPT. We know the optimal solution cost is at least 1/2. Therefore we place equally spaced discrete clients along the interval with separating distance ε′ = min(ε/(2n), λ), and we solve this discrete problem optimally in polynomial time. We then increase all sensor radii by ε′, which fills in any coverage gaps, at a cost of at most nε′ ≤ ε · 1/2 ≤ ε · OPT. □

Note that the dependency on ε is quite reasonable. In the case of ε = 1/n, for example, the total running time is only O(n³).

Proposition 6. There exists an FPTAS for Ctn2DSumRad.

Proof. We construct an instance of the discrete problem, which is in P. The continuous problem instance consists of a set of sensor locations and a convex coverage region, which we assume is specified by a set of points. Given this, we produce an instance of the discrete problem: it has the same sensors as the continuous instance, and its client locations form a (square, say) mesh over the coverage region. First, let D = max_{s1≠s2∈S} d(s1, s2), which can be thought of as the diameter of the region. Notice that the optimal solution value will be at least D/2 and at most 4D. Let ε be the (multiplicative) error parameter to the continuous FPTAS. Let d be the separating distance of the mesh, i.e., the distance between a client and its neighbors. Then we draw a mesh of clients fine enough so that d < min(εD/(2m), λ), where m is the number of sensors. Then the number of clients is Θ((D/d)²) = Θ((m/ε)²), which is polynomial in m. We now run the optimal algorithm on the discrete problem instance. 
The resulting solution covers all clients, but it may leave uncovered some small regions between them. Since any such region is small enough to fit within a d × d square, we modify the resulting solution by increasing the radii of all sensors by d. The resulting solution covers the entire region. Then, since md ≤ εD/2 ≤ ε · OPT, we have

ALG ≤ OPTdis + md ≤ (1 + ε) OPT.

Therefore the algorithm achieves the required approximation guarantee. The subroutine runs in time bounded by some polynomial p(nm). The running time of the continuous FPTAS is then bounded by p(m · (m/ε)²), which is indeed polynomial in m and 1/ε. □

Because the objective is minimization, this reduction does not apply to CtnSumWig.
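A toy version of the discretize-solve-inflate pattern behind Proposition 5 can be sketched as follows. The brute-force discrete solver (exponential, for brevity only), the instance, and the omission of the λ-precision detail are all our simplifications.

```python
import itertools

def ctn_1d_fptas(sensors, eps):
    """Cover [min(sensors), max(sensors)]: place discrete clients spaced
    eps*L/(2n) apart, solve the discrete instance over pinned radii,
    then inflate every radius by the spacing to close any coverage gaps."""
    n = len(sensors)
    lo, hi = min(sensors), max(sensors)
    step = (hi - lo) * eps / (2 * n)
    k = int((hi - lo) / step)
    clients = [lo + i * step for i in range(k + 1)] + [hi]
    # pinned radii: 0 or the distance to some discrete client
    choices = [[0.0] + sorted({abs(s - c) for c in clients}) for s in sensors]
    best = None
    for radii in itertools.product(*choices):   # brute force, illustration only
        if all(any(abs(s - c) <= r for s, r in zip(sensors, radii))
               for c in clients):
            if best is None or sum(radii) < sum(best):
                best = radii
    return [r + step for r in best]             # inflate to close gaps
```

In practice the brute-force subroutine would be replaced by the polynomial 1-d DP of Section 3.1; the inflation step is what converts a discrete cover into a continuous one.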

4 Fairness in Wiggle and Sensor Radii (Fair)

The second difference between SumRad and SumWig is that the latter implicitly has a bound on the sensor ranges, meaning that optimal solutions for the two problems could diverge, since, e.g., an optimal solution for the former could consist of a single huge disk. This difference is irrelevant to Fair, since first of all the largest assigned radius should be as small as possible. Therefore, for each sub-setting, FairRad and FairWig are really the same problem, and so we do not distinguish between such pairs.

4.1 Fairness with Discrete Clients (DiscFair)

One of the simplest algorithms for Disc1DSumRad is an O(nm²) algorithm that, for each increasing prefix of j clients, finds the cheapest solution to cover them by considering each of the O(nm) pinned disks covering the jth client, combined with the optimal solution (already computed) for the clients lying to the left of that disk. This algorithm can be adapted to the metric of fairness at the cost of one order of magnitude, since comparing two vectors of radii can be done in O(m) time. Thus we have:

Proposition 7. Disc1DFair is in P.

Proposition 8. Without restricting assumptions, Disc2DFair is NP-hard.

Proof. We obtain the result by examining the hardness result of Alt et al. [1]. In that reduction, a sensor/client graph is constructed in which a minimum-cost solution will choose half the “chain” disks of each chain and at least one of the three small disks of each gadget. Such a solution will in fact be max-min fair, since removing or shrinking any of the chosen chain disks or small gadget disks would necessitate the inclusion of a still larger disk. □

With a strong enough general-position assumption, e.g. if random noise is applied to all sensor and client positions, the sort of MMF techniques used for CtnFair can also be made to apply to DiscFair. We omit the details in this extended abstract.

4.2 Fairness with a Continuous Region (CtnFair)

In the continuous setting, however, an MMF set of radii for n sensors can be found in polynomial time, both in 1-d and in 2-d (in the latter case if the sensors are in general position, i.e., no set of more than three points lies on a circle, or equivalently the resulting Voronoi diagram is nondegenerate).

Algorithm 2. MMF Algorithm
1: insert into the heap the initial coverage cost r(p) of each subregion
2: while the heap is not empty do
3:   perform a delete-max, removing subregion p
4:   assign r(p) to one or more of p's sensors
5:   update the coverage costs of other subregions as needed
6: end while
The algorithm for the two settings is based on the same schema. As above, we assume that the configuration of sensors is feasible in the sense that, given zero wiggle room, full coverage will be achieved. Given this, the optimal min-max fair sensor assignment and max-min fair wiggle assignment solutions will be identical. Therefore we do not distinguish between the two problems in the following. The algorithm schema is shown above.
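A simplified quadratic-time sketch of this schema for the 1-d subinterval case (detailed below) replaces the heap with a plain scan for the hardest gap; the code and names are ours, for illustration.

```python
def mmf_radii_1d(xs):
    """Min-max fair radii for sensors at sorted positions xs covering
    [xs[0], xs[-1]]: repeatedly take the currently hardest gap between
    adjacent sensors and assign its residual cost to its free sensor(s)."""
    n = len(xs)
    r = [None] * n
    while True:
        best, cost = None, 0.0
        for i in range(n - 1):
            a, b = xs[i], xs[i + 1]
            ra = r[i] if r[i] is not None else 0.0
            rb = r[i + 1] if r[i + 1] is not None else 0.0
            if a + ra >= b - rb:            # gap already covered
                continue
            if r[i] is None and r[i + 1] is None:
                need = (b - a) / 2          # both free: disks meet at midpoint
            elif r[i] is None:
                need = (b - rb) - a         # left sensor closes the gap
            else:
                need = b - (a + ra)         # right sensor closes the gap
            if need > cost:
                best, cost = i, need
        if best is None:
            break
        i = best
        if r[i] is None and r[i + 1] is None:
            r[i] = r[i + 1] = cost
        elif r[i] is None:
            r[i] = cost
        else:
            r[i + 1] = cost
    return [v if v is not None else 0.0 for v in r]
```

For sensors at 0, 4, 10, the hardest subinterval [4, 10] forces radius 3 on both of its sensors first, after which the leftover piece of [0, 4] costs only 1, giving the min-max fair vector [1, 3, 3] rather than, say, [2, 2, 4].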


Let the region of interest be partitioned in some way into subregions, each of which is defined by a set of bordering sensors. For each not-fully-covered subregion, let its current cost be the maximum radius value required to cover a currently uncovered point within it, given the previously assigned radii. Let the critical region be the hardest such remaining subregion within the entire region. Then we must show three things: that the critical region can be found in polynomial time; that the radii to cover it can be chosen in polynomial time; and that if there are ties for the critical region, then the ordering of tie-breaking does not matter. In this case, the resulting radii assignment will be fair.

Theorem 2. If the ordering of tie-breaking does not matter, then MMF is achieved.

Proof. We prove by induction, over the length of prefixes of the subregion list. For the base case, it is clear that the first step will minimize the maximum radius. Let Ri be the subregion (some of) whose sensor(s) are assigned radii in round i. For Ri, reached after assigning radii (to ni sensors) that form a prefix of an MMF vector, we must show that the radii assigned in round i also satisfy MMF. If there are no ties, or if the ordering of tie-breaking does not matter, then this immediately follows. Since we consider all remaining subregions, the new values added to the MMF vector are minimal, and the larger vector is again MMF. □

Fairness in 1-d (Ctn1DFair). In 1-d, the interval is partitioned by the sensors into subintervals, each of which is defined by two sensors. The cost c(p) of each subinterval p is set to half its width. These values are placed into the heap in O(n log n) time. Clearly the first subinterval p̂ extracted from the heap is the hardest to cover. In an MMF radius assignment, both of p̂'s sensors must be given radius c(p̂); if either were less, the other would have to be greater.

Of course, it is possible for either of these radius assignments to cover other subintervals partly or entirely. Consider p̂'s right sensor. It will completely cover 0 or more subintervals and then partly cover 0 or 1 additional subinterval. Searching the array of sensor locations, the last subinterval it (partly) covers can be found in time O(log n). All fully covered subintervals (with cost 0) can be removed from the heap. If the (possibly) partly covered subinterval is more than half covered, then its difficulty in the heap is reduced to the size of the portion left uncovered (and when chosen, only its right sensor will be assigned); otherwise, its difficulty is unchanged. Clearly, assigning radius c(p̂) to all sensors would minimize the maximum. Instead, we repeatedly search for the uncovered subinterval p requiring the next largest radius. Since the difficulty of the remaining subintervals is updated after each selection, the selection itself is simply a delete-max operation on the heap, for a total of O(n log n). If there are no ties, then each move is forced, and it is clear that the result is MMF. Suppose that at some point there are two subintervals p and q that tie as most difficult, i.e., with r = c(p) = c(q). Then we claim the order in which they are chosen does not matter. Suppose interval p lies before q, separated by some distance d ≥ 0, and suppose p is chosen first. If d = 0, there are three sensors involved, say 1, 2, 3. (Assume wlog that none has yet been assigned a radius.) When p is chosen, 1 and 2 are assigned r; when q is later chosen, 3 is assigned r. If d > 0, there are four sensors involved, say 1, 2, 3, 4. When p is chosen, 1 and 2 are assigned r; when q is later chosen, 3 and 4 are both assigned r, because sensor 2's coverage will not reach the center of interval q.


A. Bar-Noy et al.

Fairness in 2-d (Ctn2dFair). We follow the same algorithmic schema in 2-d, but implement it differently. In place of subintervals, we work with Delaunay triangles. Whereas in 1-d newly assigned radii meet at a subinterval's center, in 2-d they meet at the Voronoi vertex at the triangle's circumcenter. For each Voronoi vertex p (equidistant from three sensors), its cost c(p) is its common distance to those three sensors. Among all Voronoi vertices, take the point p̂ of maximum cost c(p̂). Then, because decreasing any of p̂'s three radii would force another to increase, we have the following:

Lemma 2. In an MMF radius assignment covering a neighborhood around p̂, all three neighboring sensors must be given radius c(p̂).

Once again, assigning value c(p̂) to all sensors suffices to minimize the maximum radius. For a not-fully-covered Delaunay triangle, its current cost is the maximum radius value required to cover a currently uncovered point within it, given the previously assigned radii. We repeatedly search for the next critical region. It therefore suffices, in each round, to determine each Delaunay triangle's cost, which is much more complicated than in 1-d. There are several issues to consider. First, one or two of the triangle's sensors may already have been assigned a radius. Second, part or all of the triangle may already be covered by sensors external to it. Third, there could even be more than one connected uncovered region within the triangle. We consider each case in turn.

Lemma 3. If no external disks intersect the triangle, its current cost can be found in polynomial time.

Proof. If none of the three sensors of a triangle with Voronoi vertex q has been assigned a radius, it is handled the same way as the first triangle, by assigning c(q) to all three. If at least one radius has been assigned, then the other(s) are given appropriately smaller radii so that the Delaunay triangle is covered.
If exactly one of the three sensors (say, A) has already been assigned a radius, then the (identical) radius values given to the other two (say, B and C) are chosen so that the three sensors' disks meet at one point, which lies on the Voronoi edge separating the regions of B and C. (This point can be found by elementary geometry.) If two of the three have been assigned (say, A and B), the third is chosen so that its disk passes through the closer of the two intersection points of A's and B's disks. ∎

Lemma 4. If there is only one uncovered region in the triangle, its current cost can be found in polynomial time.

Proof. Now assume the triangle is (partly) covered by other sensors, i.e., sensors not located at the triangle's vertices. The O(n² log n) Huang & Tseng algorithm [9] for computing the number of times a region is covered by non-unit disks can be used to find coverage holes in a region and the border of the covered area, which is composed of a series of (at most n) arcs. What interests us here is where this border intersects the triangle. If the entire triangle is covered, it is disregarded. Otherwise, a series-of-arcs border passes through it. There are now several subcases. If the triangle's Voronoi vertex q is not covered, then we proceed as above, ignoring the partial coverage. If it is covered, it may be covered from one direction or from two. Let the triangle's three vertices be A, B, C. Let dAq be the distance between points A and q, and let CAq be the disk of radius dAq centered at A. Say that q is covered from direction A if the intersection of the triangle and CAq is contained in the cover.

Cheap or Flexible Sensor Coverage


First, suppose that q is covered from only one direction, say A. Then we extend the line segment A−q until it intersects the border at some point q′ (by elementary geometry). We then give sensors B and C the (identical) radii that allow their disks to meet at q′, which completely covers the triangle. Clearly the meeting point of B and C cannot lie outside the coverage region, so if the radii are minimized it must lie on the border. Also, if it lay anywhere else on the border, one radius would decrease and one would increase, violating MMF. Second, suppose that q is covered from two directions, say A and B. Then C is given the radius that makes its disk reach the farthest border-crossing point contained in the triangle, thus covering the triangle. ∎

Lemma 5. Even if the triangle has more than one uncovered region, its current cost can be found in polynomial time.

Proof. In this case, it can indeed happen that different minimal radii are required to cover the entire triangle (not considering others). In such a case, we treat only the largest of these radii as the amount required to cover this triangle. If this triangle is chosen as the critical one, only those radii are assigned, and the (now more fully covered) triangle waits for a future round to be fully covered. ∎

We note two properties of the procedure: in each round, the assigned (one, two, or three) radii are identical, and no previously assigned sensor radius ever grows larger. Also, for reasons similar to those in the 1-d case, the order of tie-breaking does not matter. We conclude by bounding the running time.

Theorem 3. The MMF algorithm runs in polynomial time.

Proof. The Voronoi diagram computation (using, e.g., Fortune's algorithm) is performed in O(n log n) time. In each round, the cost of each triangle can be found in polynomial time. Even if these computations, as well as the computation of the union of disks, are done from scratch, each round takes polynomial time.
Since in each round at least one sensor is assigned its (final) radius, the total running time is polynomial. ∎
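The "elementary geometry" invoked in the proof of Lemma 3 can be made concrete: given sensor A's already-fixed radius, the common radius for B and C is found by intersecting A's circle with the perpendicular bisector of BC. The sketch below assumes A's disk reaches that bisector; the function name and the tie-breaking choice are illustrative.

```python
import math

def meet_radius(A, rA, B, C):
    """Return the point where all three disks meet on the perpendicular
    bisector of BC, given A's fixed radius rA, together with the
    (identical) radius for B and C. Of the two intersections of A's
    circle with the bisector, the one closer to B is chosen, which
    minimizes the new radii (sketch)."""
    mx, my = (B[0] + C[0]) / 2, (B[1] + C[1]) / 2
    ux, uy = -(C[1] - B[1]), C[0] - B[0]            # direction of the bisector
    # solve |(mx, my) + t*(ux, uy) - A|^2 = rA^2 for t (a quadratic in t)
    qa = ux * ux + uy * uy
    qb = 2 * ((mx - A[0]) * ux + (my - A[1]) * uy)
    qc = (mx - A[0]) ** 2 + (my - A[1]) ** 2 - rA ** 2
    disc = qb * qb - 4 * qa * qc
    if disc < 0:
        raise ValueError("A's disk does not reach the bisector of BC")
    roots = [(-qb + s * math.sqrt(disc)) / (2 * qa) for s in (1, -1)]
    points = [(mx + t * ux, my + t * uy) for t in roots]
    p = min(points, key=lambda q: math.dist(q, B))  # smaller radius for B and C
    return p, math.dist(p, B)
```

For A = (0, 0) with radius 2 and B = (3, 1), C = (3, -1), the three disks meet at (2, 0), and B and C each receive radius sqrt(2).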

5 Conclusion

In this paper we considered a large family of interrelated problems. The solutions for some problems involve general-position or precision assumptions of varying strength; one future research direction is to weaken these assumptions. Perhaps the most interesting open problem is to efficiently approximate Disc2dSumWig and bounded Disc2dSumRad. Bounded Disc2dSumRad is arguably much more realistic than the standard version, in which a single huge sensor radius could be used to cover the entire region. Other open problems include finding an optimal algorithm for Ctn1dSumRad and extending SumWig to non-unit disks. For this last setting, interesting objective functions might involve the relative amount of wiggle room, i.e., w_i/r_i. We emphasize that the second change can also reasonably be combined with the original objective function. In the standard sensor radius minimization problem, it is always a feasible solution to make one sensor disk large enough to cover all clients. For



large coverage areas, assuming arbitrarily large sensor ranges is unlikely to be realistic. We call this the bounded SumRad problem. Other potential directions for future research include the so-called 1.5-dimensional setting (in which sensors xor clients are constrained to one dimension), and probabilistic wiggle room, which can be reduced to probabilistic disks with certain distributions.

Acknowledgements. This research was sponsored by the US Army Research Laboratory and the UK Ministry of Defence and was accomplished under Agreement Number W911NF-06-3-0001. The views and conclusions contained in this document are those of the authors and should not be interpreted as representing the official policies, either expressed or implied, of the US Army Research Laboratory, the US Government, the UK Ministry of Defence, or the UK Government. The US and UK Governments are authorized to reproduce and distribute reprints for Government purposes notwithstanding any copyright notation hereon.

References

1. Alt, H., Arkin, E.M., Brönnimann, H., Erickson, J., Fekete, S.P., Knauer, C., Lenchner, J., Mitchell, J.S.B., Whittlesey, K.: Minimum-cost coverage of point sets by disks. In: SoCG 2006 (2006)
2. Bertsekas, D.P., Gallager, R.G.: Data Networks, 2nd edn. Prentice Hall, Englewood Cliffs (1991)
3. Bilò, V., Caragiannis, I., Kaklamanis, C., Kanellopoulos, P.: Geometric clustering to minimize the sum of cluster sizes. In: Brodal, G.S., Leonardi, S. (eds.) ESA 2005. LNCS, vol. 3669, pp. 460–471. Springer, Heidelberg (2005)
4. Brass, P.: Bounds on coverage and target detection capabilities for models of networks of mobile sensors. ACM Transactions on Sensor Networks 3(2) (2007)
5. Erlebach, T., Jansen, K., Seidel, E.: Polynomial-time approximation schemes for geometric graphs. In: SODA 2001 (2001)
6. Gibson, M., Kanade, G., Krohn, E., Pirwani, I., Varadarajan, K.: On clustering to minimize the sum of radii. In: SODA 2008 (2008)
7. Hefeeda, M., Ahmadi, H.: A probabilistic coverage protocol for wireless sensor networks. In: ICNP 2007 (2007)
8. Hochbaum, D.S., Maass, W.: Approximation schemes for covering and packing problems in image processing and VLSI. J. ACM 32(1) (1985)
9. Huang, C.-F., Tseng, Y.-C.: The coverage problem in a wireless sensor network. Mobile Networks and Applications 10(4), 519–528 (2005)
10. Hunt III, H.B., Marathe, M.V., Radhakrishnan, V., Ravi, S.S., Rosenkrantz, D.J., Stearns, R.E.: NC-approximation schemes for NP- and PSPACE-hard problems for geometric graphs. J. Algorithms 26(2) (1998)
11. Johnson, M.P., Sarioz, D., Bar-Noy, A., Brown, T., Verma, D., Wu, C.-W.: More is more: the benefits of denser sensor deployment. In: INFOCOM 2009 (2009)
12. Kershner, R.: The number of circles covering a set. Amer. J. Math. 61, 665–671 (1939)
13. Lenchner, J.: A faster dynamic programming algorithm for facility location. In: FWCG 2006 (2006)
14. Lev-Tov, N., Peleg, D.: Polynomial time approximation schemes for base station coverage with minimum total radii. Computer Networks 47(4), 489–501 (2005)
15. Vazirani, V.V.: Approximation Algorithms. Springer, Heidelberg (2001)
16. Zhang, H., Hou, J.: Maintaining sensing coverage and connectivity in large sensor networks. In: WTASA 2004 (2004)

MCP: An Energy-Efficient Code Distribution Protocol for Multi-Application WSNs

Weijia Li, Youtao Zhang, and Bruce Childers

Computer Science Department, University of Pittsburgh, Pittsburgh, PA 15260

Abstract. In this paper, we study the code distribution problem in multi-application wireless sensor networks (MA-WSNs), i.e., sensor networks that can support multiple applications. While MA-WSNs have many advantages over traditional WSNs, they tend to require frequent code movement in the network, and thus raise new challenges for designing energy-efficient code dissemination protocols. We propose MCP, a stateful multicast-based code redistribution protocol for achieving energy efficiency. Each node in MCP maintains a small table recording information about known applications. The table enables multicast-based code dissemination requests, such that only a subset of neighboring sensors contribute to code dissemination. Compared to broadcast-based schemes, MCP greatly reduces signal collision and saves both dissemination time and the number of dissemination messages. Our experimental results show that MCP can reduce dissemination time by 25% and message overhead by 20% under various network settings.

1 Introduction

Wireless sensor networks (WSNs) have recently emerged as a promising computing platform for applications in non-traditional environments, such as the deep sea and the wild. They are usually under tight memory and energy constraints, e.g., a Telos sensor node has only 48KB of program memory [15]. Many WSNs load only one application per node, with the same application on all nodes in the network. We refer to these WSNs as single-application WSNs, or SA-WSNs. While a single sensor is usually small and cheap, a large WSN may consist of thousands of sensors, making it economically less appealing to run just one application. Recently, researchers have envisioned the wide adoption of multi-application WSNs, or MA-WSNs, which can support several applications in one network infrastructure [20,17]. In an MA-WSN, a sensor stores the code of multiple applications in its external flash memory and loads the selected application into its program memory for the desired functionality; the loaded program is referred to as the active application. Recent technology and research advances clear the way for adopting the MA-WSN design. First, it is now possible to integrate more memory (especially flash memory) within the same budget [19]; second, various sensing, clustering, routing, and data-aggregation protocols have been proposed to reduce the energy consumption of performing

B. Krishnamachari et al. (Eds.): DCOSS 2009, LNCS 5516, pp. 259–272, 2009. © Springer-Verlag Berlin Heidelberg 2009


W. Li, Y. Zhang, and B. Childers

the functionalities of one application [1]. As a result, it is now possible to support multiple applications during the lifetime of an MA-WSN. MA-WSNs have many advantages over SA-WSNs. For example, an MA-WSN can be deployed in a national park to monitor both wildfires and animal movement. More sensors could be set to monitor animal movement in the early morning or late afternoon, when animals tend to leave from or return to their habitats, while more sensors could be set to monitor wildfires in the summer, when the weather is dry and the chance of fire is high. By exploiting the same network infrastructure for both events, (1) MA-WSNs save the investment and effort of deploying and testing two sensor networks; and (2) the sensor network adapts better to the dynamically changing environment and can even adjust its coverage according to need. However, not all nodes in an MA-WSN have the code of all running applications. Some sensors may need to switch to code that can be found neither in their own flash memory nor on their neighboring sensors. This results in more code movement and makes it critical to design energy-efficient post-deployment code dissemination/redistribution protocols for MA-WSNs. Most existing code dissemination protocols, such as Deluge [5], MNP [6], and Stream [14], are designed to disseminate the same code to all sensors in the network. While it is possible to adopt a naive drop-after-receive scheme that discards unnecessary code after dissemination, applying these protocols in MA-WSNs tends to introduce high overhead. A recently proposed protocol, Melete [20], has a similar design goal. However, it employs a broadcast strategy for dissemination, which introduces significant signal collision and communication overhead when disseminating applications with large code sizes. MDeluge [21] uses a tree-based routing approach to disseminate application code to the sensors that need to run the application.
The pre-computed routing table is fixed during one application upgrade, and it is thus subject to network congestion and single points of failure. In this paper, we propose a multicast-based code redistribution protocol, MCP, to achieve energy efficiency. MCP employs a gossip-based source-node discovery strategy: each sensor summarizes the application information from overheard advertisement messages, and subsequent dissemination requests are forwarded to nearby source nodes rather than flooded through the network. Our experiments show that MCP greatly reduces both dissemination time and message overhead, achieving 10-20% reductions in various settings. In the remainder of the paper, we describe the code dissemination background in Section 2 and the MCP protocol in Section 3. Section 4 evaluates the effectiveness of MCP under different settings. Related work is discussed in Section 5, and Section 6 concludes the paper.

2 Code Dissemination and Problem Statement

2.1 Code Dissemination

As shown in Fig. 1, a WSN consists of one sink node and a number of sensors, e.g., MICA2 nodes. The sink node connects directly to a PC and thus has


Fig. 1. A multi-application WSN (MA-WSN)

no power or computation restrictions. Each MICA2 node has 128KB of program memory, 4KB of data memory, and 512KB of external flash memory. A remote sensor in an MA-WSN keeps the code images of several applications in its external flash and loads the active application into its program memory. To execute a different application, the sensor's bootloader swaps in the desired code image and reboots the sensor. A recent study showed that, to support effective dissemination, the whole code dissemination functionality, or at least part of it, should be integrated into the active application [14]. Thus, a sensor normally performs the operations of the active application and enters code dissemination mode only after receiving special primitives, i.e., data packets with special opcode bits. An application to be disseminated is usually divided into a sequence of pages, each of which consists of multiple data packets (Fig. 2). In TinyOS each packet is 29 bytes long and contains a 23-byte payload. To adapt to the lossy wireless communication channel, Deluge [5] disseminates code at page granularity in increasing order of page number. That is, a requesting sensor has to finish receiving all the packets in one page before sending out requests for the next page; however, packets within one page may be received out of order, as some packets may be lost and need to be retransmitted. During code dissemination, the requesting sensor buffers packets of the current page in data memory and writes to the external flash after receiving the whole page.
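The page/packet structure just described can be sketched as follows; the 48-packet page size is an illustrative choice, not a value taken from the paper.

```python
PAYLOAD = 23  # usable payload bytes per TinyOS packet

def paginate(image, packets_per_page=48):
    """Split a code image into pages of a fixed packet count, each packet
    carrying up to a 23-byte payload (sketch; page size is illustrative)."""
    packets = [image[i:i + PAYLOAD] for i in range(0, len(image), PAYLOAD)]
    return [packets[i:i + packets_per_page]
            for i in range(0, len(packets), packets_per_page)]
```

A requester would then ask for the missing packets of page 0 first, moving on to page 1 only once page 0 is complete.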

2.2 Problem Statement

The problem we study in this paper is how to design an energy-efficient code dissemination protocol for MA-WSNs. We illustrate the protocol design challenges with the following example. As shown in Fig. 1, three applications are distributed across different nodes in a network, and a problem arises when some nodes must be reprogrammed to run application A. There are two existing approaches. A naive solution is to directly apply Deluge and disseminate application A from the sink to all sensors; after dissemination, the nodes that do not need A discard the code from their storage. This is clearly not a good choice, due to unnecessary packet transmissions to nodes that do not need the code. The other solution is to let requesting nodes initiate code


Fig. 2. Sensor memory organization and code dissemination in pages and packets ((a) sensor memory: the active application in program flash, temporary data in SRAM, and application images in external flash; (b) an application divided into pages 1…n, each consisting of packets 1…k)

dissemination and fetch A from nearby sensors. Melete [20] is such a protocol: the nodes that need to run A broadcast their requests within a controlled range and discover the source nodes that have A, and the sources then send back the requested data packets. However, as a stateless protocol, Melete does not record the source nodes and has to discover them repeatedly. When transmitting applications with multiple pages, multiple sources within the range may respond and thus create significant signal collision. In this paper, our goal is to design an energy-efficient code dissemination protocol for MA-WSNs; our target is to reduce both the dissemination completion time and the number of messages transmitted during dissemination.

3 The MCP Protocol

3.1 Overview

An overview of our protocol is as follows.

– Sensors in MCP periodically broadcast ADV messages to advertise their knowledge about running applications in the network, similar to Deluge. Each sensor summarizes its overheard ADV messages in an application information table (AIT).
– To reprogram a subset of sensors, the sink floods a dissemination command that specifies which sensors should switch to run application A. For example, the command "[B→A, p=0.25]" indicates that sensors whose active application is B should switch to A with a 25% probability. That is, 25% of the nodes currently running application B will switch to A.
– After receiving the command from the sink, each sensor identifies its dissemination role as one of the following: (i) a source, if the sensor has the binary of application A;


(ii) a requester, if the sensor does not have the binary of A but needs to switch to run A; or (iii) a forwarder, if the sensor is neither a source nor a requester.
– A requester periodically sends out requests (i.e., REQ messages) to its closest source until it acquires all the pages of application A. Instead of being broadcast, the REQ messages are sent to the source via multicast. A requester resends a REQ message when it times out, and it tries to request data from each source node several times before marking the node as a temporarily unavailable source.
– A source node responds with Data messages that contain code fragments, while a forwarder forwards both request and data packets.

Similar to Melete and Deluge, MCP has three types of messages: an ADV message that advertises applications of interest; a REQ message that asks for packets of a particular page; and a Data message that contains one data packet (i.e., a code fragment).
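The role-identification step can be sketched as follows; the command encoding and function name are illustrative assumptions, not MCP's actual packet format.

```python
import random

def dissemination_role(stored_apps, active_app, cmd, rng=random):
    """Classify a node for a sink command (old_app, new_app, p), e.g.
    ("B", "A", 0.25): nodes running B switch to A with probability 0.25.
    Sketch only; the command encoding is an illustrative assumption."""
    old_app, new_app, p = cmd
    if new_app in stored_apps:
        return "source"                    # already holds the binary of the target
    if active_app == old_app and rng.random() < p:
        return "requester"                 # must fetch the code before switching
    return "forwarder"                     # neither source nor requester
```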

3.2 ADV Message and Application Information Table (AIT)

In MCP, each sensor periodically broadcasts ADV messages and summarizes the information of overheard ADV messages in a small application information table (AIT). Fig. 3 illustrates the algorithm. Each ADV message contains the information of one application: (i) the application ID and version number; (ii) the number of pages of the application; (iii) the two closest source sensors, each given as a source ID and a hop count to the source (S, H); and (iv) a CRC checksum. If a sensor knows of multiple applications, it advertises them in a round-robin fashion. Note that a sensor may not have the code images of all its known applications. The AIT summarizes the overheard ADV messages. In addition to the application summary, the AIT stores up to the three closest source nodes for each known application, together with the uplink sensor ID for each source, i.e., the neighbor from which the source information was received. Each application entry in the AIT is 12 bytes; if 10 applications run in the network, the AIT occupies only 120 bytes, which makes it fit easily in program memory. When an incoming ADV message contains new information, the corresponding entry in the AIT is updated. Suppose a sensor S1 receives an ADV message from S2, and the message identifies two nearby sources (S3, H3) and (S4, H4), where H3 and H4 are the numbers of hops from S2 to sources S3 and S4. If S1 already records the information of three sources (S5, H5, U5), (S6, H6, U6), and (S7, H7, U7), then it updates the AIT according to the following rules.

– If an entry in the AIT records a previous message from the same uplink S2 referring to the same source, e.g., S5=S3 and U5=S2, then the information in the ADV message represents the up-to-date source information and replaces the old entry.

Fig. 3. Application Information Table. (Example: in a grid of nodes n1–n9, n1 has A1, and n3, n5, and n9 will change to A1. On node n9, the AIT entry for application A1 (version 1, 8 pages) records three sources: n1 at 4 hops via uplink n8, n3 at 2 hops via uplink n6, and n5 at 2 hops via uplink n8; the entry for A2 is similar. On node n4, the entry for A1 records a single source: n1 at 1 hop via uplink n1.)

– If an entry in the AIT records a longer path to an advertised source, e.g., S5=S3, U5=S2, and H5 > H3+1, then the hop count and uplink node from the ADV message replace those in the AIT.
– If the advertised source cannot be found in the AIT and there is an invalid entry in the table, then the new source is inserted into the table.
– If the ADV message advertises a closer source than one of those in the AIT, then the closer source replaces the farthest source in the AIT.

Each sensor advertises the applications in its AIT in a round-robin fashion, prioritizing applications whose entries have recently been updated: (i) applications whose sources were recently updated are advertised before those that were not; and (ii) in one round, applications whose sources were recently updated are advertised three times while others are advertised once. In addition to normal ADV advertisement, an application is advertised if the sensor receives a broadcast request for that application, as we elaborate next.
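The table-update rules above can be sketched as follows; the list-based layout and names are illustrative (the real AIT packs each application entry into 12 bytes).

```python
MAX_SOURCES = 3

def update_ait(entries, uplink, advertised):
    """Apply the ADV-handling rules to one application's source list.

    `entries` is a list of [source, hops, uplink] triples (at most
    MAX_SOURCES); `advertised` holds (source, hops_from_sender) pairs
    taken from an ADV message received from neighbor `uplink`.
    Sketch only; the field layout is an illustrative assumption.
    """
    for src, h in advertised:
        h += 1                                    # one more hop through the uplink
        known = next((e for e in entries if e[0] == src), None)
        if known is not None:
            fresher = known[2] == uplink          # rule 1: same uplink, newer info
            shorter = h < known[1]                # rule 2: strictly shorter path
            if fresher or shorter:
                known[1], known[2] = h, uplink
        elif len(entries) < MAX_SOURCES:          # rule 3: a free slot exists
            entries.append([src, h, uplink])
        else:                                     # rule 4: replace the farthest source
            worst = max(entries, key=lambda e: e[1])
            if h < worst[1]:
                worst[0], worst[1], worst[2] = src, h, uplink
```

Replaying the Fig. 3 example for node n9 yields the entries (n1, 4, n8), (n3, 2, n6), and (n5, 2, n8).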

3.3 Request Multicasting

In MCP, a requester continues to send request messages until it receives all pages of the target application. Given the target application, the requester searches the AIT for a nearby live source and constructs a REQ message as follows:

REQ = [S, H, pgNum, bv]


Fig. 4. Gradient-based request routing (R and S are the requester and source nodes, respectively; h=2, δ=1)

where S is the selected source node, H is the maximum number of hops the message may travel, and pgNum and bv identify the current working page and the requested packets within it. If the AIT records more than one source node, the requester selects the closest live source and sets H to h+δ, where h is the number of hops to S (recorded in the AIT) and δ is the hop-count slack allowed in the dissemination. Fig. 4 illustrates the involved nodes when h=2 and δ=1; the request is routed through a gradient-based region [1] to the source. A requester continues sending REQ messages if it cannot finish the page before a timeout. After several tries, it marks the source it tried to reach as an unreachable source; the number of tries varies with the distance to the source. If the AIT does not record any nearby source, the requester sets S to null, indicating that the REQ message is sent to all neighbors. After receiving a broadcast request, an idle forwarder forwards the request unless the message has travelled the maximum number of hops; an idle source node always responds with the requested packets. Since each requester sends out REQ messages independently, different requesters may work on different pages. MCP therefore allows node preemption. If a REQ message asking for page x reaches a node currently working on page y, and x + 1 < y, then the node quits its current state and switches to serve the request. If the node is a forwarder, it forwards the request; if the node is a requester or a source, it must have the requested page and responds with the requested packets. The node enters the idle state after serving the request.
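The source-selection and broadcast-fallback logic can be sketched as follows; the dictionary encoding, `unreachable` set, and broadcast hop limit are illustrative assumptions.

```python
def build_req(sources, unreachable, page, bitvec, delta=1, bcast_hops=3):
    """Build a REQ = [S, H, pgNum, bv] message (sketch).

    `sources` holds (source, hops, uplink) triples from the AIT. The
    closest source not marked unreachable is chosen, with `delta` extra
    hops of slack; if no live source is known, S is set to None to
    indicate a broadcast to all neighbors.
    """
    live = [(s, h) for s, h, _ in sources if s not in unreachable]
    if not live:
        return {"S": None, "H": bcast_hops, "pgNum": page, "bv": bitvec}
    s, h = min(live, key=lambda t: t[1])
    return {"S": s, "H": h + delta, "pgNum": page, "bv": bitvec}
```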

3.4 Caching

During code dissemination, some requesters or forwarders, while working on the current page, may overhear packets from pages with larger indices. As code pages are requested strictly in increasing order, a requester will later work on those larger-indexed pages, and a forwarder is likely to receive requests for them. To improve transmission efficiency, sensors in MCP buffer such packets in their data memory. The space that can be dedicated to caching on a wireless sensor is usually very limited. While it is possible to exploit external flash for caching, accessing external flash is slow, and writes must be performed in 256-byte blocks, which complicates the design and wastes energy. Caching on a requester is straightforward, as the sensor always caches the next several pages in addition to the current working page. However, it is slightly more



complicated on a forwarder node, as it receives requests from different requesters working on different pages and may suffer from thrashing if it takes turns serving these requests. In MCP, a forwarder gives priority to pages with smaller indices. We set a timer for the cached page and clear the page after serving a request or upon timeout.
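The forwarder's caching policy can be sketched as a one-page cache that prefers smaller page indices; the explicit `now` timestamp (instead of a real clock) and all names are illustrative.

```python
class ForwarderCache:
    """One-page forwarder cache: smaller page indices take priority, and
    a cached page is cleared after serving a request or after a timeout
    (sketch of the policy described above)."""

    def __init__(self, ttl=30.0):
        self.ttl = ttl          # seconds a cached page stays valid
        self.page_no = None
        self.packets = None
        self.expires = 0.0

    def offer(self, now, page_no, packets):
        """Cache an overheard page if the slot is free, expired, or held
        by a larger-indexed page."""
        expired = self.page_no is None or now > self.expires
        if expired or page_no < self.page_no:
            self.page_no, self.packets = page_no, packets
            self.expires = now + self.ttl

    def serve(self, page_no):
        """Return and clear the cached page if it matches the request."""
        if self.page_no == page_no:
            packets = self.packets
            self.page_no = self.packets = None
            return packets
        return None
```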

4 Experiments

4.1 Settings

We implemented MCP on the TinyOS [18] platform. For comparison, we also implemented Melete [20], and we studied various network settings using TOSSIM [9]. We simulated mesh MA-WSNs of different sizes, set the default spacing factor to 15, and modeled the lossy communication channel using the tool provided by TinyOS. There are four applications, each uniformly distributed across the network. In the default setting, 30% of the sensors have application A, and there is a request from the sink to reprogram 20% of the other sensors to run A. MCP disseminates the code from in-network sources instead of the sink.

4.2 Message Overhead

Fig. 5 shows the breakdown of the number of messages under the different dissemination protocols. Without counting advertisement messages, Melete and Deluge have about the same message overhead, as was also reported in [20]. There are a large number of ADV messages in Deluge, and a negligible number in Melete. The reason for this difference is that Deluge depends heavily on incoming ADV messages, e.g., a sensor node only sends out new requests if it receives ADV messages

Fig. 5. Message overhead (breakdown into ADV, REQ, and Data messages, in KBytes, for Deluge, Melete, and MCP on network sizes from 8×8 to 16×16)


Fig. 6. Dissemination time (seconds) for Deluge, Melete, and MCP on network sizes from 8×8 to 16×16

indicating that its neighbors have more up-to-date data. In Melete, by contrast, requesters receive the command from the sink and thus know the target application and its size; they can proactively send out more requests after a timeout or after receiving one complete page. The ADV messages contribute 37-40% of the total message overhead in Deluge. Our scheme takes a similar approach to Melete but requires some ADV messages to update the AIT before, during, and after the code switch; the ADV overhead is low compared to the request and data-transfer overhead. On average our scheme reduces Melete's message overhead by about 20%. The main reason for this reduction is that Melete tends to have multiple responders within a small range and hence a higher probability of signal collision. MCP alleviates the problem by choosing one nearby source, which reduces the number of data packets in transmission.

4.3 Completion Time

Fig. 6 compares the dissemination completion times of the different protocols. For the Deluge result, we record the time interval needed by all requesters to complete the new code download. In practice, the Deluge protocol may still proceed to flood all sensors, since it is not designed to update only a subset of them. MCP requires less time to finish dissemination; on average it saves 45% over Deluge and 25% over Melete.

4.4 Sensitivity to Node Distribution

Fig. 7 illustrates the message overhead with different numbers of sources and requesters. We omit the dissemination-time figure, which exhibits similar results. Along the X axis, (a,b) denotes that, out of all the sensor nodes, a% sources and


Fig. 7. Dissemination with different numbers of sources and requesters (message overhead in KBytes; node distributions (20,20), (30,20), (40,20), (30,10), (30,30) = (sources, requesters))

Fig. 8. Dissemination with uneven source/requester node distribution (message overhead in KBytes for the EvenD, CornerD, and SideD settings)

b% requesters are randomly selected in the field. We observed that the overhead tends to increase with more requesters and fewer sources, but the difference is not significant. Fig. 8 compares the message overhead when sources and requesters are distributed with location concentration. EvenD denotes that all nodes are evenly distributed; CornerD denotes that sources and requesters are concentrated at two diagonal corners of the rectangular field; SideD denotes that sources and requesters are distributed along two sides of the field. From the figure, Melete performs better than Deluge under the even distribution. However, it generates significant conflicts and performs worse than Deluge when the nodes are unevenly deployed. MCP gives consistently better results than Melete and Deluge.

MCP: An Energy-Efficient Code Distribution Protocol for MA WSNs

[Figure: bar chart of message overhead (KBytes) for Deluge, Melete, and MCP with N = 8, 10, 12, 14, and 16 pages]

Fig. 9. Dissemination with Different Number of Pages

[Figure: bar chart of message overhead (KBytes) for Melete and MCP with N = 1, 2, 3, and 4 cache entries]

Fig. 10. Dissemination with Different Cache Sizes

For the corner and side settings, MCP and Deluge are similar, as almost all nodes are involved in the dissemination.

4.5 Sensitivity to Application Sizes

Fig. 9 shows message overhead with different application sizes. Due to the epidemic dissemination, Deluge exhibits approximately linear message overhead when increasing the application size from 8 to 16 pages. Both Melete and MCP greatly reduce the communication overhead; however, they have slightly more than linear message overhead due to independent page requesting from requesters. MCP has a nearly constant message overhead reduction versus Melete, varying from 17.5% for 8 pages to 18.1% for 16 pages.

4.6 Sensitivity to Cache Sizes

Fig. 10 summarizes the message overhead of Melete and MCP with different cache sizes, i.e., the number of code pages that may be cached in memory. Here N=1 denotes that there is no caching. From the figure, MCP achieves a significant communication overhead reduction when caching one or two future pages, with diminishing benefits for larger cache sizes. The reason is that in MCP, a request message can preempt a working node (a source, a requester, or a forwarder) if that node works on a page with a larger page number and the page index difference is bigger than one. In this way, MCP prioritizes slow requesters such that they can keep pace with the nearby dissemination and take advantage of cached packets on neighboring sensors.
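The preemption rule described above reduces to a one-line predicate; a minimal sketch (function and parameter names are ours, not from the MCP implementation):

```python
def should_preempt(current_page: int, requested_page: int) -> bool:
    """A request preempts a working node (source, requester, or forwarder)
    when that node is serving a page with a larger page number and the
    page index gap exceeds one, prioritizing requesters that fell behind."""
    return current_page - requested_page > 1

# A slow requester asking for page 2 preempts a node busy with page 5 ...
assert should_preempt(current_page=5, requested_page=2)
# ... but not a node working on the immediately following page.
assert not should_preempt(current_page=3, requested_page=2)
```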

5 Related Work

Since sensors are left unattended after deployment, post-deployment code dissemination is an important topic in designing wireless sensor networks. Besides Deluge [5] and Melete [20], which we have discussed in the paper, many other protocols have been proposed. MNP [6] and Trickle [10] are protocols designed in TinyOS to support multihop code dissemination. MDeluge [21] uses Deluge [5] to manage code dissemination for MA-WSNs; however, it uses a fixed routing table, and is subject to network congestion and single-point failure. Infuse [7] is a TDMA-based dissemination protocol. Impala/ZebraNet [13] provides a middleware layer to support code update. These protocols propagate the desired code to all sensors in the network. Reducing the amount of data transferred during dissemination is an effective approach to reduce overhead. [16] and [4] proposed to generate and propagate a diff script instead of the complete binary code image. [12] performed update-conscious register allocation and data allocation to reduce the script size. The code size can also be reduced by disseminating modules with symbolic names and performing remote linking before execution [2], or by disseminating virtual machine primitives [11]. To defend against security attacks during code dissemination, [8] integrated digital signature and hash chaining mechanisms to ensure page-level data integrity. Since packets within one page may arrive out of order, security attacks within one page are still possible. [3] performs packet-level security checks to defend against such attacks, at the cost of enforcing a stronger dissemination order within a page.

6 Conclusions

In this paper, we propose a multicast-based code dissemination protocol, called MCP, for efficient code dissemination in MA-WSNs. In MCP, each sensor summarizes overheard information about nearby sources in a small table such that its dissemination requests can be multicast to a selected source. Compared to a design that floods requests to all neighboring sensors, MCP significantly reduces signal conflicts. Our experimental results show that MCP can reduce dissemination time by 25% and message overhead by 20% on average.

Acknowledgement. This work is supported in part by NSF under grants CCF-0641177, CNS-0720595, CCF-0811352, CCF-0811295, CNS-0720483, CCF-0702236, and CNS-0551492.

References

1. Chu, M., Haussecker, H., Zhao, F.: Scalable Information-Driven Sensor Querying and Routing for ad hoc Heterogeneous Sensor Networks. International Journal on High Performance Computing Applications 16(3), 90–110 (2002)
2. Dunkels, A., Finne, N., Eriksson, J., Voigt, T.: Run-Time Dynamic Linking for Reprogramming Wireless Sensor Networks. In: ACM International Conference on Embedded Networked Sensor Systems (SenSys), pp. 15–28 (2006)
3. Dutta, P.K., Hui, J.W., Chu, D.C., Culler, D.E.: Securing the Deluge Network Programming System. In: International Symposium on Information Processing in Sensor Networks (IPSN), pp. 326–333 (2006)
4. Jeong, J., Culler, D.E.: Incremental Network Programming for Wireless Sensors. In: IEEE Sensor and Ad Hoc Communications and Networks (SECON), pp. 25–33 (2004)
5. Hui, J.W., Culler, D.: The Dynamic Behavior of a Data Dissemination Protocol for Network Programming at Scale. In: International Conference on Embedded Networked Sensor Systems (SenSys), pp. 81–94 (2004)
6. Kulkarni, S.S., Wang, L.: MNP: Multihop Network Reprogramming Service for Sensor Networks. In: IEEE International Conference on Distributed Computing Systems (2005)
7. Kulkarni, S.S., Arumugam, M.: Infuse: A TDMA Based Data Dissemination Protocol for Sensor Networks. In: Conference on Embedded Networked Sensor Systems (2004)
8. Lanigan, P.E., Gandhi, R., Narasimhan, P.: Sluice: Secure Dissemination of Code Updates in Sensor Networks. In: The 26th Intl. Conference on Distributed Computing Systems (2006)
9. Levis, P., Lee, N., Welsh, M., Culler, D.: TOSSIM: Accurate and Scalable Simulation of Entire TinyOS Applications. In: International Conference on Embedded Networked Sensor Systems (SenSys) (2003)
10. Levis, P., Patel, N., Shenker, S., Culler, D.: Trickle: A Self-Regulating Algorithm for Code Propagation and Maintenance in Wireless Sensor Networks. In: USENIX/ACM Symposium on Networked Systems Design and Implementation (NSDI) (2004)
11. Levis, P., Culler, D.: Mate: A Tiny Virtual Machine for Sensor Networks. In: International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), pp. 85–95 (2002)
12. Li, W., Zhang, Y., Yang, J., Zheng, J.: UCC: Update-Conscious Compilation for Energy Efficiency in Wireless Sensor Networks. In: ACM Programming Languages Design and Implementation (PLDI) (2007)
13. Liu, T., Sadler, C.M., Zhang, P., Martonosi, M.: Implementing Software on Resource-Constrained Mobile Sensors: Experiences with Impala and ZebraNet. In: International Conference on Mobile Systems, Applications, and Services (2004)
14. Panta, R.K., Khalil, I., Bagchi, S.: Stream: Low Overhead Wireless Reprogramming for Sensor Networks. In: IEEE Conference on Computer Communications (Infocom) (2007)
15. Polastre, J., Szewczyk, R., Culler, D.: Telos: Enabling Ultra-Low Power Wireless Research. In: IPSN 2005, pp. 364–369 (2005)
16. Reijers, N., Langendoen, K.: Efficient Code Distribution in Wireless Sensor Networks. In: ACM Workshop on Wireless Sensor Networks and Applications (WSNA) (2003)
17. Steffan, J., Fiege, L., Cilia, M., Buchman, A.: Towards Multi-Purpose Wireless Sensor Networks. In: The 2005 Systems Communications, pp. 336–341 (2005)
18. TinyOS website, http://www.tinyos.net/
19. MICAz Wireless Measurement System, http://www.xbow.com/
20. Yu, Y., Rittle, L.J., Bhandari, V., LeBrun, J.B.: Supporting Concurrent Applications in Wireless Sensor Networks. In: International Conference on Embedded Networked Sensor Systems (SenSys), pp. 139–152 (2006)
21. Zheng, X., Sarikaya, B.: Code Dissemination in Sensor Networks with MDeluge. In: Sensor and Ad Hoc Communications and Networks (SECON), pp. 661–666 (2006)

Optimal Allocation of Time-Resources for Multihypothesis Activity-Level Detection

Gautam Thatte, Viktor Rozgic, Ming Li, Sabyasachi Ghosh, Urbashi Mitra, Shri Narayanan, Murali Annavaram, and Donna Spruijt-Metz

Ming Hsieh Department of Electrical Engineering and Keck School of Medicine, University of Southern California, Los Angeles, CA
{thatte,rozgic,mingli,sabyasag,ubli,annavara,dmetz}@usc.edu, [email protected]
Abstract. The optimal allocation of samples for activity-level detection in a wireless body area network for health-monitoring applications is considered. A wireless body area network with heterogeneous sensors is deployed in a simple star topology, with the fusion center receiving biometric samples from each of the sensors. The number of samples collected from each of the sensors is optimized to minimize the probability of misclassification between multiple hypotheses at the fusion center. Using experimental data from our pilot study, we find that equally allocating samples amongst sensors is generally suboptimal. A lower probability of error can be achieved by allocating a greater fraction of the samples to sensors which can better discriminate between certain activity levels. Since the number of samples is an integer, prior work employed an exhaustive search to determine the optimal allocation of integer samples. However, such a search is computationally expensive. To this end, an alternate continuous-valued vector optimization is derived that yields approximately optimal allocations with significantly lower complexity.

1 Introduction

Wearable health monitoring systems coupled with wireless communications are the bedrock of an emerging class of sensor networks: wireless body area networks (WBANs). Such networks have myriad applications, including diet monitoring [16], detection of activity [3,2], and health crisis support [6]. This paper focuses on the KNOWME network, which is targeted to applications in pediatric obesity, a developing health crisis both within the US and internationally. To understand, treat, and prevent childhood obesity, it is necessary to develop a multimodal system to track an individual’s level of stress, physical activity, and blood glucose, as well as other vital signs, simultaneously. Such data must also be anchorable to 

This research is supported by the National Center on Minority Health and Health Disparities (NCMHD) (supplement to P60 MD002254) and Qualcomm.

B. Krishnamachari et al. (Eds.): DCOSS 2009, LNCS 5516, pp. 273–286, 2009. © Springer-Verlag Berlin Heidelberg 2009


Fig. 1. The Nokia N95 cellphone fusion center (A), and the Alive Technologies oximeter sensor (B) and ECG sensor (C)

context, such as time of day and geographical location. The KNOWME network is a first step in the development of a system that could achieve these targets. A crucial component of the KNOWME network is the unified design and evaluation of multimodal sensing and interpretation, which allows for automatic recognition, prediction, and reasoning regarding physical activity and sociocognitive behavior states. This accomplishes the current goals of observational research in obesity and metabolic health regarding physical activity and energy expenditure (traditionally carried out through careful expert human data coding), as well as enabling new methods of analysis previously unavailable, such as incorporating data on the user's emotional state. The KNOWME network utilizes heterogeneous sensors, which send their measurements to a Nokia N95 cellphone via Bluetooth, as shown in Figure 1. The Bluetooth standard for data communication uses a "serve as available" protocol, in which all samples taken by each sensor are collected by the fusion center. Though this is beneficial from the standpoint of signal processing and activity-level detection, it requires undesirably high energy consumption: with a fully charged battery, the Nokia N95 cellphone can perform over ten hours of telephone conversation, but if the GPS receiver is activated, the battery is drained in under six hours [20]. A similar decrease in battery life occurs if Bluetooth is left on continuously. One of the aims of this paper is to devise a scheme that reduces the Bluetooth communication, thus resulting in energy savings. Our pilot study [1] suggested that data provided by certain types of sensors were more informative in distinguishing between certain activities than other types. For example, the electrocardiograph sensor was a better discriminator when the subject was lying down, while data from the accelerometer was more pertinent to distinguishing between higher-level activities.
In the present work, we exploit the advantages gained by focusing on particular sensors when selecting from a specific set of hypothesized activity states, thus providing a more energy-efficient detection mechanism. The goal of the present work is to characterize the optimal allocation of samples for heterogeneous sensors in order to minimize the probability of misclassification at the fusion center. Making optimal time-resource allocation a priority leads us to consider that sensors whose measurements are not currently being utilized can turn off their Bluetooth, resulting in a health-monitoring


application that is more energy-efficient than previous models. To achieve this goal, we consider the centralized approach adopted in [1], wherein detection is performed at the Nokia N95 fusion center, and present numerical results for the M-ary hypothesis testing problem with multiple sensors. Thus, the contribution of our work is describing the optimal allocation of samples amongst heterogeneous sensors in a WBAN for activity-level detection. Specifically, we derive a lower-complexity (see Section 5.2) continuous-valued vector optimization to minimize the probability of misclassification in the multihypothesis case. We are currently developing an energy-efficient algorithm using this optimal allocation of samples. The remainder of the paper is organized as follows: prior relevant work on activity-level detection and energy-efficient algorithms in WBANs, and its relationship to our work, is presented in Section 2. An overview of our activity-level detection system is described in Section 3. In this work, we focus on developing the time-resource allocation algorithm. Specifically, in Section 4, we describe the signal model used to develop our optimal allocation, outline the framework for minimizing the probability of misclassification, and derive a lower-complexity continuous-valued optimization problem. Numerical results based on experimental data are presented in Section 5. Finally, we draw conclusions and discuss future work directions and extensions to the optimal time-resource allocation problem in Section 6.

2 Related Work

In recent years, there have been several projects investigating activity-level detection in a variety of frameworks. Much of the work appears to center on accelerometer data alone (e.g., [3,8,10]), with some systems employing several accelerometer packages and/or gyroscopes. On the other hand, multi-sensor systems have also been implemented and deployed for activity-level detection, context-aware sensing, and specific health-monitoring applications. For example, the work of Gao et al. [6] is tailored for emergency response and triage situations, while Dabiri et al. [5] have developed a lightweight embedded system that is primarily used for patient monitoring. The system developed by Jovanov et al. [9] is used in the assistance of physical rehabilitation, and Consolvo et al.'s UbiFit system [4] is designed to promote physical activity and an active lifestyle. In these works, the emphasis is on higher-layer communication network processing and hardware design. In contrast, our work explicitly focuses on developing the statistical signal processing techniques required for activity-level detection. Several context-aware sensing systems and activity-level detection schemes have been designed using multiple accelerometers and heterogeneous sensors. However, the long-term deployment of some systems is constrained by the battery life of the individual sensors or the fusion center. The problem becomes more severe when Bluetooth, GPS measurements, and similar high-energy features and applications are part of the sensor network. The notion of designing energy-saving strategies, well-studied and implemented in the context of traditional sensor and mobile networks [11,17], has


also been incorporated into WBANs for activity-level detection. For example, the goal of the work by Benbasat et al. [2] is to determine a sampling scheme (with respect to frequency of sampling and sleeping/waking cycles) for multiple sensors to minimize power consumption. A similar scheme, which minimizes energy consumption by redistributing unused time over tasks as evenly as possible, is described in the work by Liu et al. [13]. Yan et al. [22] have investigated the Bluetooth and ZigBee protocols in WBANs, and developed an energy optimization scheme based on the tunable parameters of these protocols, e.g., connection latency. Our approach differs in that the energy-efficiency of the system is a result of optimized detection performance. In the next section, we describe our experimental setup, present the signal model used to develop our optimal time-resource allocation, and outline the optimization problem which uses the probability of error metric.

3 KNOWME Activity-Detection System Overview

The target functionality of the KNOWME network for activity-level detection, and the current system implementation, are outlined in this section. We note that our current work derives the optimal time-resource allocation algorithm for the "static" case, wherein we do not explicitly account for the current state of the subject, i.e., the optimal allocation of samples does not evolve as a function of time. Figure 2 shows an illustrative example of a state-transition diagram that is used to determine the a priori probabilities for a particular activity level. For example, if we know that the subject is standing, the transition probabilities from the Standing state to the Lying, Sitting, Walking and Standing states are given as 0.05, 0.4, 0.25 and 0.3, respectively. As is seen in Figure 2, the a priori probabilities are distinct for each state. These are incorporated into the probability of misclassification metric that we use to derive the optimal time-resource allocation algorithm in Section 4.2.

Fig. 2. Example of a state-transition diagram that may be used to determine the transition probabilities in the KNOWME activity-detection system
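The quoted transition row can be encoded directly; a sketch using only the Standing-row values given above (the remaining rows of Fig. 2 are not reproduced in the text, so only this row is shown):

```python
# Transition probabilities out of the Standing state, as quoted in the
# text for Fig. 2; the other rows of the diagram are not given there.
TRANSITIONS = {
    "Standing": {"Lying": 0.05, "Sitting": 0.4, "Walking": 0.25, "Standing": 0.3},
}

def priors(current_state):
    """A priori probabilities of the next activity level given the current
    state; these feed the misclassification metric of Section 4.2."""
    row = TRANSITIONS[current_state]
    assert abs(sum(row.values()) - 1.0) < 1e-9  # each row must be stochastic
    return row

assert priors("Standing")["Sitting"] == 0.4
```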

3.1 System Implementation

Our current implementation of the KNOWME software connects a Nokia N95 8GB phone to multiple Bluetooth-enabled body sensors which continuously collect data and relay it to the phone. The phone, which serves as the fusion center, has a dual ARM 11 332 MHz CPU and 128MB RAM, is a Bluetooth 2.0 EDR compliant Class 2 device, and runs Symbian OS 9.2. The software running on the phone is written in Python for S60 devices. The software is configured to connect to multiple Bluetooth devices; on starting the data collection, the software opens a new thread for each device specified. The data received from the sensors can be parsed and analyzed on-the-fly in the write thread on the phone to decide whether or not to disconnect or disable a sensor. The thread handling that sensor can be put in a wait state after disconnecting, and can be signalled later when it is decided that we need data from that sensor again. We note that we are still working on implementing the optimal allocation of samples; in its current form, all sensors transmit an equal number of samples to the fusion center.
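The thread-per-sensor scheme described above can be sketched with Python's standard threading primitives (all names and the pause/resume mechanism shown are our illustration, not the actual KNOWME code):

```python
import threading, queue, time

def sensor_reader(name, active, out, stop):
    """One thread per Bluetooth sensor: data is parsed on-the-fly, and the
    thread is parked in a wait state when the fusion center decides it no
    longer needs that sensor."""
    while not stop.is_set():
        active.wait()                # parked here while the sensor is disabled
        if stop.is_set():
            break
        out.put((name, "sample"))    # stand-in for a Bluetooth read + parse
        time.sleep(0.01)

samples = queue.Queue()
stop = threading.Event()
ecg_active = threading.Event()
ecg_active.set()                      # start with the ECG sensor enabled
reader = threading.Thread(target=sensor_reader,
                          args=("ecg", ecg_active, samples, stop))
reader.start()
time.sleep(0.05)
ecg_active.clear()                    # put the thread in a wait state
time.sleep(0.02)
ecg_active.set()                      # signal it when data is needed again
time.sleep(0.02)
stop.set()
ecg_active.set()                      # release a possible wait before joining
reader.join(timeout=1)
assert not samples.empty()
```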

4 Problem Formulation

In this section, we first present the signal model for our wireless body area network, which is deployed in a simple star topology, and then outline our optimization problem: we minimize the probability of misclassification at the fusion center, given samples from all the sensors. We note that obtaining the optimal allocation of samples requires an exhaustive search over the total number of samples, N, since all possible partitions of the samples between the heterogeneous sensors must be considered to find the optimal allocation. This was considered for the binary hypothesis case with two sensors in our previous work [19]. As the total number of available samples and the number of sensors increase, the combinatorial search becomes undesirably computationally expensive. To this end, we develop an analogous optimization problem which yields an approximately optimal solution, but which can be solved using lower complexity continuous-valued vector optimization techniques. The derivation of the alternate problem is outlined, and the algorithmic complexities of the two optimizations are compared; numerical simulations are presented in the following section.

4.1 Signal Model

Heterogeneous sensors are deployed in a simple star topology as shown in Figure 1. Each sensor sends its biometric samples directly to the cellphone via the Bluetooth protocol. There has been extensive research in the modeling of physiological time-series data, and the autoregressive (AR) model has been employed in a variety of contexts. Physiological hand-tremors are represented using an AR(3) model in [23], while the works in [14,15] use an AR model to estimate electroencephalogram (EEG) signals. The AR model, while not always an exact match to the data, is one of the less complicated models in use, since it allows


for a more thorough analysis and development of solutions. In our work, we use the AR(1) model to represent a biometric time-series. Furthermore, both the sensing and the communication of the measurements are assumed to be noisy given the measurement systems and wireless channels, respectively. We now propose the following signal model for the decoded and processed samples received by the fusion center:

y_i = \theta + z_i, \quad i = 1, \ldots, N_k,   (1)

for the k-th sensor, where z_i represents the independent and identically distributed (iid) zero-mean Gaussian measurement and channel noise. For a general feature/sensor A_k, \theta is a normally distributed random variable, specified as

\theta_j = \mu_{jA_k} + w_i   (2)

for hypothesis H_j, where w_i represents the biometric noise and is modeled using the AR(1) model, i.e.,

w_i = \phi w_{i-1} + \varepsilon, \quad i = 2, \ldots, N_k,   (3)

for the k-th sensor, which has been allocated N_k samples, and \varepsilon is zero-mean Gaussian with variance \sigma_{jA_k}^2. The choice of the Gaussian model is motivated by several factors. An analysis of the data collected during our pilot study [1] suggested that the Gaussian model is a relatively good fit for the biometric samples collected by the fusion center. A more complicated model with a better fit to the data may be chosen, but the Gaussian model lends itself to analysis and the development of tractable solutions. To simplify notation, we omit the hypothesis subscript j when expressions and definitions apply to a generic hypothesis. We denote the number of samples sent by the K sensors as N_1, N_2, \ldots, N_K, respectively, and impose a constraint of N total samples, i.e., N_1 + N_2 + \cdots + N_K = N, for a specific time-period. Since the features are modeled as Gaussian, and given that the AR(1) model is linear, the M-ary hypothesis test using the model in (1) is simply the generalized Gaussian problem, which is specified as

H_i : Y \sim \mathcal{N}(m_i, \Sigma_i), \quad i = 1, \ldots, M,   (4)

where m_i, \Sigma_i, i = 1, 2, \ldots, M are the mean vectors and covariance matrices of the observations under each of the M hypotheses. For completeness, we recall the density of the multivariate Gaussian:

f_X(x_1, \ldots, x_N) = \frac{1}{(2\pi)^{N/2} |\Sigma|^{1/2}} \exp\left( -\frac{1}{2} (x - \mu)^T \Sigma^{-1} (x - \mu) \right),   (5)

where \mu is the mean vector and \Sigma is the covariance matrix. Empirical data from our pilot study [1] suggests that the conditional correlation between features is relatively small; thus, for the K features A_1, A_2, \ldots, A_K from the sensors,

the mean vector and covariance matrix of the observations under hypothesis H_j, for j = 1, \ldots, M, are of the form

m_j = \begin{bmatrix} \mu_{jA_1} \\ \mu_{jA_2} \\ \vdots \\ \mu_{jA_K} \end{bmatrix} \quad \text{and} \quad \Sigma_j = \begin{bmatrix} \Sigma_j(A_1) & 0 & \cdots & 0 \\ 0 & \Sigma_j(A_2) & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \Sigma_j(A_K) \end{bmatrix},   (6)

respectively. Note that \mu_{jA_i} is an N_i \times 1 vector and \Sigma_j(A_i) is an N_i \times N_i matrix. We have assumed that the samples from different sensors are independent; this is further supported by the fact that certain sensors yield better performance when discriminating between some subsets of activities. Given the signal models in (1) and (3), for a particular feature A_k, the covariance matrix can be expressed as

\Sigma(A_k) = \frac{\sigma_{A_k}^2}{1 - \phi^2} T + \sigma_z^2 I,   (7)

where T is a Toeplitz matrix of the form

T = \begin{bmatrix} 1 & \phi & \phi^2 & \cdots & \phi^{N_k - 1} \\ \phi & 1 & \phi & \cdots & \phi^{N_k - 2} \\ \phi^2 & \phi & 1 & \cdots & \phi^{N_k - 3} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ \phi^{N_k - 1} & \phi^{N_k - 2} & \phi^{N_k - 3} & \cdots & 1 \end{bmatrix},   (8)

and I is the N_k \times N_k identity matrix. This results in the covariance matrices \Sigma_j, j = 1, \ldots, M, being block-Toeplitz matrices. To derive a vector optimization that circumvents an exhaustive search, we may replace the Toeplitz covariance matrices with their associated circulant covariance matrices, given by

\Sigma(A_k) = \frac{\sigma_{A_k}^2}{1 - \phi^2} C + \sigma_z^2 I,   (9)

where the matrix C is of the form

C = \begin{bmatrix} 1 & \phi & \phi^2 & \cdots & \phi^{N_k - 1} \\ \phi^{N_k - 1} & 1 & \phi & \cdots & \phi^{N_k - 2} \\ \phi^{N_k - 2} & \phi^{N_k - 1} & 1 & \cdots & \phi^{N_k - 3} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ \phi & \phi^2 & \phi^3 & \cdots & 1 \end{bmatrix}.   (10)

We note that the inverse of the Toeplitz covariance matrix in (7) converges to the inverse of the circulant covariance matrix in (9) in the weak sense. Sun et al. [18] derived that a sufficient condition for weak convergence is that the strong norm of the inverse matrices be uniformly bounded. We find that this is the case for the matrix forms in (7) and (9) for 0 < \phi < 1.
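The Toeplitz matrix of (7) and its circulant surrogate of (9) can be constructed directly; a small NumPy sketch with illustrative parameter values:

```python
import numpy as np

def toeplitz_cov(n, phi, sigma_a, sigma_z):
    """Sigma(A_k) of Eq. (7): (sigma_a^2 / (1 - phi^2)) T + sigma_z^2 I,
    where T_{il} = phi^{|i - l|} as in Eq. (8)."""
    idx = np.arange(n)
    T = phi ** np.abs(idx[:, None] - idx[None, :])
    return (sigma_a**2 / (1 - phi**2)) * T + sigma_z**2 * np.eye(n)

def circulant_cov(n, phi, sigma_a, sigma_z):
    """Circulant surrogate of Eq. (9): C has first row (1, phi, ..., phi^{n-1})
    and each later row is a cyclic right-shift of the one above, as in
    Eq. (10). Note that C, unlike T, is not symmetric."""
    idx = np.arange(n)
    C = phi ** ((idx[None, :] - idx[:, None]) % n)
    return (sigma_a**2 / (1 - phi**2)) * C + sigma_z**2 * np.eye(n)

n, phi = 8, 0.5
St = toeplitz_cov(n, phi, 1.0, 1.0)
Sc = circulant_cov(n, phi, 1.0, 1.0)
# The two matrices agree on the first row; they differ only where the
# circulant wraps around (phi^{(l-i) mod n} versus phi^{|l-i|}).
assert np.allclose(St[0], Sc[0])
```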

4.2 Probability of Error Derivation

We derive a closed-form approximation for the probability of error in the multihypothesis case via a union bound incorporating the Bhattacharyya coefficients between pairs of hypotheses. A result by Lianiotis [12] provides an upper bound on the probability of error, independent of the prior probabilities, given as

P(\varepsilon) \leq \sum_{i<j} (P_i P_j)^{1/2} \rho_{ij}.   (11)

Efficient Sensor Placement for Surveillance Problems

P.K. Agarwal, E. Ezra, and S.K. Ganjugunte

For a parameter ε > 0, let λ := λ(h, ε) = (h/ε)(1 + ln(1 + ς)). We prove that one can compute a set L of O(λ log λ log(1/ε)) landmarks so that if a set S of sensors covers L, then S covers at least a (1 − ε)-fraction of P. It is surprising that so few landmarks are needed, and that the number of landmarks depends only on h, and does not directly depend on the number of vertices in P. We then present efficient randomized algorithms, based on the greedy approach, that, with high probability, compute O(h̃ log λ) sensor locations to cover L; here h̃ ≤ h is the number of sensors needed to cover L. We propose various extensions of our approach, including: (i) a weight function over P is given and S should cover at least (1 − ε) of the weighted area of P, and (ii) each point of P is covered by at least t sensors, for a given parameter t ≥ 1.

1 Introduction

With the advances in sensing and communication technologies, surveillance, security, and reconnaissance in an urban environment using a limited number of sensing devices distributed over the environment and connected by a network is becoming increasingly feasible. A key challenge in this context is to determine the placement of sensors that provides high coverage at low cost, resilience to sensor failures, and tradeoffs between various resources. In this paper we present a landmark-based approach for sensor placement in a two-dimensional spatial region containing occluders (a common model for an urban environment in the context of the sensor-placement problem). We choose a small set of landmarks, show that it suffices to place sensors that cover these landmarks in order to guarantee coverage of most of the given domain, and propose simple, efficient algorithms (based on the greedy approach) for placing sensors to cover these landmarks.

Our model. We model the 2D spatial region as a polygonal region P, which may contain holes (occluders). Let n be the number of vertices in P and ς the number of holes in P. A point p ∈ P is visible from another point q if the segment pq lies inside P.
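The visibility predicate just defined can be tested with a standard segment-crossing routine; a minimal sketch (it ignores degenerate collinear and endpoint cases, and all names are ours):

```python
def ccw(a, b, c):
    """Twice the signed area of triangle abc; the sign gives orientation."""
    return (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])

def segments_cross(p, q, a, b):
    """Proper-crossing test between segments pq and ab (collinear and
    shared-endpoint cases are ignored for brevity)."""
    return (ccw(a, b, p) * ccw(a, b, q) < 0) and (ccw(p, q, a) * ccw(p, q, b) < 0)

def visible(p, q, occluder_edges):
    """p is visible from q if the segment pq is blocked by no occluder edge
    (i.e., no hole boundary of the polygonal region P)."""
    return not any(segments_cross(p, q, a, b) for a, b in occluder_edges)

# A single square hole with corners (2,0) and (3,1):
hole = [((2, 0), (2, 1)), ((2, 1), (3, 1)), ((3, 1), (3, 0)), ((3, 0), (2, 0))]
assert visible((0, 0), (1, 0.5), hole)        # segment stays left of the hole
assert not visible((0, 0.5), (4, 0.5), hole)  # segment passes through the hole
```

A finite sensing radius r is then handled by adding a distance check alongside this visibility test.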

Work on this paper was supported by NSF under grants CNS-05-40347, CFF-06-35000, and DEB-04-25465, by ARO grants W911NF-04-1-0278 and W911NF-07-1-0376, and by an NIH grant 1P50-GM-08183-01.

B. Krishnamachari et al. (Eds.): DCOSS 2009, LNCS 5516, pp. 301–314, 2009. © Springer-Verlag Berlin Heidelberg 2009


P.K. Agarwal, E. Ezra, and S.K. Ganjugunte

Let r ≥ 0 denote the sensing radius. We assume that a point x is covered by a sensor located at p if ‖xp‖ ≤ r and p is visible from x. For a point x ∈ R², we define V(x) = {q ∈ P | ‖qx‖ ≤ r, qx ⊂ P} to be the coverage (surveillance) region of the sensor located at x (this region is a portion of the so-called "visibility polygon" of x). For a set X ⊆ P, define V(X) = ⋃_{x∈X} V(x) and 𝒱(X) = {V(x) | x ∈ X}. When r is finite, we say that the model has a finite sensing radius; otherwise, the model has an infinite sensing radius. We define a function τ : P → N such that τ(p) is the coverage requirement of p, i.e., p should be covered by at least τ(p) sensors. A placement S ⊂ P is a finite set of locations where sensors are placed. For a point p ∈ P, define χ(p, S) = |{s ∈ S | p ∈ V(s)}|, i.e., the number of sensors that cover p. We say that S covers P if χ(p, S) ≥ τ(p) for every p ∈ P. (In Section 4, we mention a few extensions of our model.) The coverage problem asks for computing the smallest set S that covers P. If τ(p) = 1 for all p, we refer to the problem as the uniform coverage problem. In our model we assume that the polygon is given, as well as the occluder locations (otherwise, a hole detection algorithm can be used [3]), but the size of the smallest set S that covers P is unknown to us. The expected value of this measure, reported by our algorithm, approximates the actual measure within a factor of 2. We note that it is commonly assumed that connectivity between the sensors mostly happens when they have a line of sight between them. This is because: (i) there are models, mostly in the case of swarms of robots [29], where the communication is through infrared signals, and (ii) in an urban environment, wireless transmissions are stronger along a street (due to reflections from walls) while loss of energy occurs at turning corners. Related work.
The sensor coverage problem has been studied extensively in several research communities, including sensor networks, machine learning, computational geometry, robotics, and computer vision [11, 18, 38, 19, 35, 27]. It is beyond the scope of this paper to mention all the related work, so we briefly mention a few of them. Coverage problems are typically cast as an instance of the discrete set-cover problem, which, given a finite set S and a collection of subsets of S, seeks to find a minimum number of subsets whose union is S. This is known to be NP-hard and hard to approximate. However, a standard greedy algorithm [28] gives an O(log |S|)-approximation. This algorithm also generalizes to the set multi-cover problem with an O(log |S|)-approximation [32]. However, these algorithms do not generalize to the continuous set-cover problem. The early work in the sensor-network community on sensor coverage focused on covering an open environment (no occluders), so the problem reduced to covering P by a set of disks [38] (see also [25]). The problem is known to be NP-hard, and efficient approximation algorithms are known [9, 34]. A variant of the coverage problem, which involves communication connectivity constraints among sensors, has been studied in [7, 24, 37]. In an urban environment, one has to consider line-of-sight constraints and other interference while placing sensors. The so-called art gallery problem and its variants, widely studied in computational geometry and robotics, ask for covering P by a set of sensors so that every point in P is visible from at least one sensor. The art-gallery problem is known to be APX-hard [16], i.e., better than constant-factor approximation cannot be computed in polynomial time unless P = NP. If there is no restriction on the placements of sensors, no approximation algorithm is known. However, if the placement needs to be chosen from a finite set of candidate locations (e.g. a uniform grid),

Efficient Sensor Placement for Surveillance Problems


polynomial-time log-approximation algorithms are known; see [14, 10] and the references therein. González-Baños and Latombe [18] proposed to choose a set of random positions as candidate locations for sensors. They proved that the randomized approach leads to a good approximation algorithm under some assumptions on P; similar approaches have been widely used in the context of path planning. Efrat et al. [15] developed approximation algorithms for non-uniform coverage for τ(p) = 2, 3 and with additional constraints on coverage, under the assumption that the sensor locations are chosen from a set of grid points. Several effective heuristics to cover P are presented in [4]. The problem of selecting, from a prior dense deployment, a set of sensors that provides k-coverage, i.e., covers each point of the domain with at least k sensors (and in some cases also forms a connected network), has been studied for specific settings such as grid deployment, random uniform deployment, or specific types of sensing regions; see for example [21, 23]. Meguerdichian et al. [30] study variants of the coverage problem, namely the maximal breach (resp. support) path problem, in which, given a deployment of sensors and a set of source and destination locations, the goal is to find a path between a source and a destination so that for any point on the path the distance to the closest sensor is maximized (resp. minimized). In a related problem, studied in [31], a deployment of the sensors is given and the goal is to find a path which is the least covered; this is an instance of the so-called exposure problem. These solutions assume a deployment of sensors and try to characterize the locations for additional sensors so as to minimize breach.

Fig. 1. A set of landmarks and their respective coverage regions
However, our approach is more proactive: all the key areas of a domain to be monitored can be given suitable weights, and our algorithm returns a set of locations for the sensors that ensures that the key areas are monitored. Because of measurement errors, the dependence of coverage quality on distance, and correlation between the observations of nearby sensors, statistical models have been developed for sensor coverage, as well as entropy-based methods that try to maximize the information acquired through the sensors; see, e.g., [19]. These methods are particularly useful for monitoring spatial phenomena.

Our results. We follow a widely used approach for sensor coverage, i.e., sampling a finite set of points and then using a greedy algorithm to compute the locations of sensors, but with a twist: instead of sampling the set of candidate placements for sensors, we sample landmarks that should be covered. This twist has the advantage that, unlike previous studies, we can obtain a provable bound on the performance of the algorithm without any assumption on P. There are two main contributions of the paper. First, we prove that the number of landmarks needed is surprisingly small: it is independent of the number of vertices of P. Second, we describe simple, efficient algorithms for implementing the greedy approach; a straightforward implementation of the greedy approach is expensive and not scalable, so we describe a Monte Carlo algorithm.


P.K. Agarwal, E. Ezra, and S.K. Ganjugunte

Suppose h is the number of sensors needed for uniform coverage of P (as noted above, this value is unknown to us). For a given parameter ε > 0, set λ := λ(h, ε) = (h/ε)(1 + log(1 + ς)). We prove that one can choose a set L of m = O(λ log λ log(1/ε)) landmarks so that if S uniformly covers L, then S uniformly covers at least a (1 − ε) fraction of the area of P. We refer to this result as the sampling theorem. Next we describe algorithms for computing, with high probability, O(h̃ log λ) sensors that cover L, where h̃ is the minimum number of sensors needed to cover L; obviously h̃ ≤ h, but it can be much smaller. We show that a straightforward implementation takes O(nm²h(1 + ς) log(mn) log m) expected time. The expected running time of our algorithm can be improved to O(nm²(1 + ς) log(mn)). We then present a Monte Carlo algorithm which, albeit having the same worst-case running time as the simpler greedy algorithm, is faster in practice, as evident from our experimental results. If P is a simple region, i.e., its boundary is connected, the expected running time of the Monte Carlo algorithm reduces to O((mn + m²)h log² m log(mn)). Our overall approach is quite general and can be extended in many ways. For instance, it can incorporate other definitions of the coverage region of a sensor, such as when a sensor (e.g., a camera) can cover the area only in a cone of directions; in this case the model is called the camera model. We can extend our sampling theorem to a weighted setting: we are given a weight function w : P → R⁺ and we wish to ensure that at least a (1 − ε) fraction of the weighted area of P is covered. This extension is useful because many applications have hot spots that we wish to ensure are covered. Finally, we extend our sampling theorem to non-uniform coverage; the number of landmarks increases in this case.
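To make these bounds concrete, the landmark count of the sampling theorem can be evaluated numerically. The sketch below is ours, not the paper's: the hidden constant is unspecified in the theorem and is taken as 1, and `holes` stands for the parameter ς.

```python
import math

def landmark_bound(h, eps, holes, c=1.0):
    """Evaluate m = O(lambda * log(lambda) * log(1/eps)) with
    lambda = (h/eps) * (1 + log(1 + holes)).

    h     -- number of sensors needed for uniform coverage of P
    eps   -- fraction of the area allowed to remain uncovered
    holes -- number of holes of P (the parameter written as sigma-final)
    c     -- the unspecified constant of the theorem (1.0 for illustration)
    """
    lam = (h / eps) * (1.0 + math.log(1.0 + holes))
    m = c * lam * math.log(lam) * math.log(1.0 / eps)
    return math.ceil(m)
```

As the formula predicts, the bound grows when ε shrinks or when the number of holes grows, while remaining independent of the number of polygon vertices n.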
In Section 2 we describe our landmark based approach and prove the sampling theorem under either the finite sensing radius or the camera model. Section 3 describes the greedy algorithms. In Section 4, we briefly mention various extensions of our algorithm, including the weighted case and non-uniform coverage. In Section 5, we present experimental results with a software implementation of the Monte Carlo algorithm, showing that our approach is useful in practice.

2 Landmark Based Algorithm for Uniform Coverage

In this section we describe a landmark based algorithm for the uniform sensor-coverage problem under either the finite sensing radius or the camera model. We begin by reviewing a few concepts from random sampling and statistical learning theory [33]. Let X be a (possibly infinite) set of objects, also referred to as the "space", let μ be a measure function on X, and let R be a (possibly infinite) family of subsets of X, called the "ranges". The pair Σ = (X, R) is called a range space. A finite subset Y ⊆ X is shattered by Σ if {r ∩ Y | r ∈ R} = 2^Y, i.e., the restriction of R to Y can realize all subsets of Y. The VC-dimension of Σ, denoted by VC-dim(Σ), is defined to be the largest size of a subset of X shattered by Σ; the VC-dimension is infinite if the maximum does not exist. Many natural range spaces have finite VC-dimension. In our context, let 𝒱 = {V(x) | x ∈ P}, and let V = (P, 𝒱) be the range space. That is, the space is the polygon P and the ranges are all the coverage regions of the sensors


located in P. When the sensing radius is infinite, the VC-dimension of V is known to be O(1 + log(1 + ς)) [36]. For a given ε > 0, a subset N ⊆ X is called an ε-net of Σ if r ∩ N ≠ ∅ for all r ∈ R such that μ(r) ≥ εμ(X). In other words, an ε-net is a hitting set for all the "heavy" ranges. A seminal result by Haussler and Welzl [22] shows that if VC-dim(Σ) = d, then a random subset N ⊆ X of size O((d/ε) log(d/εδ)) is an ε-net of Σ with probability at least 1 − δ. This bound was later improved to O((d/ε) log(1/ε)) by Komlós, Pach, and Woeginger [26], who showed that a random sample of that size is an ε-net with constant probability.¹ We now mention a useful property of the VC-dimension: if Σ1 = (X, R1), ..., Σk = (X, Rk) are k range spaces and VC-dim(Σi) ≤ d for all i ≤ k, then the VC-dimension of the range space Σ = (X, R), where R = {r1 ∩ · · · ∩ rk | r1 ∈ R1, ..., rk ∈ Rk}, is O(kd). Returning to our model, when the sensing radius r is finite, or each sensor can cover an area that is contained in a cone of directions, we define a new range space: the space remains P, and the ranges are modified as follows. In the first case, each range is the intersection of a coverage region V(x) (of unbounded radius), x ∈ P, and a disc centered at x and having radius r. In the camera case, each range is the intersection of V(x) and a cone whose apex is at x (and whose opening angle corresponds to the camera "vision range"). It is well known that the range space VD = (P, D), where D is the set of all disks of radius r with centers in P, has constant VC-dimension (see, e.g., [33]). Using the properties stated above, it follows that the VC-dimension of the range space corresponding to the finite sensing radius model is O(1 + log(1 + ς)). For the camera model, it can be shown in an analogous manner that the VC-dimension of the corresponding range space is O(1 + log(1 + ς)) as well, by observing that the VC-dimension of the range space VC = (P, C) is bounded by a constant, where C is the set of all cones whose apex lies in P. We now state and prove the main result of this section.

Theorem 1 (Sampling Theorem).
Let P be a polygonal region in R² with ς holes, and let ε > 0 be a parameter. Suppose P can be covered by h sensors, each of which has a finite sensing radius r. Set λ := λ(h, ε) = (h/ε)(1 + ln(1 + ς)). Let L ⊂ P be a random subset of c1 λ ln λ ln(1/ε) points (landmarks) in P, and let S be a set of at most c2 h ln λ sensors that cover L, where c1 ≥ 1, c2 ≥ 1 are sufficiently large constants. Then S covers at least a (1 − ε) fraction of the area of P with constant probability. The same asymptotic bounds hold for the camera model. If P is a simple polygon, then |L| = c1 (h/ε) ln(h/ε) ln(1/ε) and |S| ≤ c2 h ln(h/ε).

Proof. We prove the theorem for the finite sensing radius model; the proof for the camera model proceeds almost verbatim. Set k = c2 h ln λ. Let 𝒱̄^k = {P \ (V(x1) ∪ · · · ∪ V(xk)) | x1, ..., xk ∈ P}, that is, each range in 𝒱̄^k is the complement of the union of (at most) k coverage regions. Set V̄^k = (P, 𝒱̄^k). Since VC-dim(V) = O(1 + log(1 + ς)), the above discussion implies that

¹ In fact, the actual bound stated in [26] is slightly tighter, and the authors prove the existence of an ε-net of that size. We use the more conservative bound O((d/ε) log(1/ε)), since the analysis in [26] shows that a random sample of that size yields an ε-net with constant probability.


VC-dim(V̄^k) = O(k(1 + log(1 + ς))), as complementation does not change the VC-dimension. Therefore, if we choose a random subset L ⊂ P of size O((k/ε)(1 + log(1 + ς)) log(1/ε)), then L is an ε-net of V̄^k with constant probability. Let S be a set satisfying the assumptions in the lemma, that is, S is a set of at most k sensors that cover L, i.e., L ⊂ V(S). Then, clearly, L ∩ (P \ V(S)) = ∅. By the definition of an ε-net, μ(P \ V(S)) ≤ εμ(P). Hence, S covers at least a (1 − ε) fraction of the area of P. This completes the proof of the theorem. □

The Sampling Theorem suggests the following simple approach, summarized by the procedure PlaceSensors(P, ε), to compute a placement for P with an error threshold ε. Given a set L of landmarks, the procedure GreedyCover(P, L), described in the next section, computes a cover for L of size O(h̃ log |L|) (under either of the two above models), where h̃ ≤ h is the number of sensors needed to cover L. Since h (or h̃) is unknown to us, we apply an exponential search in order to approximate that value up to a factor of 2 (see the procedure PlaceSensors(P, ε) below).

PlaceSensors(P, ε)
Input: Polygonal region P, error threshold ε
Output: A placement of sensors covering ≥ (1 − ε)μ(P)
1. i := 1
2. repeat
3.   h := 2^i, d := c2 (1 + ln(1 + ς)), k := c2 h ln(dh/ε), m := (c1 dk/ε) ln(1/ε)
4.   L := m random points in P
5.   S := GreedyCover(P, L)   // L ⊂ V(S)
6.   i := i + 1
7. until (|S| ≤ k) ∧ (μ(V(S)) ≥ (1 − ε)μ(P))

Fig. 2. A landmark based algorithm for covering P
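The pseudocode of Fig. 2 can be transcribed directly. The sketch below is a minimal rendering in which every geometric primitive (point sampling, GreedyCover, area computation) is a hypothetical callable supplied by the caller, and the unspecified constants c1, c2 are taken as 1.

```python
import math

def place_sensors(P, eps, holes, sample_point, greedy_cover,
                  covered_area, total_area, c1=1.0, c2=1.0):
    """Sketch of PlaceSensors(P, eps) from Fig. 2, under assumed callables:
      sample_point(P)     -- one uniform random point in P
      greedy_cover(P, L)  -- a sensor placement S covering every landmark in L
      covered_area(P, S)  -- mu(V(S));  total_area(P) -- mu(P)
    """
    i = 1
    while True:
        h = 2 ** i                                   # exponential search on h
        d = c2 * (1.0 + math.log(1.0 + holes))
        k = c2 * h * math.log(d * h / eps)
        m = int(math.ceil((c1 * d * k / eps) * math.log(1.0 / eps)))
        L = [sample_point(P) for _ in range(m)]      # the landmarks
        S = greedy_cover(P, L)                       # L subset of V(S)
        if len(S) <= k and covered_area(P, S) >= (1.0 - eps) * total_area(P):
            return S
        i += 1
```

The loop terminates once the guessed value of h is large enough that the greedy cover is small and covers a (1 − ε) fraction of the area, mirroring the stopping condition on line 7 of Fig. 2.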

Remark: We note that the actual number of sensor placements that the algorithm computes is an approximation to the optimal number of sensors that cover L (and not necessarily the entire polygon P ), which may be much smaller than h.

3 The Greedy Algorithm

In this section we describe the procedure GreedyCover(P, L), which computes a placement of sensors to cover a set L ⊂ P of m landmarks under each of the two models defined above. We first describe a simple algorithm, which is a standard greedy algorithm, and then discuss how to expedite it. For a point x ∈ P and a finite subset N ⊂ P, we define the depth of x with respect to N, denoted by Δ(x, N), to be |V(x) ∩ N|, the number of points of N that lie in the coverage region of x. Set Δ(N) = max_{x∈P} Δ(x, N).


Simple algorithm. The simple procedure computes the placement S incrementally. At the beginning of the i-th step we have a subset Li ⊆ L of mi landmarks and a partial cover S; S covers L \ Li. Initially, L1 = L and S = ∅. At the i-th step we place a sensor at a location zi that covers the maximum number of landmarks in Li, i.e., Δ(zi, Li) = Δ(Li). We add zi to S, compute L'i = V(zi) ∩ Li, and set Li+1 = Li \ L'i. The procedure stops when Li = ∅. A standard analysis for the greedy set-cover algorithm shows that the algorithm computes O(h̃ ln |L|) locations, where h̃ is the number of sensors needed to cover L; see, e.g., [28]. The only nontrivial step in the above algorithm is the computation of zi. Note that Δ(x, Li) = |{p ∈ Li | x ∈ V(p)}|, i.e., the number of coverage regions in 𝒱(Li) that contain x. We compute the overlay A(Li) of the coverage regions in 𝒱(Li), which is a planar map; see Figure 1. It is known that A(Li) has O(mi² n(1 + ς)) vertices, edges, and faces, and that A(Li) can be computed in time O(mi² n(1 + ς) log n). Next, we compute the depth of every vertex of A(Li) with respect to Li, within the same time bound. We then set zi to be the deepest vertex of A(Li). The overall running time of the GreedyCover procedure is O(m²n(1 + ς)|S| log n). We note that it is unnecessary to compute A(Li) from scratch in each step. Instead, we can compute A(L) at the beginning and maintain A(Li) and the depth of each of its vertices using a dynamic data structure; see [1]. The asymptotic running time then becomes O(m²n(1 + ς) log² n), which is faster if |S| ≥ log n.

A Monte Carlo algorithm. We observe that it suffices to choose a point x at the i-th step such that Δ(x, Li) ≥ Δ(Li)/2; it can be shown that the number of iterations then is still O(h̃ ln |L|) [1].
Using this observation, we can expedite the algorithm when Δ(Li) is large, as follows: we choose a random subset Ki ⊂ Li of smaller size (inversely proportional to Δ(Li)) and set zi to be a deepest point with respect to Ki. If the size of Ki is chosen appropriately, then Δ(zi, Li) ≥ Δ(Li)/2 with high probability. More precisely, set Δ = Δ(Li), and let q ≥ Δ/4 be an integer. We choose a random subset Ki ⊆ Li by choosing each point of Li with probability ρ = c1 ln(mi)/q, where c1 > 0 is a constant. Using the arguments in [2, 5], we can show that with probability at least 1 − 1/mi the following two conditions hold: (i) Δ ≥ q implies Δ(Ki) ≥ 3qρ/2 = (3c1/2) ln mi, and (ii) Δ ≤ q implies Δ(Ki) ≤ 5qρ/4 = (5c1/4) ln mi. An exponential search procedure that sets q = mi/2^j at the j-th step, chooses a subset Ki as described above, computes a deepest point xi in A(Ki), and uses the above two conditions to determine whether to return xi or to halve the value of q, can compute a point zi of depth at least Δ(Li)/2 with probability at least 1 − 1/mi. The expected running time of this procedure is O(n(mi/Δ)²(1 + ς) log n log² m). This procedure computes an overlay of mi/Δ coverage regions, compared with mi regions in the previous procedure. Note that Δ(L) may be small, in which case the running time of this procedure is the same as that of the previous one. It is, however, faster in practice, as shown in Section 5. A crucial property that this approach relies on is the fact that the complexity of the boundary of the union of the coverage regions (that is, the vertices and edges of the overlay that are not contained in the interior of any region) is significantly smaller than the complexity of the entire overlay. Specifically, Gewali et al. [17] showed that, if the boundary of P is connected and the sensing radius r is infinite, the overall number of


vertices and edges that appear along the boundary of the union of the regions in 𝒱(L) is only O(mn + m²). In the full version of this paper we prove a similar asymptotic bound for the case where r is finite, or when the coverage region of each sensor is bounded by a cone of directions. In this case, the running time of the overall GreedyCover procedure becomes O(((m/Δ)n + (m/Δ)² log m)|S| log² n), where Δ = Δ(L). Furthermore, by using a combination of the previous and the Monte Carlo procedures, the running time can be further improved for this special case. This, however, does not necessarily hold for a polygon with holes, in which case the union complexity is Ω(m²nς) in the worst case; we provide the lower bound construction in the full paper.

Remark: The greedy algorithm for covering L can be replaced by the method of Brönnimann and Goodrich [8], improving the approximation factor to O(log h), but with an increase in the running time; see [1].
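Over a finite set of candidate locations, the simple procedure of this section reduces to the classical greedy set cover. The sketch below stands in for the overlay-based computation of the deepest point; the finite `candidates` set and the `covers` predicate are assumptions for illustration, not part of the paper's continuum algorithm.

```python
def greedy_cover(landmarks, candidates, covers):
    """Greedy cover of a landmark set from a finite set of candidate
    sensor locations. covers(z, p) is True iff a sensor at z covers
    landmark p, i.e. p lies in V(z)."""
    remaining = set(landmarks)
    placement = []
    while remaining:
        # place a sensor covering the maximum number of uncovered landmarks,
        # i.e. a (discrete) deepest point with respect to the remaining set
        z = max(candidates, key=lambda c: sum(covers(c, p) for p in remaining))
        hit = {p for p in remaining if covers(z, p)}
        if not hit:
            raise ValueError("candidate set cannot cover all landmarks")
        placement.append(z)
        remaining -= hit
    return placement
```

Replacing `candidates` by the vertices of the overlay A(Li) recovers the exact procedure; subsampling the landmarks before the `max` step gives the flavor of the Monte Carlo variant.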

4 Extensions

We briefly describe the following extensions and omit any further details in this version.

Weighted coverage. We are now given a weight function w : P → R⁺. The weighted area of P is defined to be μw(P) = ∫_{x∈P} w(x) dx, and is normalized to be 1. The weighted area of a coverage region V(l), for a location l ∈ P, is μw(V(l)) = ∫_{x∈V(l)} w(x) dx. The goal is to find a placement S of sensors such that μw(V(S)) ≥ (1 − ε)μw(P). This extension is useful in practice as it enables us to highlight "hot spots" in the environment. Since the random sampling theory is applicable to the weighted setting, the above algorithm and its analysis extend to this case verbatim.

Multiple coverage. In order to be resilient to sensor failures, we now require each point p ∈ P to be covered by at least t distinct sensors, for a given parameter t ≥ 1. The definition of the range space becomes more complex: for a set X ⊂ P of k points, we now define V̄t(X) ⊆ V(X) to be the set of points whose depth is less than t. Set 𝒱̄t^k = {V̄t(X) | X ⊂ P is a set of k points} and V̄t^k = (P, 𝒱̄t^k). We prove that VC-dim(V̄t^k) = O(kt(1 + log(1 + ς)) log(kt(1 + log(1 + ς)))), which in turn leads to a bound on the number of landmarks. Next, we extend the greedy algorithm to find a placement of sensors so that each landmark is covered by at least t sensors.
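For the weighted extension, the landmarks must be drawn with density proportional to w rather than uniformly. The paper does not spell out a sampler; a standard rejection-sampling sketch, with hypothetical callables for the uniform sampler and the weight function, could look as follows.

```python
import random

def weighted_landmarks(m, sample_point, weight, w_max):
    """Draw m landmarks from P with density proportional to the weight
    function w, by rejection sampling (a standard technique, not spelled
    out in the paper).

    sample_point() -- one uniform random point of P (assumed callable)
    weight(x)      -- the weight w(x);  w_max -- an upper bound on w
    """
    pts = []
    while len(pts) < m:
        x = sample_point()
        # accept x with probability w(x)/w_max
        if random.random() * w_max <= weight(x):
            pts.append(x)
    return pts
```

With the landmarks drawn this way, the greedy cover itself is unchanged, which is why the analysis carries over verbatim.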

5 Experiments

In this section we present experimental results showing that our approach is also useful in practice. We have implemented both the simple and the Monte Carlo algorithms for the uniform-coverage problem under the finite sensing radius model.

Implementation. The algorithm presented in Section 2 can be simplified as described in Figure 3; this has the advantage that we do not have to estimate the constants c1, c2, and d, and in practice we need to choose far fewer landmarks. Our experimental results indicate that this implementation performs well in practice; see below.


PlaceSensors(P, ε)
Input: Polygonal region P, error threshold ε
Output: A placement of sensors covering ≥ (1 − ε)μ(P)
1. m := 16
2. repeat
3.   L := m random points in P
4.   S := GreedyCover(P, L)   // L ⊂ V(S)
5.   m := 2m
6. until μ(V(S)) ≥ (1 − ε)μ(P)

Fig. 3. The implementation of the landmark based algorithm
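The doubling loop of Fig. 3 is straightforward to transcribe. In the sketch below, the geometric primitives are hypothetical caller-supplied callables; only the control flow follows the pseudocode.

```python
def place_sensors_doubling(P, eps, sample_point, greedy_cover,
                           covered_area, total_area):
    """Sketch of the simplified implementation (Fig. 3): keep doubling the
    number of landmarks until the computed placement covers a (1 - eps)
    fraction of the polygon area.

    sample_point(P), greedy_cover(P, L), covered_area(P, S), total_area(P)
    are assumed callables, as in the paper's pseudocode.
    """
    m = 16
    while True:
        L = [sample_point(P) for _ in range(m)]
        S = greedy_cover(P, L)            # L subset of V(S)
        if covered_area(P, S) >= (1.0 - eps) * total_area(P):
            return S
        m *= 2                            # double the landmark count and retry
```

No constants or ε-net bounds appear here; the loop simply retries with twice as many landmarks whenever the coverage check fails, which is what keeps the landmark counts in Table 1 small.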


Fig. 4. (a) A quadtree representing a polygon. The dots are the landmarks. (b) The (averaged) number of sensors as a function of ε for each of the four data sets.

Recall that GreedyCover requires computing the overlay of the coverage regions of the landmarks. Computing the exact overlay is, however, complicated, inefficient in practice, and suffers from robustness issues; see [20] for a detailed review of the various difficulties that arise in practice. For simplicity, we thus resort to computing the coverage regions and their overlays approximately, using quadtrees (see, e.g., [12]). Since we rely on a landmark based approach and we allow a small fraction of P not to be covered by the sensors, we believe this approximation is reasonable. Given the input polygon P, we recursively partition the plane into four quadrants, each of which contains a portion of the boundary of P (in fact, this portion contains at most half of the vertices of P). The recursion bottoms out when we reach a cell that (i) contains at most a single polygon vertex, (ii) meets at most two edges of the polygon, and (iii) has an area that is sufficiently small with respect to the polygon area. The cells generated at the bottom of the recursion induce an approximate partition of the polygon into axis-parallel rectangles, which we refer to as pixels; see Figure 4(a). This approximate representation improves as we refine the decomposition. Having the pixels at hand, we represent each of the coverage regions induced by the landmarks as the corresponding subset of these pixels. In fact, this representation is maintained implicitly by computing, for each pixel C, the set of the landmarks that are visible from C, and then setting the depth of C to be


Fig. 5. The data sets. (a) Bouncy. (b) x-monotone. (c) Orthogonal. (d) General.


Fig. 6. The (averaged) accumulated covered area (y-axis) at each iteration (x-axis) for the Exact Greedy and the Monte Carlo algorithms, for the bouncy and x-monotone data sets


Fig. 7. The (averaged) accumulated covered area (y-axis) at each iteration (x-axis) for the Exact Greedy and the Monte Carlo algorithms, for the orthogonal and general data sets

the number of these landmarks. When we apply the greedy algorithm, we update this data accordingly. When the algorithm terminates, the depth of each pixel C becomes 0. Technically, the notion of visibility between a landmark s and a pixel C is not well defined. In our implementation, we report that s sees C if s sees at least a fixed set of sample points in C ∩ P.
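This pixel visibility test can be sketched with standard segment-intersection predicates. The reading that s must see every point of the fixed sample set is our interpretation of "at least a fixed set of sample points", and the explicit edge list stands in for the polygon and occluder boundaries.

```python
def segments_cross(p, q, a, b):
    """True iff segments pq and ab properly cross (shared endpoints and
    collinear overlaps are treated as non-blocking in this sketch)."""
    def orient(u, v, w):
        d = (v[0] - u[0]) * (w[1] - u[1]) - (v[1] - u[1]) * (w[0] - u[0])
        return (d > 0) - (d < 0)
    return (orient(p, q, a) != orient(p, q, b)
            and orient(a, b, p) != orient(a, b, q))

def sees(s, x, boundary_edges):
    """s sees x iff the segment sx crosses no polygon/occluder edge."""
    return not any(segments_cross(s, x, a, b) for a, b in boundary_edges)

def sees_pixel(s, samples, boundary_edges):
    """Report that s sees the pixel iff s sees every fixed sample point
    of the pixel (our reading of the test described in the text)."""
    return all(sees(s, x, boundary_edges) for x in samples)
```

The cost of one such call is linear in the number of edges, which is consistent with the visibility tests dominating the running time in our measurements.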


Input sets. We next describe our data sets. Our input consists of both random polygons and manually produced polygons, which are interesting as well as challenging in the context of our problem. For the random polygons, we use software developed by Auer and Held [6] for generating random simple polygons, and for the manually produced polygons, we use the inputs of Amit et al. [4]. Our data sets are as follows:
– Bouncy (Figure 5(a)): A random bouncy polygon with 512 vertices.
– x-monotone (Figure 5(b)): A random x-monotone polygon with 512 vertices.
– Orthogonal (Figure 5(c)): An orthogonal polygon with 20 vertices and 4 holes, each of which has 6 (additional) vertices.
– General (Figure 5(d)): A general polygon with 16 vertices and 4 holes with 16 (additional) vertices overall.
The orthogonal and general data sets represent urban environments. The first two data sets (bouncy and x-monotone) allow us to test the worst-case behavior of the algorithm, where the number of vertices is relatively large and the coverage regions have long and skinny spikes, a property that follows from the structure of these polygons. Each of these polygons is bounded by a square with a side length of 600 units, and thus the area of each polygon is bounded from above by 36 · 10^4. For the orthogonal and general data sets we choose a sensing radius of 100, whereas for the bouncy and x-monotone data sets we choose a radius of 10^4 (to reduce the influence of the sensing radius and let the coverage regions be long and skinny).

Results. We present experimental results from applying both the simple greedy algorithm presented in Section 3 (which we also refer to as the Exact Greedy algorithm) and the Monte Carlo algorithm to each of our data sets. We measure various parameters, each of which is a yardstick for measuring and comparing the performance of the algorithms.
Given a fixed ε > 0, these parameters are as follows: (i) The number of visibility tests that each algorithm performs; in each of these tests we need to report whether a landmark l sees a pixel C. The overwhelming majority of the running time is determined by the overall number of calls to this procedure, as, in our experiments, the preprocessing time for constructing the quadtree is negligible compared with the overall running time of the various visibility tests. (ii) The number of landmarks chosen by the algorithm. (iii) The number of iterations required in order to obtain a coverage of (1 − ε) of the polygon; as stated in Section 3, at each iteration we choose a single sensor placement, and thus this measure corresponds to the overall number of sensor placements chosen by the algorithm. (iv) The convergence ratio of the algorithm; at each iteration i we measure the ratio between the overall area covered so far (over iterations 1, ..., i) and the entire polygon area. For each specific input set, we ran each algorithm ten times, and the results reported below are averages over each of the above parameters. In all our experiments, we set ε = 0.05. In all our experiments the Monte Carlo algorithm performs better than the Exact Greedy algorithm. Specifically, the expected number of visibility tests performed by the Monte Carlo algorithm is smaller (and sometimes significantly smaller) than that of the Exact Greedy algorithm. This is because in the Monte Carlo algorithm we compute the overlay of a relatively small subset of the coverage regions. In our implementation


Table 1. The (averaged) number of landmarks m and visibility tests τ for the Exact Greedy and the Monte Carlo algorithms, for each of the data sets

              Exact Greedy         Monte Carlo
Data set        m        τ           m        τ
Bouncy         499   848162          83   288905
x-monotone     243   450259         108   202714
Orthogonal     256  6505115          56    95969
General         76   147326          44   117926

this implies that, in order to find a pixel of maximum depth, we perform our visibility tests only with respect to the (random) subset of landmarks chosen by the Monte Carlo algorithm; see Table 1. The number of landmarks in each of our experiments is small, as expected. Moreover, due to our implementation, the number of landmarks m is determined directly by an exponential search (Figure 3) and not according to the actual number of sensor placements and ε (Figure 2), which assigns a relatively large initial value to m corresponding to the worst-case bound on the ε-net size. Interestingly, our experiments indicate that the number of landmarks is also smaller under the Monte Carlo algorithm; see once again Table 1. This improvement is perhaps due to the second sampling step that the Monte Carlo algorithm performs over the set of landmarks in order to locate a point that (approximately) covers the largest number of landmarks. In this case, it is likely that the resulting sensor placements form a better deployment than that achieved by the Exact Greedy algorithm, but a more detailed study is needed to understand the behavior of the two algorithms. The number of iterations of the Monte Carlo algorithm cannot be smaller than that of the Exact Greedy algorithm, by definition (see Section 3). Nevertheless, our experiments show (see Figures 6 and 7) that these values are roughly similar for all our data sets. Moreover, in all our experiments, the number of iterations of the Monte Carlo algorithm is less than 1.5 times that of the Exact Greedy algorithm. The convergence for both types of random polygons is rapid: as indicated by our experiments, in the very first iterations both algorithms manage to cover most of the polygon.
This does not necessarily hold for the polygons with holes (Figure 7), since it is difficult to find a single sensor that covers most of the polygon (according to our findings, this is also the case when the sensing radius is significantly larger), and thus any algorithm that aims to cover these polygons has to use a relatively large number of sensors; see once again Figure 5(c)–(d). In addition to the results reported above, we have tested the dependence of the number of sensor placements on ε. As one may expect, the number of sensors decreases as the error parameter ε increases. Moreover, in our implementation (Figure 3) we double the number of landmarks m each time we fail to cover (1 − ε) of the polygon; thus, as the number of landmarks increases exponentially (when ε tends to 0), it is likely that the number of sensors needed to cover them grows super-linearly. In Figure 4(b) we present these results for each of the data sets under the Monte Carlo algorithm. As indicated by our experiments, the number of sensors decreases super-linearly as ε grows.

Efficient Sensor Placement for Surveillance Problems

313

6 Concluding Remarks

In this paper we presented a landmark based approach to placing a set of sensors. The main contribution of the paper is that a small number of landmarks suffices to guide the placement of sensors. We are currently addressing several important issues that are not included in this paper: we are developing an algorithm to add a few relay nodes so that the sensors can communicate with each other (see also [13]), and we are extending our algorithms to cover 3D environments.

References

1. Agarwal, P.K., Chen, D.Z., Ganjugunte, S.K., Misiołek, E., Sharir, M.: Stabbing convex polygons with a segment or a polygon. In: Proc. 16th European Symp. Algorithms, pp. 52–63 (2008)
2. Agarwal, P.K., Hagerup, T., Ray, R., Sharir, M., Smid, M.H.M., Welzl, E.: Translating a planar object to maximize point containment. In: Proc. 10th European Symp. Algorithms, pp. 42–53 (2002)
3. Ahmed, N., Kanhere, S.S., Jha, S.: The holes problem in wireless sensor networks: a survey. ACM SIGMOBILE Mobile Comput. and Commun. Review 9(2), 4–18 (2005)
4. Amit, Y., Mitchell, J.S.B., Packer, E.: Locating guards for visibility coverage of polygons. In: Proc. Workshop Alg. Eng. Exp., pp. 120–134 (2007)
5. Aronov, B., Har-Peled, S.: On approximating the depth and related problems. In: Proc. 16th Annu. ACM-SIAM Symp. on Disc. Alg., pp. 886–894 (2005)
6. Auer, T., Held, M.: Heuristics for the generation of random polygons. In: Proc. 8th Canad. Conf. Comput. Geom., pp. 38–44. Carleton University Press (1996)
7. Bai, X., Kumar, S., Yun, Z., Xuan, D., Lai, T.H.: Deploying wireless sensors to achieve both coverage and connectivity. In: Proc. 7th ACM MobiHoc, pp. 131–142 (2006)
8. Brönnimann, H., Goodrich, M.: Almost optimal set covers in finite VC-dimension. Discrete Comput. Geom. 14, 463–479 (1995)
9. Chakrabarty, K., Iyengar, S.S., Qi, H., Cho, E.: Grid coverage for surveillance and target location in distributed sensor networks. IEEE Trans. Computers 51(12), 1448–1453 (2002)
10. Cheong, O., Efrat, A., Har-Peled, S.: Finding a guard that sees most and a shop that sells most. Discrete Comput. Geom. 37(4), 545–563 (2007)
11. Dhillon, S., Chakrabarty, K.: Sensor placement for effective coverage and surveillance in distributed sensor networks. In: Proc. IEEE Wireless Communications Network. Conf., pp. 1609–1614 (2003)
12. de Berg, M., van Kreveld, M., Overmars, M., Schwarzkopf, O.: Computational Geometry: Algorithms and Applications. Springer, Berlin (2000)
13. Efrat, A., Fekete, S.P., Gaddehosur, P.R., Mitchell, J.S.B., Polishchuk, V., Suomela, J.: Improved approximation algorithms for relay placement. In: Proc. 16th European Symp. Algorithms, pp. 356–357 (2008)
14. Efrat, A., Har-Peled, S.: Guarding galleries and terrains. In: 2nd IFIP Internat. Conf. Theo. Comp. Sci., pp. 181–192 (2002)
15. Efrat, A., Har-Peled, S., Mitchell, J.S.B.: Approximation algorithms for two optimal location problems in sensor networks. In: 2nd Internat. Conf. on Broadband Networks, vol. 1, pp. 714–723 (2005)
16. Eidenbenz, S., Stamm, C., Widmayer, P.: Inapproximability results for guarding polygons and terrains. Algorithmica 31(1), 79–113 (2001)

314

P.K. Agarwal, E. Ezra, and S.K. Ganjugunte

17. Gewali, L., Meng, A., Mitchell, J.S.B., Ntafos, S.: Path planning in 0/1/∞ weighted regions with applications. ORSA J. Computing 2, 253–272 (1990)
18. González-Baños, H.H., Latombe, J.C.: A randomized art-gallery algorithm for sensor placement. In: Proc. 16th ACM Symp. Comput. Geom., pp. 232–240 (2000)
19. Guestrin, C., Krause, A., Singh, A.P.: Near-optimal sensor placements in Gaussian processes. In: Proc. 22nd Int. Conf. Machine Learn., pp. 265–272 (2005)
20. Halperin, D.: Robust geometric computing in motion. International Journal of Robotics Research 21(3), 219–232 (2002)
21. Hefeeda, M., Bagheri, M.: Randomized k-coverage algorithms for dense sensor networks. In: Proc. 26th IEEE Intl. Conf. on Comp. Commn., pp. 2376–2380 (2007)
22. Haussler, D., Welzl, E.: ε-nets and simplex range queries. Discrete Comput. Geom. 2, 127–151 (1987)
23. Huang, C.F., Tseng, Y.C.: The coverage problem in a wireless sensor network. In: Proc. ACM Workshop Wireless Sensor Networks Appl. (WSNA), pp. 115–121 (2003)
24. Iyengar, R., Kar, K., Banerjee, S.: Low-coordination topologies for redundancy in sensor networks. In: Proc. 6th ACM MobiHoc, pp. 332–342 (2005)
25. Kershner, R.: The number of circles covering a set. American Journal of Mathematics 61(3), 665–671 (1939)
26. Komlós, J., Pach, J., Woeginger, G.: Almost tight bounds for epsilon nets. Discrete Comput. Geom. 7, 163–173 (1992)
27. Latombe, J.-C.: Robot Motion Planning. Kluwer Academic Publishers, Boston (1991)
28. Lovász, L.: On the ratio of optimal integral and fractional covers. Discrete Mathematics 13, 383–390 (1975)
29. McLurkin, J., Smith, J., Frankel, J., Sotkowitz, D., Blau, D., Schmidt, B.: Speaking swarmish: Human-robot interface design for large swarms of autonomous mobile robots. In: Proc. AAAI Spring Symp. (2006)
30. Meguerdichian, S., Koushanfar, F., Potkonjak, M., Srivastava, M.B.: Coverage problems in wireless ad-hoc sensor networks. In: Proc. 20th Annu. Joint Conf. IEEE Comput. Commun. Soc., pp. 1380–1387 (2001)
31. Meguerdichian, S., Koushanfar, F., Qu, G., Potkonjak, M.: Exposure in wireless ad-hoc sensor networks. In: Proc. 7th Annu. Intl. Conf. on Mobile Computing and Networking, pp. 139–150 (2001)
32. Rajagopalan, S., Vazirani, V.V.: Primal-dual RNC approximation algorithms for set cover and covering integer programs. SIAM J. Comput. 28(2), 525–540 (1999)
33. Pach, J., Agarwal, P.K.: Combinatorial Geometry. Wiley-Interscience, San Diego (1995)
34. Tian, D., Georganas, N.D.: A coverage-preserving node scheduling scheme for large wireless sensor networks. In: Proc. ACM Workshop Wireless Sens. Nets. Appl. (WSNA), pp. 32–41 (2002)
35. Urrutia, J.: Art gallery and illumination problems. In: Sack, J., Urrutia, J. (eds.) Handbook of Computational Geometry, pp. 973–1027. Elsevier, Amsterdam (2000)
36. Valtr, P.: Guarding galleries where no point sees a small area. Israel J. Math. 104, 1–16 (1998)
37. Wang, Y.C., Hu, C.C., Tseng, Y.C.: Efficient deployment algorithms for ensuring coverage and connectivity of wireless sensor networks. In: Proc. IEEE Wireless Internet Conference (WICON), pp. 114–121 (2005)
38. Zhao, F., Guibas, L.J.: Wireless Sensor Networks: An Information Processing Approach. Morgan Kaufmann, San Francisco (2004)

Local Construction of Spanners in the 3-D Space

Iyad A. Kanj¹, Ge Xia², and Fenghui Zhang³

¹ School of CTI, DePaul University, 243 S. Wabash Avenue, Chicago, IL 60604, USA. [email protected]
² Department of Computer Science, Lafayette College, Easton, PA 18042, USA. [email protected]
³ Google Seattle, 651 N. 34th Street, Seattle, WA 98103, USA. [email protected]

Abstract. In this paper we present local distributed algorithms for constructing spanners in wireless sensor networks modeled as unit ball graphs (UBGs, for short) and quasi-unit ball graphs (quasi-UBGs) in the 3-dimensional Euclidean space. Our first contribution is a local distributed algorithm that, given a UBG U and a parameter α < π/3, constructs a sparse spanner of U with stretch factor 1/(1 − 2 sin(α/2)), improving the previous upper bound of 1/(1 − α) by Althöfer et al., which is applicable only when α < 1/(1 + 2√2) < π/3. The second contribution of this paper is the first local distributed algorithm for the construction of bounded-degree lightweight spanners of UBGs and quasi-UBGs. The simulation results we obtained show that, empirically, the weight of the spanners, the stretch factor, and the locality of the algorithms are much better than the theoretical upper bounds proved in this paper.

1 Introduction

In this paper we consider some fundamental topology control problems that are essential for communication in wireless sensor networks. Two devices in such networks can, in principle, communicate if they are in each other's transmission range. When studying these networks, it is natural to embed them in a Euclidean metric space. A common simple embedding assumes that the space is two-dimensional and that the transmission range of all devices is the same. In that case, the network is modeled as a Unit Disk Graph, abbreviated UDG, defined as the subgraph of the Euclidean graph (the complete graph on the same set of points) consisting of edges of length at most 1 unit. In practice, however, the UDG model may be too idealistic for wireless sensor networks. The quasi unit disk graph, abbreviated quasi-UDG henceforth, has been proposed to capture the non-uniform characteristics of wireless sensor networks [2]. Let 0 ≤ r ≤ 1 be a constant. A quasi unit disk graph with parameter r in the Euclidean plane is defined as follows: for any two points X and Y in the graph, XY is an edge if |XY| ≤ r, and XY is not an edge in the graph if |XY| > 1. If r < |XY| ≤ 1, then XY may or may not be an edge in the graph; it is usually assumed that an adversary decides the placement of such edges.

B. Krishnamachari et al. (Eds.): DCOSS 2009, LNCS 5516, pp. 315–328, 2009. © Springer-Verlag Berlin Heidelberg 2009


For an integer d ≥ 3, the Unit Ball Graph, abbreviated UBG, is the straightforward generalization of the UDG to the d-dimensional Euclidean space: two points in a UBG are connected if and only if their Euclidean distance in the d-dimensional Euclidean space is at most 1 unit. Similarly, the d-dimensional quasi Unit Ball Graph, abbreviated quasi-UBG henceforth, is the straightforward generalization of a quasi-UDG. A distributed algorithm is said to be k-local [14,16] if, intuitively, the computation at each point of the graph depends solely on the initial states of the points at distance at most k from the point (i.e., within k hops from the point). More formally, a distributed algorithm is k-local if it can be simulated to run in at most k synchronous communication rounds, for some integer parameter k > 0. An algorithm is called local if it is k-local for some integer constant k. Efficient local distributed algorithms are naturally fault-tolerant, robust, and scalable. Therefore, it is natural to seek such algorithms, especially for solving problems in wireless sensor networks. In this paper we consider the construction of sparse spanners and lightweight sparse spanners by local distributed algorithms in wireless sensor networks modeled as UBGs and quasi-UBGs in the 3-dimensional Euclidean space. A spanner of a UBG (or quasi-UBG) G is a subgraph H of G such that for every two points A, B ∈ G, the weight of a shortest path between A and B in H is at most ρ times the weight of a shortest path between A and B in G, where ρ is a positive constant called the stretch factor of H (with respect to G). Since we are concerned with geometric spanners in this paper, the weight of an edge AB is defined to be the Euclidean distance between points A and B, that is, wt(AB) = |AB|, and the weight of H, denoted wt(H), is the sum of the weights of all edges in H. It is easy to see that a connected UBG U on a point set S contains a Euclidean Minimum Spanning Tree (EMST) of S.
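The quasi-UBG edge rule described above (always an edge below distance r, never above 1, adversary-decided in between) can be sketched as a small centralized membership test. This is an illustrative sketch only; the function name and the representation of the adversary as a set of gray-zone index pairs are our own assumptions, not the paper's.

```python
import math

def quasi_ubg_edges(points, r, adversary=frozenset()):
    """Classify point pairs under the quasi-UBG rule with parameter r.

    Pairs at distance <= r are always edges, pairs at distance > 1 never
    are, and pairs in the gray zone (r, 1] are edges only if the
    (hypothetical) adversary set contains the index pair.
    """
    edges = set()
    n = len(points)
    for i in range(n):
        for j in range(i + 1, n):
            d = math.dist(points[i], points[j])
            if d <= r:
                edges.add((i, j))          # mandatory edge
            elif d <= 1.0 and (i, j) in adversary:
                edges.add((i, j))          # adversary chose to include it
    return edges
```

Setting r = 1 makes the gray zone empty and recovers the plain UBG.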
A spanning subgraph of U is said to have low weight, or to be lightweight, if its weight is at most c · wt(EMST) for some constant c. Spanners and lightweight spanners are fundamental to communication in wireless sensor networks because they represent topologies that can be used for efficient unicasting and/or broadcasting. We restrict our attention to the 3-dimensional Euclidean space, both for practical and illustrative purposes, even though our techniques can be generalized in a straightforward manner to higher-dimensional Euclidean spaces. So far, and to the best of our knowledge, no local distributed algorithm has been developed for the construction of sparse spanners and lightweight sparse spanners for UBGs and quasi-UBGs in the 3-dimensional Euclidean space. Li, Song, and Wang [13] presented a local algorithm that constructs a planar, bounded-degree, lightweight power spanner of a UDG. Although the algorithm in [13] can be extended to construct power spanners of UBGs, their techniques do not apply to geometric spanners. In power spanners the weight of an edge AB is assumed to be |AB|^β, where |AB| is the Euclidean distance between A and B, and β is a real constant between 2 and 5. As a consequence, in power spanners it is possible to find a path between two points A and B whose total weight is smaller than the weight of the straight-line edge AB, and this is impossible


in a geometric spanner. Damian, Pandit, and Pemmaraju [6] gave a distributed algorithm that constructs a spanner of a quasi-UBG of bounded degree, light weight, and arbitrarily small stretch factor. The distributed algorithm in [6] runs in a poly-logarithmic number of rounds, and hence is not local. Althöfer et al. [1] showed that, for any d-dimensional (d ≥ 3) space and any parameter α < 1/(1 + 2√2), there exists a sparse spanner of any Euclidean graph with stretch factor 1/(1 − α). However, although the number of edges in the spanner is linear, no explicit upper bound on this number was given. Moreover, the result in [1] does not present an algorithm for computing such a sparse spanner. We note that most of the earlier work on the construction of sparse spanners of Euclidean graphs in higher-dimensional spaces was done from the computational geometry perspective, and under the centralized model of computation (see [1,3,8,12], to name a few). Adapting the techniques used in this early work to the local model is a challenge, because most of these techniques rely on a centralized greedy algorithm that requires sorting the edges of the graph. The result in [6] attempted to make this algorithm local using clustering techniques, but only managed to obtain a distributed algorithm with a poly-logarithmic number of rounds. The first contribution of this paper is in presenting the first local distributed algorithms for constructing sparse spanners of UBGs and quasi-UBGs. In particular, we present a local distributed algorithm that, given a UBG U and a parameter α < π/3, constructs a sparse spanner of U with stretch factor 1/(1 − 2 sin(α/2)), improving the previous upper bound of 1/(1 − α) by Althöfer et al. [1] (note that the Euclidean graph is a special case of the UBG). Moreover, whereas the upper bound on the average degree of the spanner in [1] is asymptotic, we give an explicit upper bound on the average degree of the spanner constructed by our algorithm.
For example, for α = π/4, our results imply a local distributed algorithm that constructs a sparse spanner of a UBG with stretch factor 4.3 and average degree at most 45. The second contribution of this paper is in presenting the first local distributed algorithm for constructing bounded-degree lightweight spanners of UBGs and quasi-UBGs. For example, we present a local distributed algorithm that, given parameters α < π/3 and λ > 1, constructs a lightweight bounded-degree spanner of a given UBG with stretch factor 3λ^8/(1 − 2 sin(α/2)). These results imply a local distributed algorithm for constructing bounded-degree lightweight spanners of a UBG with stretch factor 4.6. We performed extensive simulations to study the performance of our algorithms empirically. The simulation results we obtained show that, empirically, the upper bounds on the stretch factor of the constructed spanners and the locality of the algorithms are much better than the theoretical upper bounds proved in this paper. This suggests that the algorithms perform very well in practice. Each point in the local distributed algorithms presented in this paper starts by collecting the IDs and coordinates of its k-hop neighbors, for some fixed k; it performs only local computations afterwards. For a fixed k, it was shown in [10] that the k-hop neighborhoods of the points in a UBG or a quasi-UBG U


can be computed by a local distributed algorithm in which the total number of messages sent is O(n), where n = |V(U)|. Therefore, the message complexity of the k-local distributed algorithms in this paper is O(n). Most of the techniques presented in this paper can be generalized in a straightforward manner to higher-dimensional Euclidean spaces.
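The k-hop collection step that every point performs can be viewed, in a centralized simulation, as a depth-limited breadth-first search: after k synchronous rounds a point knows exactly the nodes within k hops of it. The following is a minimal sketch (the function name is ours); the message-efficient distributed realization is the one from [10].

```python
from collections import deque

def k_hop_neighbors(adj, source, k):
    """Return the set of nodes within k hops of `source`.

    `adj` maps a node id to the list of its direct neighbors; this is
    what a point would learn after k synchronous communication rounds.
    """
    depth = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        if depth[u] == k:
            continue                      # do not expand past k hops
        for v in adj[u]:
            if v not in depth:
                depth[v] = depth[u] + 1
                queue.append(v)
    depth.pop(source)                     # the point itself is excluded
    return set(depth)
```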

2 Sparse Spanners

In this section we present local distributed algorithms for constructing sparse spanners of UBGs and quasi-UBGs in the 3-D Euclidean space. We start by treating the case where U is a UBG, and then indicate how to handle the more general case where U is a quasi-UBG.

2.1 Spanners for UBGs

Our approach for constructing spanners locally is based on the Yao subgraphs [18]. Given an angle α < π/3, the Yao subgraph [18] of a graph G embedded in the plane is constructed as follows. At every point M in G, place 2π/α equally-separated rays out of M (arbitrarily oriented), thus creating 2π/α closed cones of angle α each. Then, the shortest edge in G out of M (if any) in each cone is added to the Yao subgraph of G. Whereas Yao subgraphs can be constructed easily in the Euclidean plane, their construction in higher-dimensional Euclidean spaces is far from easy. Yao, in his seminal work [18], described how a set of edges, called a frame, incident on a point can be computed so that the angle between any other edge incident on the point and some edge from this set is bounded by a pre-specified constant. However, even though the size of this set (i.e., the frame) is a constant that depends on the dimension of the underlying Euclidean space, no specific upper bound on this constant was given, and the construction of this set uses very sophisticated techniques, such as the first barycentric subdivision of a simplex. Althöfer et al. [1] also used Yao subgraphs to show the existence of sparse spanners of Euclidean graphs in higher-dimensional spaces. In [1], it was shown that, for any d-dimensional (d > 2) space and any parameter α < 1/(1 + 2√2), for every point M there exists a constant-size set of cones with apex at M and angle α, such that the subgraph of the Euclidean graph E consisting of the shortest edge in each of these cones is a sparse spanner of E with stretch factor 1/(1 − α). However, even though the number of edges in the spanner is linear, no explicit upper bound on this number was given. Moreover, the results in [1] prove the existence of such a set of cones but do not specify how the cones can be constructed, which again makes this result of a theoretical nature rather than a practical one.
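As an illustration of the planar construction just described, the following sketch builds the Yao subgraph of a unit disk graph: each point partitions the directions around it into cones of angle α and keeps only the shortest incident edge per cone. The cone indexing by angle and the function name are our own choices, not the paper's.

```python
import math

def yao_subgraph(points, alpha):
    """Planar Yao construction over the UDG on `points` (2-D tuples).

    At each point, split the plane into cones of angle alpha and keep
    the shortest UDG edge (length <= 1) leaving the point in each cone.
    """
    kept = set()
    for i, p in enumerate(points):
        best = {}  # cone index -> (distance, neighbor index)
        for j, q in enumerate(points):
            if i == j:
                continue
            d = math.dist(p, q)
            if d > 1.0:
                continue  # not an edge of the unit disk graph
            theta = math.atan2(q[1] - p[1], q[0] - p[0]) % (2 * math.pi)
            cone = int(theta // alpha)
            if cone not in best or d < best[cone][0]:
                best[cone] = (d, j)
        for d, j in best.values():
            kept.add((min(i, j), max(i, j)))   # union over both endpoints
    return kept
```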
We shall improve on the aforementioned results in several respects, including the upper bounds on the angle, the stretch factor, and the cardinality of the set of cones. We start by showing how a set of edges incident on a point in U can be constructed easily so that every other edge incident on the point is within a


pre-specified angle α from one of the edges of this set.¹ Let α < π/3 be a given angle, and let M be a point in U. Let S_M be the sphere centered at M and of radius 1, and note that all neighbors of M in U lie inside S_M. An approach in 3-D analogous to the construction of the Yao subgraph in the plane would be to cover the ball inside S_M with 3-D cones whose apices are at M, and then to choose the shortest edge incident on M in every cone. However, a (perfect) covering of the ball with cones is in general not possible/feasible. To overcome this problem, we instead utilize an infinite class of Fullerene graphs² that are embedded in the sphere S_M. A Fullerene graph is a 3-regular simple planar graph with hexagon or pentagon faces, and the number of pentagon faces in a Fullerene graph is always 12. Let h be the number of hexagon faces in a Fullerene graph. We have the following fact.

Fact 1 ([15]). For any natural number h ≠ 1, there is a Fullerene graph with 12 pentagon faces and h hexagon faces.

Efficient algorithms that run in linear time [4] exist for generating and representing Fullerene graphs, and for embedding them in the sphere such that all the faces are regular (i.e., all faces are equiangular and equilateral). Let us call such an embedding a regular embedding. Unless otherwise indicated, we will assume that the embeddings discussed in this section are regular.

Lemma 1. Let G be a Fullerene graph embedded in the sphere S_M. Let l be the length of the edges in G, and let h be the number of hexagons in G. Then h ≤ 4.837 l^(−2) − 0.662.

Proof. Let A_{6,l} be the surface area enclosed by a regular hexagon with side length l embedded in S_M. Since A_{6,l} is greater than the area of the same hexagon in the plane, we have A_{6,l} > 3√3 l²/2 > 2.598 l². Similarly, the surface area enclosed by a regular pentagon with side length l embedded in S_M is A_{5,l} > 5 tan(3π/10) l²/4 > 1.72 l². Since h · A_{6,l} + 12 · A_{5,l} = 4π, we have 2.598 h l² + 1.72 l² < 4π, and hence h ≤ (4π − 1.72 l²)/(2.598 l²) ≤ 4.837 l^(−2) − 0.662.

Consider the set S of pyramids formed by joining the center M of the sphere to the vertices of every face of the Fullerene graph G embedded in the sphere S_M. The set S partitions S_M. By a simple geometric argument, it is easy to verify that for any two points A and B in the same pyramid in S, the angle ∠AMB is at most 2 · arcsin(l). Therefore, by setting l = sin(α/2), we can ensure that ∠AMB ≤ α, and the number of pyramids is h + 12 ≤ 4.837 (sin(α/2))^(−2) − 0.662 + 12 ≤ 4.837 (sin(α/2))^(−2) + 11.338.

¹ Note that the value of α should be properly selected to balance the sparseness and the stretch factor of the spanner.
² Fullerene graphs generalize the famous carbon-60 "buckyball" that was first discovered in [11]. The computation of Fullerene graphs has been studied in the context of computational chemistry, chemical modeling, and graph theory, but to the best of our knowledge, no prior work has considered them as a generalization of Yao subgraphs.
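To make the bounds above concrete, the following small calculator (the function name is ours) evaluates the number of cones h + 12 and the stretch factor 1/(1 − 2 sin(α/2)) for a given α. It reproduces the figures quoted later in the paper: for α = π/6, at most 84 cones and stretch about 2.1; for α = π/4, at most 45 cones and stretch about 4.3.

```python
import math

def spanner_parameters(alpha):
    """Cone count and stretch factor for the 3-D Yao-style spanner.

    Follows Lemma 1 with l = sin(alpha/2): the number of pyramids is
    h + 12 <= 4.837 * l^(-2) + 11.338, and the stretch factor of the
    resulting spanner is 1 / (1 - 2 * sin(alpha/2)).
    """
    assert 0 < alpha < math.pi / 3
    l = math.sin(alpha / 2)
    cones = 4.837 / (l * l) + 11.338      # upper bound on h + 12
    stretch = 1.0 / (1.0 - 2.0 * l)
    return math.ceil(cones), stretch
```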


Lemma 2. Let α < π/3 be given. Then for every point M in the UBG U, there exists a set of edges E_M incident on M with cardinality |E_M| ≤ 4.837 (sin(α/2))^(−2) + 11.338, such that for every edge MN incident on M in U, there exists an edge MN′ incident on M in E_M satisfying |MN′| ≤ |MN| and ∠NMN′ ≤ α. Moreover, the set E_M is computable by point M in O(n) time, where n = |V(U)|.

Proof. The set of edges E_M is computed as follows. We first compute an embedded Fullerene graph G in which each edge has length sin(α/2). For each face R in G, the maximum angle in the pyramid with apex M and base R is bounded by α. In each nonempty pyramid, the shortest edge incident on M is added to E_M. It is clear from the previous discussion that the set E_M satisfies the condition that for every edge MN incident on M in U, there exists an edge MN′ incident on M in E_M with |MN′| ≤ |MN| and ∠NMN′ ≤ α. The cardinality of E_M is at most the total number of pyramids, or equivalently the total number of faces in G, which is |E_M| ≤ h + 12 ≤ 4.837 (sin(α/2))^(−2) + 11.338, as desired. Finally, the set E_M can be computed in O(n) time by point M, as follows. First, an embedded Fullerene graph can be computed in O(n) time [4]. Then the at most n − 1 edges incident on M can be mapped to the pyramids they lie in, in O(n) time. Finally, in each pyramid the shortest edge can be chosen in time proportional to the number of edges within the pyramid, and hence in O(n) time in total.

Given a UBG U and an angle α < π/3, each point M ∈ U performs the following distributed algorithm, referred to as 3-D Spanner:

(i) M sends its coordinates to all its neighbors in U;
(ii) M computes the set E_M as described above and selects the edges in E_M; M notifies every point N such that MN ∈ E_M about its selection of the edge MN;
(iii) Upon receiving a message from a neighbor N, M decides the status of the edge MN as follows: MN ∈ G if and only if MN has been chosen by either M or N; if the status of the edge MX has been determined for every neighbor X of M in U, then M finishes processing.

Theorem 2. For every angle α < π/3, the algorithm 3-D Spanner is a 3-local distributed algorithm that constructs a sparse spanner G of U with stretch factor ρ = 1/(1 − 2 sin(α/2)). Moreover, the algorithm runs in O(n) local time (at each point in U).

Proof. It is clear that the algorithm 3-D Spanner runs in 3 rounds, and hence is a 3-local distributed algorithm. Since for every M in U the cardinality of E_M is a constant, given in Lemma 2, that depends only on the angle α, and hence is independent of the point M, and since an edge MN is in G if and only if it is selected by M or by N, it follows that the total number of edges in G is bounded by c · n, where the constant c is the upper bound on |E_M| given in Lemma 2. This shows that G is sparse.


To show that G is a spanner of U with the desired stretch factor, it suffices to show that for any edge MN in U, there is a path from M to N in G of weight at most ρ|MN|. Note that this will also establish the connectivity of G. For a pair of distinct points (M, N) in U, define the rank of the pair (M, N) to be the number of pairs (M′, N′), where M′, N′ ∈ U, such that |M′N′| < |MN|. Since there are O(n²) pairs of points in U, and hence O(n²) ranks, we can proceed by induction on the rank of the pair (M, N). The base case is when the rank of (M, N) is 0, that is, when M and N are a closest pair of points in U. In this case the edge MN must be picked in G, and hence there is a path from M to N in G of stretch factor 1. To see why the previous statement is true, suppose that edge MN were not picked in G. Then there would exist an edge MN′ incident on M in G such that |MN′| ≤ |MN| and ∠N′MN ≤ α. Since the rank of (M, N) is 0, MN is a shortest edge in U, and we have |MN′| = |MN|. Consider the isosceles triangle NMN′. Since α < π/3, |NN′| < |MN|, which implies that (N, N′) is a pair of points of smaller rank than (M, N), contradicting the choice of (M, N). Let i be a positive integer, and assume that the statement is true for any pair of points with rank < i. Let (M, N) be a pair of points of rank i such that MN ∈ U but MN ∉ G (otherwise we are done). Then there must exist an edge MN′ ∈ G incident on M such that |MN′| ≤ |MN| and ∠NMN′ ≤ α. Since |MN′| ≤ |MN| and α < π/3, we have |NN′| < |MN|, and the rank of (N, N′) is smaller than the rank of (M, N). By the inductive hypothesis, it follows that there is a path P_{NN′} from N to N′ in G such that wt(P_{NN′}) ≤ ρ|NN′|. Now the path P_{MN} in G from M to N consisting of the edge MN′ followed by P_{NN′} satisfies wt(P_{MN}) ≤ |MN′| + ρ|NN′|. Let β = ∠NMN′ and γ = ∠MNN′. Since ∠NMN′ ≤ α and |MN′| ≤ |MN|, it follows that 0 ≤ β ≤ α and 0 ≤ γ ≤ (π − β)/2. Consider the triangle NMN′; by the law of sines we have |MN′|/(|MN| − |NN′|) = sin γ/(sin(β + γ) − sin β) = 1/(cos β + sin β · (cos γ − 1)/sin γ) = 1/(cos β − sin β tan(γ/2)) ≤ 1/(cos β − sin β tan((π − β)/4)). The last inequality is true because 0 ≤ γ ≤ (π − β)/2. Simplifying the last term by trigonometric identities, we have |MN′|/(|MN| − |NN′|) ≤ 1/(cos β − sin β tan((π − β)/4)) = 1/(1 − 2 sin(β/2)). Since 0 ≤ β ≤ α, we have |MN′|/(|MN| − |NN′|) ≤ 1/(1 − 2 sin(α/2)) = ρ. Because |MN| − |NN′| > 0, this implies that |MN′| ≤ ρ|MN| − ρ|NN′|, and hence wt(P_{MN}) ≤ |MN′| + ρ|NN′| ≤ ρ|MN|. The induction is complete.
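The property that the proof of Theorem 2 relies on is that every edge MN has a kept edge MN′ with |MN′| ≤ |MN| and ∠NMN′ ≤ α. The following centralized sketch enforces exactly that invariant by scanning each point's UBG neighbors in order of increasing distance, without constructing the Fullerene partition; it illustrates the invariant only and is not the paper's actual E_M construction (all names are ours).

```python
import math

def _angle(p, a, b):
    """Angle at p in the 3-D triangle a-p-b."""
    u = tuple(a[i] - p[i] for i in range(3))
    v = tuple(b[i] - p[i] for i in range(3))
    dot = sum(x * y for x, y in zip(u, v))
    nu = math.sqrt(sum(x * x for x in u))
    nv = math.sqrt(sum(x * x for x in v))
    return math.acos(max(-1.0, min(1.0, dot / (nu * nv))))

def spanner_3d(points, alpha):
    """Per-point greedy selection: keep an edge unless a kept shorter
    edge already lies within angle alpha of it, then take the union of
    the edges selected at both endpoints (as 3-D Spanner does)."""
    kept = set()
    n = len(points)
    for i in range(n):
        nbrs = [j for j in range(n)
                if j != i and math.dist(points[i], points[j]) <= 1.0]
        nbrs.sort(key=lambda j: math.dist(points[i], points[j]))
        chosen = []
        for j in nbrs:
            if all(_angle(points[i], points[j], points[c]) > alpha
                   for c in chosen):
                chosen.append(j)
        for j in chosen:
            kept.add((min(i, j), max(i, j)))
    return kept
```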



From Theorem 2, and by setting α = π/6, it follows that the algorithm 3-D Spanner is a 3-local distributed algorithm that constructs a sparse spanner of the UBG U with stretch factor 2.1 and average degree at most 84. As another example, by setting α = π/4, 3-D Spanner is a 3-local algorithm that constructs a sparse spanner of the UBG U with stretch factor 4.3 and average degree at most 45.

2.2 Spanners of Quasi-UBGs

Let Ur , where 0 < r ≤ 1, be a quasi-UBG, and let α < π/3 be given. The results in the previous subsection can be used to devise a local distributed algorithm that constructs a sparse spanner G of Ur .


The approach is exactly the same as that in [5] for the construction of sparse spanners of quasi-UDGs. For that reason, and due to the lack of space, we only state the result here; its proof is very similar to that of Theorem 5 in [5].

Theorem 3. Let U_r be a connected quasi-UBG on n points with parameter 0 < r ≤ 1. For any α < π/3, there is a 3-local distributed algorithm that constructs a sparse spanner G of U_r with stretch factor 3/(1 − 2 sin(α/2)).

From the above theorem, and by setting α = π/6, it follows that there exists a 3-local distributed algorithm that constructs a sparse spanner of a quasi-UBG U with stretch factor 6.3.

3 Lightweight Spanners

In this section we present local distributed algorithms for the construction of lightweight bounded-degree spanners of UBGs and quasi-UBGs.

3.1 Lightweight Bounded-Degree Spanners of UBGs

We first use the 3-local distributed algorithm outlined in Subsection 2.1 to construct a sparse spanner G′ of U with stretch factor ρ = 1/(1 − 2 sin(α/2)), where α < π/3 is a constant. Of course, G′ may not be of light weight, and we need to remove edges from G′ to make it lightweight, while not affecting the stretch factor by much. We start with the following lemma, which follows in a straightforward manner from Lemmas 2.2, 2.3, and 2.4 in [7].

Lemma 3 ([7]). Let E′ be a subgraph of the Euclidean graph E, and let λ > 1 be a constant.³ Suppose that for every cycle C in E′ and every edge AB ∈ C: wt(C) > (λ + 1) · wt(AB). Then E′ is lightweight.

³ Note that increasing the value of λ would increase the weight of E′, although it remains lightweight.

Assume that we have an orthonormal coordinate system with axes x′x, y′y, and z′z. By the orthogonal projection of a cycle C on the x′x (resp. y′y, z′z) axis we mean the interval on the x′x (resp. y′y, z′z) axis consisting of the x-coordinates (resp. y-coordinates, z-coordinates) of the points on C. The following simple fact can be easily verified.

Fact 4. Let C be a cycle of weight at most ℓ. Then the orthogonal projection of C on any of the three axes has length at most ℓ/2.

Let T_I be the translation of vector (0, 0, 0) (the identity translation), T_x the translation of vector (ℓ/2, 0, 0), T_y the translation of vector (0, ℓ/2, 0), T_z the translation of vector (0, 0, ℓ/2), T_xy the translation of vector (ℓ/2, ℓ/2, 0),


T_xz the translation of vector (ℓ/2, 0, ℓ/2), T_yz the translation of vector (0, ℓ/2, ℓ/2), and T_xyz the translation of vector (ℓ/2, ℓ/2, ℓ/2). Let T = {T_I, T_x, T_y, T_z, T_xy, T_xz, T_yz, T_xyz}. Impose an infinite cubic grid G on the 3-D Euclidean space whose cells are ℓ × ℓ × ℓ cubes, for some positive constant ℓ to be determined later. We start with the following simple lemma, whose proof is easy to verify.

Lemma 4. Let C be any cycle of weight at most ℓ. There exists a translation T in T such that the translation of C, T(C), resides in a single cell of the cubic grid G.

Even though a cycle of weight at most ℓ may not reside within a single cell of G, Lemma 4 shows that by applying an appropriate translation T ∈ T, the translation of C under T will reside in a single cell. For each translation T ∈ T, the points in G′ whose translations under T reside in a single cell will form a separate cluster. Then, these points will coordinate the detection and removal of the low-weight cycles residing in the cluster by applying a centralized algorithm to the cluster. Since the clusters do not overlap, and since each cluster works as a centralized unit, this keeps the stretch factor under control, while ensuring the removal of every low-weight cycle. The centralized algorithm that we apply to each cluster is the standard greedy algorithm that has been extensively used (see [1] for example) to compute lightweight spanners. Given a graph H and a parameter μ > 1, this greedy algorithm sorts the edges in H in nondecreasing order of their weight, and adds these edges to an initially empty graph in the sorted order. The algorithm adds an edge AB to the growing graph if and only if no path between A and B of weight at most μ · wt(AB) exists in the growing graph. We will call this algorithm Centralized Greedy. The following facts about this greedy algorithm are known:

Fact 5. Let H be a subgraph of the Euclidean graph E, and let μ > 1 be a constant. Let H′ be the subgraph of H constructed by the algorithm Centralized Greedy when applied to H with parameter μ. Then: (i) H′ is a spanner of H with stretch factor μ. (ii) For any cycle C in H′ and any edge e on C, wt(C) > (1 + μ) · wt(e). (iii) H′ has bounded degree.

We now present the local distributed algorithm formally and prove that it constructs the desired lightweight spanner. We first need the following lemma, whose proof uses a folklore packing argument (see, for example, the proof of Lemma 1 in [17] for the same argument applied to UDGs) and a celebrated sphere-packing density result by Hales [9].

Lemma 5. Let C0 be a cell in G, and let U_{C0} be the subgraph of U induced by all the points of U residing in cell C0. If A and B are two points in the same connected component of U_{C0}, then A and B are ⌈2√2 · (ℓ + 1)³⌉-hop neighbors in U (i.e., A and B are at most ⌈2√2 · (ℓ + 1)³⌉ hops away from one another in U).
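The Centralized Greedy procedure described above admits a direct sketch: process edges in nondecreasing order of weight and add an edge only if the graph built so far has no sufficiently short path between its endpoints. The limit-pruned Dijkstra helper is our own implementation choice, not part of the paper.

```python
import heapq
import math

def centralized_greedy(points, edges, mu):
    """Greedy lightweight spanner: add (a, b) only if no a-b path of
    weight <= mu * wt(ab) already exists in the growing graph."""
    adj = {i: [] for i in range(len(points))}

    def dist_limited(src, dst, limit):
        # Dijkstra that abandons paths whose weight would exceed `limit`.
        best = {src: 0.0}
        heap = [(0.0, src)]
        while heap:
            d, u = heapq.heappop(heap)
            if u == dst:
                return d
            if d > best.get(u, math.inf):
                continue  # stale heap entry
            for v, w in adj[u]:
                nd = d + w
                if nd <= limit and nd < best.get(v, math.inf):
                    best[v] = nd
                    heapq.heappush(heap, (nd, v))
        return math.inf

    kept = []
    for a, b in sorted(edges,
                       key=lambda e: math.dist(points[e[0]], points[e[1]])):
        w = math.dist(points[a], points[b])
        if dist_limited(a, b, mu * w) > mu * w:
            adj[a].append((b, w))
            adj[b].append((a, w))
            kept.append((a, b))
    return kept
```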


The input to the algorithm is a sparse spanner G′ of U with stretch factor ρ = 1/(1 − 2 sin(α/2)), where α < π/3, constructed as described in the previous section. We set ℓ = λ + 1 in the above cubic grid G, where λ > 1 is a chosen constant. We assume that each point in U has computed its k-hop neighbors in U, where k = ⌈2√2 · (λ + 2)³⌉. By Lemma 5, this ensures that every point knows all the points in its connected component residing with it in the same cell under any translation. After that, for every round j ∈ {I, x, y, z, xy, xz, yz, xyz}, each point p ∈ U executes the following algorithm Local-LightSpanner:

(i) p applies the translation T_j to compute its virtual coordinates under T_j; suppose that the translation of p under T_j, T_j(p), resides in cell C0 of the grid G;
(ii) p determines the set S_j(p) of all the points in the resulting subgraph of G′ (prior to round j) whose translations under T_j reside in the same connected component as T_j(p) in cell C0;
(iii) p applies the algorithm Centralized Greedy, with parameter μ = ℓ − 1 = λ, to the subgraph H_j(p) of the resulting graph of G′ induced by S_j(p); if p decides to remove an edge (p, q) from H_j(p), then p removes (p, q) from its adjacency list in G′;

Let G be the subgraph of G′ consisting of the set of edges remaining in G′ after each point p applies the algorithm Local-LightSpanner. The following lemma is similar to Lemma 4 and is easy to verify.

Lemma 6. Let p be a point in U, and let q be a neighbor of p in U. Let ℓ > 2 be a constant. There exists a translation T in {T_I, T_x, T_y, T_z, T_xy, T_xz, T_yz, T_xyz} such that the translations of p and q, T(p) and T(q), reside in a single cell of the cubic grid G.

Theorem 6. The subgraph G of G′ is a lightweight bounded-degree spanner of U with stretch factor ρ · λ⁸, where ρ is the stretch factor of G′.

Proof. Let p be a point in U, and let N(p) be the set of neighbors of p.
Split N(p) into eight (possibly overlapping) sets NI(p), Nx(p), Ny(p), Nz(p), Nxy(p), Nxz(p), Nyz(p), Nxyz(p), where Nj(p), for j ∈ {I, x, y, z, xy, xz, yz, xyz}, is the set containing the neighbors q of p such that the translations of p and q under Tj reside in the same cell of G. By Lemma 6, every neighbor of p belongs to some Nj(p), j ∈ {I, x, y, z, xy, xz, yz, xyz}. Since for every round j the algorithm Local-LightSpanner applies the algorithm Centralized Greedy to the points whose translations reside in a single cell of the grid, and since the algorithm Centralized Greedy constructs a bounded-degree subgraph (by Fact 5) of the points whose translations reside in the same cell, at round j only a bounded number of neighbors of p in the set Nj(p) remain in G′′. Since the number of rounds is bounded, it follows that the number of neighbors of p in G′′ is bounded by a constant, and hence G′′ is of bounded degree. To show that G′′ is lightweight, by Lemma 3 it suffices to show that for every cycle C in G′′ and every edge e ∈ C: wt(C) > (λ + 1) · wt(e). Suppose not, and


let cycle C and edge e ∈ C be a counterexample. Since every edge in U has weight at most 1 and wt(C) ≤ (λ + 1) · wt(e), it follows that wt(C) ≤ λ + 1 = ℓ, and by Lemma 4, there exists a round j in which the translation of C resides in a single cell C0 of G. By part (iii) of Fact 5, after the application of the algorithm Centralized Greedy to the connected component κ containing the translation of C in cell C0 in round j, no cycle of weight smaller than or equal to (1 + μ) · wt(e) = (1 + λ) · wt(e) in the inverse translation of κ remains; in particular, the cycle C does not remain in the resulting graph. This is a contradiction. Finally, it remains to show that the stretch factor of G′′, with respect to U, is at most ρ · λ⁸. Since G′ has stretch factor ρ, it suffices to show that after each of the eight rounds of the algorithm Local-LightSpanner, the stretch factor of the resulting graph increases from the previous round by a multiplicative factor of at most λ. Fix a round j, and let G+ be the graph resulting from G′ just before the execution of round j, and G− the graph resulting from G′ after the execution of round j. Suppose that an edge e is removed by the algorithm in round j. Then the translation of e in round j must reside in a single cell C0 of G. Since by part (i) of Fact 5 the algorithm Centralized Greedy has stretch factor μ = λ, and since a translation is an isometric transformation, a path of weight at most λ · wt(e) remains between the endpoints of e in G−. Therefore, the stretch factor of G− with respect to G+ increases by a multiplicative factor of at most λ during round j. This completes the proof.
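The per-cell pruning step follows the classical greedy spanner construction (Althöfer et al.), on which Centralized Greedy is based. The following Python sketch is our own illustration (not the authors' implementation); it exhibits the two properties used in the proof above: stretch factor μ, and the absence of any cycle C with an edge e such that wt(C) ≤ (1 + μ) · wt(e).

```python
import heapq
from collections import defaultdict

def dijkstra(adj, src, dst):
    """Shortest-path weight between src and dst over the kept edges."""
    dist = {src: 0.0}
    pq = [(0.0, src)]
    while pq:
        d, u = heapq.heappop(pq)
        if u == dst:
            return d
        if d > dist.get(u, float("inf")):
            continue
        for v, w in adj[u]:
            if d + w < dist.get(v, float("inf")):
                dist[v] = d + w
                heapq.heappush(pq, (d + w, v))
    return float("inf")

def greedy_spanner(edges, mu):
    """Scan edges (wt, u, v) in non-decreasing weight order; keep (u, v)
    only if the edges kept so far do not already provide a u-v path of
    weight at most mu * wt.  The kept graph has stretch factor mu, and
    for every remaining cycle C and edge e in C, wt(C) > (1 + mu) * wt(e)."""
    adj = defaultdict(list)
    kept = []
    for w, u, v in sorted(edges):
        if dijkstra(adj, u, v) > mu * w:
            adj[u].append((v, w))
            adj[v].append((u, w))
            kept.append((w, u, v))
    return kept
```

With μ = ℓ − 1 this matches the parameterization of step (iii) above; the repeated shortest-path queries are affordable only because the algorithm applies it to the small per-cell connected components.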

Theorem 7. Let U be a connected UBG, and let α < π/3 and λ > 1 be constants. Then there exists a k-local distributed algorithm, with k = ⌈2√2 · (λ + 2)³⌉, that computes a lightweight bounded-degree spanner of U with stretch factor λ⁸/(1 − 2 sin(α/2)). For example, if we set λ = 1.1 and α = π/6, then Local-LightSpanner is an 84-local distributed algorithm that constructs a bounded-degree lightweight spanner of the UBG U with stretch factor at most 4.6.
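As a quick arithmetic check of the constants quoted in the example (λ = 1.1, α = π/6), not part of the algorithm itself:

```python
import math

lam, alpha = 1.1, math.pi / 6

# Locality parameter 2*sqrt(2) * (lam + 2)^3 from Theorem 7 (about 84).
k = 2 * math.sqrt(2) * (lam + 2) ** 3

# Stretch factor lam^8 / (1 - 2*sin(alpha/2)) (about 4.44, i.e. at most 4.6).
stretch = lam ** 8 / (1 - 2 * math.sin(alpha / 2))
```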

3.2 Lightweight Bounded-Degree Spanners of Quasi-UBGs

Let Ur be a quasi-UBG with parameter r, where 0 < r ≤ 1, and let α < π/3 be given. To design a local distributed algorithm that constructs a lightweight sparse spanner of Ur, we combine the techniques used in the previous subsection with those used in Subsection 2.2. To avoid repetition, we only outline the approach here. We first use the 3-local distributed algorithm outlined in Subsection 2.2 to construct a sparse spanner G′ of Ur with stretch factor ρ = 3/(1 − 2 sin(α/2)). Of course, G′ may not be lightweight. To render G′ lightweight, we use the same techniques as in the previous subsection, where we cluster the points in G′ to remove short cycles. The only difference here is that two points that reside within a cubic cell of dimensions ℓ × ℓ × ℓ could be farther apart in a quasi-UBG Ur than in a UBG. The following lemma, which is analogous to Lemma 5, can be easily verified by the reader.


Lemma 7. Let C0 be a cell in G, and let UC0 be the subgraph of the quasi-UBG Ur of parameter r induced by all the points of Ur residing in tile C0. If A and B are two points in the same connected component of UC0, then A and B are ⌈2√2 · ((ℓ + 1)/r)³⌉-hop neighbors in Ur (i.e., A and B are at most ⌈2√2 · ((ℓ + 1)/r)³⌉ hops away from one another in Ur).

Theorem 8. Let Ur be a connected quasi-UBG with parameter r, where 0 < r ≤ 1, and let α < π/3 and λ > 1 be constants. Then there exists a k-local distributed algorithm, with k = ⌈2√2 · ((λ + 2)/r)³⌉, that computes a lightweight bounded-degree spanner of Ur with stretch factor (3/(1 − 2 sin(α/2))) · λ⁸. For example, if we set λ = 1.1, r = 1/2, and α = π/6 in the above theorem, we obtain a 676-local distributed algorithm that constructs a bounded-degree lightweight spanner of the quasi-UBG Ur with stretch factor at most 13.8.

4 Empirical Results

For the spanner algorithms discussed in Section 2, the experimental results show that the spanners constructed by the algorithms have very small stretch factors. This, in general, suggests that the algorithms are very practical. For the lightweight spanner algorithm presented in Section 3, we developed an implementation that significantly reduces the information that each node must collect before carrying out its computation. We note that the lightweight spanner algorithm is applied on top of the spanner algorithm; that is, the input to the lightweight spanner algorithm of Section 3 is the spanner constructed by the algorithm of Section 2. Since UBGs are a special case of quasi-UBGs corresponding to r = 1, we considered the quasi-UBG network model for our simulations. We considered networks of N quasi-UBG nodes with parameter r, and tested our algorithms on networks with different average node degrees d̄. If the distance between two nodes u, v is in the range (r, 1], the probability that there is an edge between u and v was chosen to be 0.3. The 3-D Euclidean space in which the nodes are deployed is a cubic space of dimensions F × F × F, where F = (4π(0.7r³ + 0.3)N/(3d̄))^(1/3) is chosen so that, assuming a uniform distribution of the nodes within the cubic space, the average degree of the generated network is d̄ with high probability. For each set of configurations (N, r, d̄), we generated 100 networks and applied our algorithms to each one of them. We took the average of the following measures: the maximum/average degree, the maximum/average stretch factor, and the ratio of the weight of the constructed spanner to that of the minimum spanning tree of the network. We ranged the value of d̄ over the values 10, 20, 30, 40, 50, 60, the value of r over the values 0.2, 0.4, 0.6, 0.8, 1.0, and the network size N over the values 1500, 2000, 3000, 4000. Note that our algorithms are local, and their performance is independent of the size of the network.
This was confirmed by our simulations: for each fixed average degree value, the computation at a node did not increase by much when the size of the network was increased.
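The network generation procedure described above can be sketched as follows. This is our own simplified illustration (function and variable names are ours, and boundary effects of the finite cube are ignored in the degree calibration); the authors' simulator itself is not described at this level of detail.

```python
import math
import random

def generate_quasi_ubg(n, r, avg_deg, p_gray=0.3, seed=0):
    """Place n nodes uniformly in an F x F x F cube with
    F = (4*pi*(0.7*r**3 + 0.3)*n / (3*avg_deg))**(1/3), so that the
    expected average degree is avg_deg.  Two nodes are always adjacent
    within distance r, adjacent with probability p_gray at distances in
    (r, 1], and never adjacent beyond distance 1."""
    rng = random.Random(seed)
    f = (4 * math.pi * (0.7 * r ** 3 + 0.3) * n / (3 * avg_deg)) ** (1 / 3)
    pts = [(rng.uniform(0, f), rng.uniform(0, f), rng.uniform(0, f))
           for _ in range(n)]
    edges = []
    for i in range(n):
        for j in range(i + 1, n):
            d = math.dist(pts[i], pts[j])
            if d <= r or (d <= 1 and rng.random() < p_gray):
                edges.append((i, j, d))
    return pts, edges
```

Nodes near the cube boundary have truncated neighborhoods, so the realized average degree falls somewhat below d̄ for small F.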


[Fig. 1. Statistics on the simulation results for the spanners: (a) maximum and average stretch factors vs. r, for d̄ = 40 and d̄ = 60; (b) weight of the original network and of the spanner vs. the weight of an MST; (c) maximum and average stretch factors of the lightweight spanner vs. r.]

Table 1. Locality for the lightweight spanner algorithm

r \ d̄    10      15      20      25      30
1.0      9.86    6.42    6.04    5.78    5.26
0.8      9.72    7.14    6.22    6.02    5.92
0.6      9.4     7.12    6.3     6.02    6
0.4      8.88    7       6.14    6.02    6.02
0.2      8.16    6.82    6.08    6.02    6

The stretch factors for sparse spanners are shown in part (a) of Fig. 1. We see that the maximum stretch factor is always smaller than 2.5, and that the average stretch factor is always close to 1. The simulation results for lightweight spanners are given in parts (b) and (c) of Figure 1. The simulations were performed on networks with average degree 30. The value of the parameter ℓ is set to 5, which corresponds to a value of 4 for μ in the Centralized Greedy algorithm. We chose the value of α to be slightly smaller than π/3. From part (c) we see that the stretch factor of the sparse spanner is close to 3, whereas the average stretch factor is very close to 1. Part (b) of Figure 1 shows that even when r is as small as 0.2, the weight of the spanner is only 6.15 times that of an MST of the network, whereas the total weight of the original networks is at least 43 times that of an MST. In Table 1 we show empirical upper bounds on the "average locality" of the algorithm. This upper bound is obtained by averaging the number of points in each connected component of a cell to which the algorithm Centralized Greedy is applied. Whereas the theoretical upper bounds on the locality of the algorithm (i.e., upper bounds on k) are very large (in the order of hundreds when r = 1/2), the empirical bounds obtained are very reasonable: the values are always less than 10! This suggests that the algorithm should be very practical.

Acknowledgements. We would like to thank Gary Gordon for helpful discussions related to the paper. The first author was supported in part by a DePaul University Competitive Research grant. The second author's work was supported in part by a Lafayette College Research Grant. The work of the third author was done while he was at the Department of Computer Science, Texas A&M University, College Station, TX 77843, USA.



Combining Positioning and Communication Using UWB Transceivers

Paul Alcock, Utz Roedig, and Mike Hazas

Lancaster University, UK
{p.alcock,u.roedig,m.hazas}@lancaster.ac.uk

Abstract. A new generation of ultra wideband (UWB) communication transceivers is becoming available which supports both positioning and communication tasks. Transceiver manufacturers envision that the communication and positioning features will be used separately. We believe that this is an unnecessary restriction of the available hardware and that positioning and communication tasks can be active concurrently. This paper presents and investigates a medium access control (MAC) protocol which combines communication and positioning functions. Our experiments show that the existing data communication of a network can be exploited to gather position information efficiently.

1 Introduction

Many positioning systems have been developed which use the existing communication transceiver of a sensor node. Positioning systems relying on conventional low-power communication transceivers typically make use of either the received signal strength (RSS) or the measured time-of-flight (TOF) of a signal as input for a positioning algorithm. Both methods can be used to determine the distance between transceivers and ultimately the position of all transceivers in relation to each other. These methods of distance measurement have been investigated at length, and reports show that current transceivers yield unreliable and inaccurate results. Patwari et al. [1] present an in-depth report of their findings on how multipath signals and shadowing obscure distance measurements. The recent development of low-power ultra wideband (UWB) transceivers for use in sensor nodes overcomes the aforementioned ranging inaccuracies. The physical signal properties of UWB communication make it possible to accurately determine the time-of-arrival (TOA) of signals. By utilizing either clock synchronization or two-way ranging it is therefore possible to accurately determine the time-of-flight (TOF) of the signal. Thus, the distance between communicating transceivers and node positions can be determined. The IEEE 802.15.4a physical layer specification [2], standardized in 2007, defines the use of UWB transceivers in wireless personal area networks and the functionality of positioning. The Nanotron nanoLOC TRX [3] transceiver is an example of one such transceiver which adheres to this standard. UWB transceiver manufacturers envision that communication and positioning features are used separately and one at a time.

B. Krishnamachari et al. (Eds.): DCOSS 2009, LNCS 5516, pp. 329–342, 2009.
© Springer-Verlag Berlin Heidelberg 2009

Either the transceiver is used to


transfer data packets between sender and receiver, or the transceiver is used to send ranging packets to determine the TOF between nodes. We argue that this leads to inefficient transceiver usage, as excess packets might be generated unnecessarily. If an exchange of data packets is currently taking place between two nodes using a send-and-acknowledge scheme, the same packets can also be used to measure the TOF between the nodes. Therefore the distance can be estimated using the existing data packets, alleviating the need to transmit specialized ranging packets. If, however, there is currently not enough data communication taking place between nodes to satisfy the positioning needs of the application, ranging packets may still need to be generated. Evidently, the interplay of the transmission of data packets and ranging packets has to be organized to achieve both communication and positioning goals. Transceiver usage for communication purposes is defined by the MAC layer. The MAC layer determines when packets are transmitted and how their transmission shall be organized. Thus we propose to combine positioning and communication tasks within the MAC layer. This paper describes in general how positioning and communication tasks can be combined within the MAC layer on nodes which utilize a UWB transceiver. In particular, the paper shows how the existing FrameComm MAC protocol [13,15] can be extended to support positioning tasks. The resulting MAC protocol can be used with any transceiver adhering to the IEEE 802.15.4a standard. This paper has the following contributions:

– Protocol Specification: A detailed description of the necessary FrameComm modifications to integrate communication and positioning functions is given.
– Protocol Evaluation: A comprehensive evaluation, using simulations, of the modified FrameComm MAC protocol is presented.
The simulations show that the necessary modifications to FrameComm lead to an acceptable reduction in network performance and a marginal increase in energy consumption. The results indicate that positioning information can be collected almost for free if positioning and communication functions are combined as proposed. The next section gives an overview of related work. Section 3 describes the basic framelet communication mechanism. Section 4 details the proposed enhancements for existing framelet-based MAC protocols, combining ranging estimation and communication in a single function. Section 5 outlines the simulation testbed and the results of our experiments. The paper concludes in Section 6 with a description of proposed future work.

2 Related Work

There is a large body of work which has focused on exploiting UWB for either communication or positioning in wireless sensor networks. However, there is little research on how to tightly integrate both positioning and communication functions. Gezici et al. [5] provide an in-depth introduction to the use of UWB as a means of positioning. They discuss the fundamental positioning techniques that


make our work possible, and how to reduce sources of error in UWB location estimation. Alsindi and Pahlavan [7] analyze the use of UWB positioning in WSNs using a cooperative location algorithm. They determine bounds for UWB location accuracy in a number of challenging indoor environments, and discuss issues relating to range estimation accuracy. Correal et al. [6] present a method of positioning using UWB transceivers in which a packet sent by a node is followed by acknowledgements which can be used to derive round-trip times. This method provides a compelling proof of concept for our proposed system. However, the method discussed by Correal et al. differs from ours in that ranging is not formally integrated into the protocol, and there is no analysis of how their positioning and communication functions affect one another. Cheong and Oppermann [8] describe a positioning-enabled MAC protocol for UWB sensor networks. First, their solution differs from our work in that data packets themselves are not used to support positioning; positioning and communication are handled completely separately by the MAC layer. Second, Cheong's work proposes a TDMA protocol, while the modified FrameComm protocol presented in this paper is a contention-based protocol. The IEEE 802.15.4a physical layer specification [2], standardized in 2007, defines the use of UWB transceivers in wireless personal area networks. The standard defines positioning and communication as separate functions but does not discuss their integration. However, modern packet-based transceivers conforming to the 802.15.4a standard could potentially be used to support the MAC protocol defined in this paper. The use of packetized radios requires a fresh approach to implementing asynchronous duty cycles in WSNs. Some schemes use the same concept of framelet trails as the FrameComm [13] MAC protocol used for the work presented in this paper.
The current default energy-saving protocol in TinyOS is based on the Low Power Listening component of B-MAC [9], but employs message retransmission instead of a long preamble in order to accommodate packet-based radios. X-MAC [10] also uses framelets to establish a rendezvous between sender and receiver, but only retransmits the message header; the payload is sent only after one of the headers has been acknowledged by the destination. Other related duty-cycled schemes include Koala [11] and CSMA-MPS [12]. These and other existing framelet-based MAC protocols could potentially be used in conjunction with UWB transceivers to integrate positioning and communication. Hence, the basic mechanisms described in this paper are not limited to the particular MAC protocol we have chosen (FrameComm).

3 FrameComm

FrameComm, like many wireless contention-based MAC protocols, duty-cycles the node transceivers. To ensure that rendezvous between transceivers occur, FrameComm deploys a method in which a trail of identical data packets, called framelets, is transmitted by the sender with a gap between successive framelets. The receiver sends an acknowledgement to the source after successfully

receiving a framelet. Upon the reception of this acknowledgement, the sender may then cease sending and yield control of the channel (see Fig. 1). A full description of FrameComm is given in [13]. The following paragraphs explain the basic functionality of the protocol and of the elements used to integrate the positioning features described in Section 4.

[Fig. 1. FrameComm communication: (a) the sender N1 transmits a trail of identical data framelets until the receiver N2 returns an acknowledgement; (b) an overhearing node N3 additionally returns a ranging acknowledgement, so that the distance N1-N3 becomes known.]

3.1 Assumptions and Definitions

It is assumed that the clocks of the transmitter and receiver operate at approximately the same rate. Note that this does not imply time or sleep-cycle synchronization; rather, the clock drift between any two nodes is insignificant over a short period. It is also assumed that a fixed-rate radio duty cycle is used, i.e., each node periodically activates its radio for a fixed time interval to monitor activity in the channel. The duty cycle period is represented as P = ω + ω0, where ω is the time the radio remains active and ω0 is the time the radio is in sleep mode. The duty cycle ratio is defined as:

DutyCycle = ω/P = ω/(ω + ω0)    (1)

3.2 Rendezvous Using Framelets

Framelets are small, fixed-size frames that can be transmitted at relatively high speed. A successful duty-cycle rendezvous requires a sequence of identical frames to be repeatedly transmitted from the source node; each frame contains the entire payload of the intended message, as depicted in Fig. 1. If the receiver captures one of these, the payload is delivered. The trail of framelets is defined by three parameters: the number of transmissions n, the time between framelets δ0, and the framelet transmission time δ. To achieve a successful rendezvous, a relationship must be established between the parameters ω, ω0, n, δ, and δ0. First, the listening phase ω of the duty cycle must be such that ω ≥ 2·δ + δ0. This ensures that at least one full framelet will be intercepted during a listen phase. Furthermore, to ensure overlap between transmission and listening activities, the number of retransmissions n needs to comply with the following inequality when ω0 > 0: n ≥ ⌈(ω0 + 2·δ + δ0)/(δ + δ0)⌉. This ensures that a framelet trail is sufficiently long to guarantee rendezvous


with the listening phase of the receiver, and ensures that at least one framelet can be correctly received. The duration of ω determines message delay, throughput, and energy savings.
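The two rendezvous conditions can be checked mechanically. The sketch below uses the symbol names as given above (ω for the listening phase, ω0 for the sleep phase) together with hypothetical millisecond timings of our own choosing:

```python
import math

def min_listen(delta, delta0):
    """Smallest listening phase satisfying omega >= 2*delta + delta0."""
    return 2 * delta + delta0

def min_framelets(omega0, delta, delta0):
    """Smallest trail length satisfying
    n >= (omega0 + 2*delta + delta0) / (delta + delta0)."""
    return math.ceil((omega0 + 2 * delta + delta0) / (delta + delta0))

# Hypothetical timings (ms): 1 ms framelets, 2 ms gaps, 100 ms period.
delta, delta0 = 1.0, 2.0
omega = min_listen(delta, delta0)         # 4.0 ms of listening
omega0 = 100.0 - omega                    # 96.0 ms of sleep
n = min_framelets(omega0, delta, delta0)  # 34 framelets per trail
```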

3.3 Message Acknowledgments

Between framelet transmissions, the source node switches its radio to a listening state. Upon successful reception of a frame at the destination node, the receiving node responds with an acknowledgement transmitted during the framelet transmission gap δ0. After reception of this acknowledgment, the sender terminates transmission of its framelet trail, as communication has been successful. Using acknowledgments reduces the number of framelets needed for each transmission; as a result, transmissions occupy the channel for a shorter period of time, reducing contention whilst increasing throughput and energy efficiency.

4 FrameComm with Positioning

The basic principle of FrameComm is ideally suited to the integration of positioning functions. The method of exchanging packets and acknowledgements mirrors that of two-way-ranging methods used to determine the round-trip time, and ultimately the TOF, of signals. If the sender records the time of transmission of its last framelet and the time at which it receives the corresponding acknowledgement, the distance between the nodes can be determined. We propose extending this positioning enhancement further, in such a manner that the sender may derive not only the distance to its intended recipient, but potentially the distance to any node within transmission range. During the exchange of framelets between the sender and receiver, a third node may enter its listening period and overhear a framelet. Before discarding the framelet and returning to sleep, the node exploits this overhearing and sends what we call a ranging acknowledgement. Upon receiving the ranging acknowledgement, the sender knows the distance to this third node (see Fig. 1b).

4.1 Basic Ranging

To determine the distance between two communicating nodes, the time-of-flight (TOF) of the exchanged signals needs to be measured. To avoid the need for tight clock synchronization between both nodes, two-way ranging can be performed using the existing FrameComm data exchange. The sender of a message keeps track of the time tt when a framelet is transmitted. If an acknowledgement is received, its arrival time ta is recorded. The TOF can be determined from tt and ta if the processing time tp at the message receiver is known. The processing time tp is the time required by the message receiver to respond with an acknowledgement to the received framelet. It is assumed that tp is constant and thus known by the message transmitter. The TOF can be calculated as: TOF = (ta − tt − tp)/2.
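The two-way-ranging computation itself is straightforward. The sketch below uses the variable names from the text, but the numeric timestamps are synthetic values of our own making:

```python
C = 299_792_458.0  # speed of light in m/s

def distance_m(tt, ta, tp):
    """Distance from two-way ranging: TOF = (ta - tt - tp) / 2, where
    tt/ta are the sender's transmit and ack-arrival times and tp is the
    receiver's constant turnaround time (all in seconds)."""
    tof = (ta - tt - tp) / 2.0
    return tof * C

# Synthetic example: a 30 m range adds 2 * (30 / C), about 200 ns,
# on top of an assumed 200 ns turnaround time tp.
tp = 200e-9
ta = tp + 2 * (30.0 / C)
d = distance_m(0.0, ta, tp)  # ~30.0 m
```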


The distance between the two nodes is proportional to the measured TOF. The measured TOF is on the order of nanoseconds for most wireless sensor networks, where communication ranges are below one hundred metres. Time measurements on such a scale must be performed by hardware on the UWB transceiver chip itself. Available hardware such as the nanoLOC transceiver provides measurement facilities and automatic acknowledgements which allow us to determine tt, ta, and tp. It has to be noted that the transmitter of a message can determine the distance to the message receiver without consuming additional energy for ranging, as existing messages are used. Likewise, network performance in terms of achievable throughput and message transfer delay is not degraded by introducing ranging.

4.2 Ranging Acknowledgements

The previously outlined basic ranging mechanism can be improved by introducing ranging acknowledgements. The improvement exploits the fact that nodes not directly involved in the message transport might overhear framelets. During regular communication, a source node will generate data, begin transmitting its framelet trail, and await an acknowledgement. It is possible for nodes that are not the intended recipient of the packet to overhear framelets of the transmission. Normally, a node overhearing a packet not addressed to it would simply ignore the received packet and enter its sleep cycle. However, to improve ranging we propose that such a node sends a ranging acknowledgement packet before entering the sleep state. Thus, the sender of a message does not only obtain the distance to its communication partner, but will potentially also collect distance information for nodes overhearing the communication (see Fig. 1b). The ranging acknowledgement is not sent immediately after the framelet is received; its transmission is delayed by the time needed to transmit a message acknowledgement. Thus, collisions between ranging acknowledgements and with the message acknowledgement are avoided (see Fig. 1b). In some cases, ranging acknowledgements transmitted by several overhearing nodes in response to the same framelet might collide. However, this will only reduce the effectiveness of the positioning function of FrameComm and will not have an impact on message transmission or network performance. Ranging acknowledgements are transmitted within the gaps of an existing framelet trail. Thus, the introduction of ranging acknowledgements has no immediate impact on network performance in terms of message transfer delay or network throughput (see the experimental evaluation in Section 5). The energy consumption of nodes is increased by the introduction of ranging acknowledgements, as additional messages need to be transmitted.
However, our experiments show that this increase is acceptably small.

4.3 Selective Ranging

Ranging acknowledgements only need to be transmitted if either the current sender of a framelet transmission or the nodes overhearing the transmission

Combining Positioning and Communication Using UWB Transceivers

335

have changed position since the last distance measurement. To facilitate selective ranging, additional information has to be included within each framelet. The additional information is used to signal to the overhearing node that it is necessary to respond with a ranging acknowledgement. To keep the communication overhead minimal, one additional bit (the ranging flag) is used to signal selective ranging in each framelet. If this ranging flag is set, an overhearing node will respond with a ranging acknowledgement. In addition, if the overhearing node has determined that it has moved, it will always respond with a ranging acknowledgement, regardless of whether the ranging flag is set. To implement the outlined functionality, a node must be able to determine whether its position has changed. This can be achieved either by using additional hardware such as an accelerometer, or by analyzing the distance measurements the node has collected. In addition, a threshold might be defined to ensure that only significant changes in position result in sending new ranging acknowledgements.
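The overhearing node's decision reduces to a small predicate. The threshold-based movement test and the 0.5 m default below are our illustrative choices; the paper leaves the detection mechanism and threshold open:

```python
def should_send_ranging_ack(ranging_flag, displacement_m, threshold_m=0.5):
    """Respond if the framelet's ranging flag is set, or if this node has
    moved more than threshold_m since its last distance measurement.
    The 0.5 m default threshold is a hypothetical value."""
    return bool(ranging_flag) or displacement_m > threshold_m
```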

4.4 Usage of Distance Information

While communicating, a node collects distance information about its immediate neighbors. This distance information can be used for different purposes. Most obviously, it can be used by positioning algorithms, either locally on the node or centrally somewhere in the network. However, the distance information can also be used to deal with node mobility in an effective way. If a node forwards a message and does not receive a data acknowledgement after sending all framelets, this might indicate that the destination is no longer within communication range. In this case, the existing table of distances to neighboring nodes, which contains the set of nodes in communication range, can be used to select a new destination node to forward the message. The exact implementation of this would be specific to the routing strategy employed.
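As a concrete and deliberately simple illustration of this mobility use, the sketch below falls back to the nearest remaining ranged neighbour when the current destination stops acknowledging. The nearest-neighbour policy is our own choice; as noted above, the actual choice is specific to the routing strategy.

```python
def fallback_next_hop(ranged_neighbours, failed_dest):
    """ranged_neighbours maps neighbour id -> last measured distance (m).
    Returns the nearest neighbour other than the failed destination,
    or None if no other ranged neighbour is known."""
    alive = {n: d for n, d in ranged_neighbours.items() if n != failed_dest}
    return min(alive, key=alive.get) if alive else None
```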

5 Evaluation

The modified FrameComm protocol, as outlined in the previous section, is evaluated using a simulation environment. We used our own purpose-built simulator written in C++, as other available simulation environments such as TOSSIM or ns-2 do not provide the fine-grained UWB transceiver model necessary to determine transceiver activity and thus the energy consumption of nodes. The simulation environment was calibrated against real-world experiments with the standard FrameComm protocol on Tmote Sky nodes to ensure its accuracy.

5.1 Application Scenario

A small warehouse is used to store crates which contain medicine. The medicine needs to be kept cool, below a specific temperature, at all times to ensure effectiveness; therefore constant monitoring is required to audit temperatures. It is


P. Alcock, U. Roedig, and M. Hazas

necessary to track the location of each crate in order to find the crates easily; in addition, it is necessary to know where temperature readings were taken.

Topology. In the network topology, shown in Fig. 2, a sink node located in a corner of the warehouse is used to collect data. Fourteen static nodes are used in the warehouse to create an infrastructure. The static nodes ensure that any node placed in the storage area has connectivity to the network. For the experiments, twelve mobile nodes are used to monitor crates. Initially these twelve free-to-roam mobile nodes are placed in a grid as shown in Fig. 2. The communication range of each node is seven metres. Routing paths according to the arrows depicted in Fig. 2 are used initially; if nodes move, the routing topology will change as described later in the experiments.

Traffic. Within each experiment all nodes generate messages periodically which are routed hop-by-hop towards the sink node. The message generation interval λ is used as a parameter in the experiments. A node divides time into slots of length λ; the point in time within a slot at which the message is generated is determined randomly (using a uniform distribution). Thus, nodes in the network do not generate messages synchronously. In all of the following experiments nodes are configured with a limited MAC buffer size of b = 3. A node can hold three messages in addition to the one currently being processed. Messages are placed in this buffer when a node generates a message or receives a message for forwarding. Messages are dropped when the buffer is full. If a message is not acknowledged by a receiver (for example, because a framelet or an acknowledgement was destroyed by a collision) it remains in the buffer and the node will re-transmit the message indefinitely. Thus, message losses in the network occur solely due to full MAC buffers.
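The slotted traffic generation can be sketched as follows (a hedged illustration; the fixed seed is only for reproducibility and is not part of the described setup):

```python
import random

def generation_times(lam, duration, seed=42):
    """Divide time into slots whose length equals the message interval
    `lam` and draw a uniformly distributed offset within each slot, so
    that nodes do not generate messages synchronously."""
    rng = random.Random(seed)
    times = []
    t = 0.0
    while t < duration:
        times.append(t + rng.uniform(0.0, lam))
        t += lam
    return times
```

For a 600 s run with a 10 s interval this yields 60 generation instants, one per slot, each offset uniformly at random within its slot.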

5.2 Experiment 1: The Cost of Positioning

The first experiment is used to determine the cost of introducing positioning functions to the FrameComm protocol. To determine this cost, the previously described application scenario is run first with the standard FrameComm protocol and thereafter with the extended FrameComm protocol as described in Section 4. For this experiment all nodes have static locations and use fixed routing paths as shown in Fig. 2. Each experimental run has a duration of 600 s. Different traffic loads are used, denoted by the message generation interval λ. The message throughput φn achieved by each node nn during the experiment is measured. φn is defined as the number of messages generated by node nn which are received at the sink during the experiment run. The relationship between message generation rate and throughput is φn ≤ 1/λ; in a congested network messages will be lost along the transport path. The time τn that the transceiver of each node spends in an active state (sending or listening) during the experiment is measured as a percentage of the experiment duration. This figure is used to determine the energy consumption of each node. It has to be noted that τn is larger than the

Combining Positioning and Communication Using UWB Transceivers


Fig. 2. Network Topology used for the evaluation

duty cycle as defined in Equation (1), as it includes not only idle listening but also the sending operations. Fig. 3 shows the measured throughput φn and energy consumption τn for experimental runs with λ = 1 and λ = 10. When λ = 1 the network is overloaded and a relatively large number of messages are lost. For λ = 10 the network is loaded moderately and a low number of losses is observed (less than 5% for all nodes). Fig. 3 b) and d) show that different nodes have very different energy consumption patterns. These patterns emerge due to the different node positions in the topology. The nodes n1, n3, n15, for example, have a low energy consumption as they forward messages to the sink, whose transceiver is continually listening/active. Thus, these three nodes can forward messages quickly, as the first framelet in a trail is always received. Nodes n12, n13, n14 have a low energy consumption as they do not have to forward traffic in this particular experiment. Node n2 generally has the highest energy consumption as it has to forward messages from four nodes and is not directly connected to the sink. For most nodes, the additional positioning functionality leads to an increase in energy consumption (τmax = τ25 = 5.38% for λ = 1; τmax = τ12 = 35.3% for λ = 10). This increase is expected, as additional ranging acknowledgements have to be transmitted. However, some nodes consume less energy when ranging functionality is included. This appears contradictory. However, the throughput slightly decreases when positioning is used, as ranging acknowledgements might collide with framelets of an ongoing message transmission. Although ranging acknowledgements are offset to avoid collision with the current, overheard conversation, it is sometimes possible for a ranging acknowledgement to interfere with the data exchange between neighboring nodes. Node n2 has to forward traffic from four nodes which all achieve a slightly lower throughput when positioning is enabled. Node n2 therefore consumes less energy when positioning is enabled, as it has to forward fewer packets. Fig. 4 shows the average throughput φ = (1/12) Σ_{n=15}^{26} φn and the average energy consumption τ = (1/26) Σ_{n=1}^{26} τn of the nodes for all traffic rates 1 ≤ λ ≤ 20. Fig. 4 shows that for high traffic loads, λ < 5, the throughput for the positioning

[Fig. 3: four panels comparing the non-ranging and ranging variants per node ID; a) throughput for λ = 1, b) transceiver active time for λ = 1, c) throughput for λ = 10, d) transceiver active time for λ = 10.]

Fig. 3. Throughput φn and energy consumption τn for λ = 1 and λ = 10

[Fig. 4: two panels plotting, against the packet interval (0–21 s), the average node throughput (packets/s) and the average transceiver on-time (%) for the non-ranging and ranging variants.]

Fig. 4. Average throughput φ and average energy consumption τ for 1 ≤ λ ≤ 20

enabled FrameComm is smaller than for the standard FrameComm. Framelets are lost due to collisions with ranging acknowledgements. As a consequence, the energy consumption for the positioning-enabled FrameComm is reduced, as less traffic is transported. For low traffic loads, λ ≥ 5, the throughput for both FrameComm variants is similar. Thus, a similar amount of traffic is transported in both settings, and the additional effort for ranging acknowledgements is visible in the increased energy consumption of FrameComm with positioning. This additional cost in terms of energy is at most 15% (for n2 at λ = 5) and on average 4.57% over all nodes at all traffic loads 5 < λ ≤ 20.
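The two metrics of Experiment 1 can be restated compactly. A minimal sketch (the log formats are assumptions for illustration, not the simulator's actual interface):

```python
def phi(sink_log, origin):
    """phi_n: number of messages originated by node `origin` that were
    received at the sink during the run, counted from a delivery log
    of dicts with an 'origin' field (an assumed log format)."""
    return sum(1 for m in sink_log if m['origin'] == origin)

def tau(active_intervals, duration):
    """tau_n: time the transceiver spent active (sending or listening),
    expressed as a percentage of the experiment duration.
    `active_intervals` is a list of (start, end) pairs in seconds."""
    busy = sum(end - start for start, end in active_intervals)
    return 100.0 * busy / duration
```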

[Fig. 5: two panels plotting, against the packet transmission interval (0–21 s), the average node throughput and the average transceiver on-time for the non-ranging, ranging, and optimised (selective) ranging variants.]

Fig. 5. Results of Selective Ranging Optimization

5.3 Experiment 2: Selective Ranging

The second experiment is designed to evaluate the proposed optimization described in Section 4.3: nodes only respond with ranging acknowledgements to overheard messages if either the overhearing node has moved, or the ranging bit is set in the overheard message header. To evaluate the selective ranging feature some node mobility is required. Node n26 is set in constant motion for the duration of the experiment on a fixed path between its starting position (x = 25, y = 17) (see Fig. 2) and coordinate (x = 5, y = 3), at a constant velocity v = 1 m/s. This movement pattern is used throughout the experiment for the traffic rates 1 ≤ λ ≤ 20. To ensure a fair comparison between different evaluation runs, deterministic, hardcoded handover sequences are enforced which reallocate forwarding nodes once a node moves outside the communication range of its current forwarding node. The experiments are each run for a duration of 600 s. The message throughput φn and the energy consumption τn as defined in Section 5.2 are recorded. The results are shown in Fig. 5. For low traffic loads, λ ≥ 5, the throughput for all evaluated FrameComm variants is similar. As expected, the selective ranging FrameComm variant requires only slightly more energy than the standard FrameComm variant without ranging. Thus, the selective ranging optimisation allows us to obtain ranging measurements at a very low energy cost. This additional cost in terms of energy is at most 7.02% (for n5 at λ = 4) and on average 0.1% over all nodes at all traffic loads 5 < λ ≤ 20.
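The shuttling movement of node n26 can be sketched as follows, using the coordinates and speed given above (the back-and-forth wrap-around is an assumption about how the fixed path is traversed):

```python
import math

def position_at(t, a=(25.0, 17.0), b=(5.0, 3.0), v=1.0):
    """Position of the mobile node at time t while it shuttles back and
    forth between a and b at constant speed v (coordinates from Fig. 2)."""
    d = math.hypot(b[0] - a[0], b[1] - a[1])
    s = (v * t) % (2.0 * d)   # distance travelled within one out-and-back cycle
    if s > d:
        s = 2.0 * d - s       # on the way back towards a
    f = s / d
    return (a[0] + f * (b[0] - a[0]), a[1] + f * (b[1] - a[1]))
```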

5.4 Experiment 3: Location Update Frequency

The last experiment is used to evaluate how well the positioning-enabled FrameComm protocol can be used together with a centralised positioning algorithm. Centralised location algorithms are less energy-efficient than distributed algorithms, as each node must forward ranging measurements to the node responsible for computing locations [1]. However, for sensor network deployments in which nodes forward sensor readings to the same node that also executes the centralised positioning algorithm, ranging measurements can be transferred at very little extra cost. The ranging measurements can be piggy-backed on data

[Fig. 6: location update delay of node n26 plotted against the packet generation interval (0–20 s).]

Fig. 6. Average location update frequency ω26 of node n26

packets. Often, fixed packet sizes are used which are underutilised, and thus ranging measurements can be transported at no additional cost. We use this method to evaluate the positioning capabilities of a sensor network using the positioning-enabled FrameComm MAC protocol. For the purposes of this evaluation, we have chosen to implement a basic centralised trilateration algorithm. Each node creates a table containing distance information to its neighbors using the positioning-enabled FrameComm protocol (see Section 4.4). The content of this table is transmitted with each message reporting a sensor reading to the sink. The central or sink node gathers and stores the most current ranging information of each node. If, for a given node, there exist range measurements from three reference nodes which form intersecting circles, the location of the node can be determined. All infrastructure nodes as outlined in Section 5.1 are used as reference nodes within the evaluation setup. For the evaluation, node n26 is selected to be a mobile node constantly in motion using a random waypoint model; all other nodes remain at fixed positions as shown in Fig. 2. Node n26 randomly determines a destination location and moves with speed v = 1 m/s towards this destination. When reaching this position, the node selects a new random destination. The location update frequency ωn is analysed in this experiment. The achievable location update frequency indicates how well a system is able to keep track of node positions. ωn is defined as the time between two position updates for node nn. The shorter the time between calculated position updates at the sink, the better the system is aware of the correct position of nodes. Fig. 6 shows the average location update frequency ω26 of node n26. As expected, the location update frequency increases linearly with the linearly increasing message interval (5 ≤ λ ≤ 20). With a reduced traffic load a decreasing number of ranging measurements reaches the sink, and the time between location updates increases accordingly. However, for high traffic loads (λ < 5) an inverse pattern is observed. For high traffic loads the network capacity is reached and packets are dropped. The more traffic is offered to the network, the more packets have to be dropped, and among these may be a high number of packets required to calculate the position of node n26. We believe it is possible to keep the average time between location updates constantly low for λ < 5 if the network implements an appropriate packet dropping strategy. An available


FrameComm extension called priority interrupts [13] could help to implement such a mechanism. The experiments show that communication and position estimation are linked in two ways. First, there is a complex interdependency between data transport and the gathering of ranging measurements, as shown in the previous two experiments. Second, there is a non-trivial interdependency between data transport and location estimation, as shown in this experiment.
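A basic trilateration step of the kind described in Section 5.4 can be sketched by subtracting pairs of circle equations, which turns the intersection problem into a 2×2 linear system (an illustrative reconstruction; the paper does not specify its exact solver):

```python
def trilaterate(anchors, ranges):
    """Estimate (x, y) from three reference positions and measured
    distances.  Subtracting circle 1 from circles 2 and 3 cancels the
    quadratic terms and leaves two linear equations in x and y."""
    (x1, y1), (x2, y2), (x3, y3) = anchors
    r1, r2, r3 = ranges
    a11, a12 = 2 * (x2 - x1), 2 * (y2 - y1)
    a21, a22 = 2 * (x3 - x1), 2 * (y3 - y1)
    b1 = r1 * r1 - r2 * r2 + x2 * x2 - x1 * x1 + y2 * y2 - y1 * y1
    b2 = r1 * r1 - r3 * r3 + x3 * x3 - x1 * x1 + y3 * y3 - y1 * y1
    det = a11 * a22 - a12 * a21
    if abs(det) < 1e-12:
        return None  # collinear reference nodes: no unique intersection
    return ((b1 * a22 - a12 * b2) / det, (a11 * b2 - b1 * a21) / det)
```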

6 Conclusion

We have shown in this paper how positioning and communication tasks can be combined within the MAC layer on nodes which utilize a UWB transceiver. We investigated how the existing FrameComm MAC protocol can be extended to support positioning tasks. The experiments show that the positioning-enabled FrameComm MAC protocol consumes on average 4.57% more energy in the investigated scenarios than the standard FrameComm protocol. Using selective ranging, this value can be reduced further, to 0.1% in the scenarios investigated. We believe that this is an acceptable trade-off, especially as other network parameters such as throughput are not significantly altered by implementing ranging functionality. The experiments also show that if ranging information is transported together with sensing information, a strategy is required for dropping packets in overload situations. As a next step we plan to investigate the proposed positioning-enabled FrameComm protocol in more detail, and we plan to run real-world experiments using sensor nodes with IEEE 802.15.4a transceivers.

References

1. Patwari, N., Ash, J.N., Kyperountas, S., Hero, A.O., Moses, R.L., Correal, N.S.: Locating the nodes. IEEE Signal Processing Magazine 22(4), 54–69 (2005)
2. IEEE 802.15 WPAN Low Rate Alternative PHY Task Group 4a (TG4a), http://ieee802.org/15/pub/TG4a.html
3. Nanotron nanoLOC TRX Data Sheet (2007), http://www.nanotron.com
4. Federal Communications Commission: Revision of Part 15 of the Commission's Rules Regarding Ultra-Wideband Transmission Systems, http://hraunfoss.fcc.gov/edocs_public/attachmatch/FCC-02-48A1.pdf
5. Gezici, S., Tian, Z., Giannakis, G.B., Kobayashi, H., Molisch, A.F., Poor, H.V., Sahinoglu, Z.: Localization via ultra-wideband radios: a look at positioning aspects for future sensor networks. IEEE Signal Processing Magazine 22(4), 70–84 (2005)
6. Correal, N.S., Kyperountas, S., Shi, Q., Welborn, M.: An UWB relative location system. In: IEEE Conference on Ultra Wideband Systems and Technologies 2003, pp. 394–397 (2003)
7. Alsindi, N., Pahlavan, K.: Cooperative localization bounds for indoor ultra-wideband wireless sensor networks. EURASIP Journal on Advances in Signal Processing 2008, Article ID 852509 (2008)
8. Cheong, P., Oppermann, I.: An energy-efficient positioning-enabled MAC protocol (PMAC) for UWB sensor networks. In: Proceedings of the IST Mobile and Wireless Communications Summit, Dresden, Germany, pp. 95–107 (June 2005)

9. Polastre, J., Hill, J., Culler, D.: Versatile low power media access for wireless sensor networks. In: SenSys 2004: Proceedings of the 2nd International Conference on Embedded Networked Sensor Systems, pp. 95–107. ACM Press, New York (2004)
10. Buettner, M., Yee, G.V., Anderson, E., Han, R.: X-MAC: a short preamble MAC protocol for duty-cycled wireless sensor networks. In: SenSys 2006: Proceedings of the 4th International Conference on Embedded Networked Sensor Systems, pp. 307–320. ACM Press, New York (2006)
11. Musaloiu-E., R., Liang, C., Terzis, A.: Koala: ultra-low power data retrieval in wireless sensor networks. In: Proceedings of the 7th International Conference on Information Processing in Sensor Networks, April 22–24, pp. 421–432 (2008)
12. Mahlknecht, S., Bock, M.: CSMA-MPS: a minimum preamble sampling MAC protocol for low power wireless sensor networks. In: Proceedings of the 2004 IEEE International Workshop on Factory Communication Systems, pp. 73–80 (2004)
13. O'Donovan, T., Benson, J., Roedig, U., Sreenan, C.: Priority interrupts of duty cycled communications in wireless sensor networks. In: Proceedings of the Third IEEE International Workshop on Practical Issues in Building Sensor Network Applications (SENSEAPP 2008), Montreal, Canada. IEEE Computer Society Press, Los Alamitos (2008)
14. Chipcon Products from Texas Instruments: CC2420 Data Sheet, http://focus.ti.com/lit/ds/symlink/cc2420.pdf
15. Benson, J., O'Donovan, T., Roedig, U., Sreenan, C.: Opportunistic aggregation over duty cycled communications in wireless sensor networks. In: Proceedings of the IPSN Track on Sensor Platform, Tools and Design Methods for Networked Embedded Systems (IPSN 2008/SPOTS 2008), St. Louis, USA. IEEE Computer Society Press, Los Alamitos (2008)

Distributed Generation of a Family of Connected Dominating Sets in Wireless Sensor Networks

Kamrul Islam, Selim G. Akl, and Henk Meijer

School of Computing, Queen's University, Kingston, Ontario, Canada K7L 3N6
{islam,akl,henk}@cs.queensu.ca

Abstract. We study the problem of computing a family of connected dominating sets in wireless sensor networks (WSN) in a distributed manner. A WSN is modelled as a unit disk graph G = (V, E) where V and E denote the sensors deployed in the plane and the links among them, respectively. A link between two sensors exists if their Euclidean distance is at most 1. We present a distributed algorithm that computes a family S = {S1, S2, · · · , Sm} of non-trivial connected dominating sets (CDS) with the goal of maximizing α = m/k, where k = max_{u∈V} |{i : u ∈ Si}|. In other words, we wish to find as many CDSs as possible while minimizing the number of occurrences of each node in these sets. Since CDSs play an important role in maximizing network lifetime when they are used as backbones for broadcasting messages, maximizing α reduces the possibility of repeatedly using the same subset of nodes as backbones. We provide an upper bound on the value of α via a relationship between all minimum vertex-cuts and CDSs in G, and present a distributed (localized) algorithm for the α-maximization problem. For a subclass of unit disk graphs, we show that our algorithm achieves a constant approximation factor of the optimal solution.

Keywords: Wireless Sensor Network, Maximal Clique, Distributed Algorithm, Minimum Vertex-cut.

1 Introduction

The connected dominating set is one of the earliest structures proposed as a candidate for a virtual backbone in wireless networks [6,11]. Given a graph G = (V, E), where V and E are the sets of nodes and edges respectively, a subset D of V is a dominating set if every node u ∈ V is in D or adjacent to some node v ∈ D. A subset of V is a CDS if it is a dominating set and induces a connected subgraph. A minimum cardinality CDS is called an optimal CDS (or simply OPT). Henceforth we use the words 'optimal' and 'minimum' interchangeably. In the context of sensor networks, a well studied problem is that of finding an OPT in a unit disk graph (UDG), which is often used to model connectivity in sensor networks; in a UDG an edge between two nodes exists if and only if their Euclidean distance is at most 1. In this paper, we use UDGs to model sensor networks and take G to be a UDG.

B. Krishnamachari et al. (Eds.): DCOSS 2009, LNCS 5516, pp. 343–355, 2009. © Springer-Verlag Berlin Heidelberg 2009

1.1 A Motivating Example

We are motivated by an important observation about general CDS construction in wireless sensor networks, where the main goal is to obtain a "small" approximation factor for the computed CDS, since finding a minimum cardinality CDS is NP-hard [5]. Consider the wheel graph in Figure 1 (a) with 9 nodes. Assume that each node has 9 units of energy and each transmission of a unit of data from a node to its neighbor consumes one unit of energy, neglecting the energy consumed by message reception. The optimal CDS contains only node 9 at the center. Suppose each border node {1, 2, · · · , 8} has a unit of data to be broadcast via the center node 9 in each communication round. That is, whenever a node on the periphery sends a message, it is node 9 which broadcasts (relays) it and all other nodes receive the message. After eight rounds, corresponding to the eight border nodes transmitting their data through the center node, the center node will be left with 1 unit of energy while all the border nodes still have eight units of energy each. After the 9th round, node 9 will have completely exhausted its energy while the other nodes still have 8 units of energy each for continuing operation. Thus finding an optimal solution might not always be helpful in scenarios like this, and it is not hard to construct a family of such examples. One easy way to get around this problem is to devise algorithms that in some way evenly distribute the energy consumption among the nodes in the network by not using the same subset of nodes for all broadcasting operations. Clearly this may not yield a 'small-sized' or constant factor approximation solution to the CDS problem, but it can be used to prolong the lifetime of the network, which depends on the functioning of every node in the network. Using node 9 as a CDS for one broadcasting operation, then leaving it free and engaging six consecutive border nodes as the next CDS for successive rounds (see Figure 1(b), (c)), clearly enhances the overall network lifetime. In other words, we have two scenarios. First, we can use only the middle node as the CDS, which gives m/k = 1, where m is the number of CDSs generated and k is the maximum number of occurrences of a node in these CDSs. In the second scenario, we can use 3 CDSs whose nodes are taken from the border nodes only, and one CDS containing only the middle node. Thus we get m/k = 4/3, which means that if the network is used for 4 rounds, in the first scenario the middle node is used for all 4 rounds, while in the second scenario no node is used for more than 3 rounds. In general, the above problem can be solved by finding a large number of disjoint connected dominating sets, since doing so has a direct impact on conserving the energy of individual sensors and hence prolonging the network lifetime. However, as we will show, it is not always possible to find a large number of such disjoint sets, because the number of disjoint CDSs is bounded by the size of a minimum vertex-cut of the graph. Thus the main focus of this paper is to find as many CDSs as possible while minimizing the frequency of participation of each node in the network. In other words, we would like to maximize α = m/k, which would increase the overall network lifetime by incurring less energy consumption on the part of each node for broadcasting.
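The lifetime gain from rotating CDSs can be checked with a small sketch (the rotated border CDSs below are illustrative choices consistent with the wheel-graph example, not the exact sets used by the authors):

```python
def rounds_until_exhaustion(schedule, energy=9):
    """Count broadcast rounds until some member of the currently scheduled
    CDS has no energy left; each round costs every CDS member one unit.
    `schedule` is a repeating list of CDSs (each a set of node ids)."""
    levels = {u: energy for cds in schedule for u in cds}
    rounds = 0
    while True:
        cds = schedule[rounds % len(schedule)]
        if any(levels[u] == 0 for u in cds):
            return rounds
        for u in cds:
            levels[u] -= 1
        rounds += 1
```

Using only the center node exhausts it after 9 rounds, while rotating it with two (hypothetical) six-node border CDSs keeps the network broadcasting for noticeably longer.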

[Fig. 1: the 9-node wheel graph (border nodes 1–8, center node 9) shown three times, (a)–(c), with different CDS choices.]

Fig. 1. Family of CDSs

We provide a novel distributed algorithm (requiring only knowledge of nodes at most 3 hops away) to compute a family of CDSs with the goal of maximizing α. Our approach for obtaining such a family of sets depends on the computation of maximal cliques using only the one-hop neighborhood of each node, combined with a simple coloring scheme for the nodes that requires each node to know information about its neighbors at most 3 hops away. Maximal cliques in UDGs can be found in polynomial time [5,19].

2 Related Work

There is a large body of literature [1,2,3,7,8,9,12,14] on finding the MCDS in wireless sensor networks which focuses on the construction of a single CDS, where the effort is to devise algorithms that produce a small approximation factor for it. Aside from looking for a single CDS with a small approximation factor, there has been some research towards finding a family of dominating sets [15,16,17,18] for data gathering and sensing operations in sensor networks. In [17], the authors consider the problem of finding a maximum number of disjoint dominating sets, called the domatic partition. They show that every graph with maximum degree Δ and minimum degree δ contains a domatic partition of size (1 − o(1))(δ + 1)/ln Δ. They provide a centralized algorithm to produce a domatic partition of size Ω(δ/ln Δ). The papers [16,15,18] are somewhat related to ours since they investigate the problem of maximizing the lifetime of wireless networks by generating disjoint dominating sets (not connected dominating sets) for data gathering purposes. The general focus is that the nodes in the dominating set act as "coordinators" for the network and the rest of the nodes remain in an energy-efficient sleep mode. Thus to maximize the lifetime of the network, it is important to rotate the coordinatorship among the nodes in the network so that every node is given a roughly equal opportunity to go into sleep mode. In particular, the authors in [18] propose a centralized algorithm, without providing any worst case analysis or bound, to generate a number of disjoint dominating sets in order to cover a geometrical region by network nodes for as long as possible. A general version of the domatic partition called the k-domatic partition (k ≥ 2), where in each dominating set a node is dominated by at least k nodes, is considered in [16]. Their algorithms construct a k-domatic partition of size at least a constant fraction of the largest possible (k − 1)-domatic partition.


In [15], the authors look at a variant of the domatic partition problem and propose a randomized algorithm for maximizing the network lifetime. The problem considered in [15] is called the Maximum Cluster-Lifetime problem, where the aim is to find an optimal schedule consisting of a set of pairs (D1, t1), · · · , (Dk, tk) where Di is a dominating set (DS) and ti is the time during which Di is active. The problem then asks for the schedule S with maximum length L(S) such that Σ_{i: v∈Di} ti ≤ bv, where bv is the maximum time that v can be in a DS. The randomized algorithm computes a schedule which is an O(log n)-approximation with high probability. Our approach differs from those mentioned above in that we look to find as many CDSs as possible with the aim of minimizing the number of times each node participates in the network. To the best of our knowledge our distributed algorithm for generating such multiple CDSs for broadcasting is the first of its kind. The rest of the paper is organized as follows. In Section 3 we provide definitions and assumptions that are used throughout the paper. The distributed algorithm is presented in Section 4. A theoretical analysis of our algorithm is presented in Section 5, followed by conclusions in Section 6.

3 Preliminaries

We assume that sensors are deployed in the plane and model a sensor network by an undirected graph G = (V, E), where V is the set of sensors and E is the set of links: (u, v) ∈ E for two sensor nodes u, v ∈ V if they are within transmission range of each other. We assume that every sensor node u ∈ V has the same transmission range, which is normalized to 1. We define the neighbor sets N(u) and N[u] of sensor node u as N(u) = {v | (u, v) ∈ E, u ≠ v} and N[u] = N(u) ∪ {u}. Each node u has an ordered triple (ct(u), deg(u), id(u)), where the first element ct(u) denotes the number of times u has participated in CDSs, and deg(u) and id(u) denote the degree and id of u. Initially, ct(u) = 0, ∀u ∈ V, and this is updated as the algorithm is executed. For a maximal clique X ⊆ N[u] and u, v ∈ X, we say node u is lexicographically smaller than node v if (ct(u), deg(u), id(u)) < (ct(v), deg(v), id(v)), i.e., either ct(u) < ct(v), or ct(u) = ct(v) and deg(u) < deg(v), or ct(u) = ct(v) and deg(u) = deg(v) and id(u) < id(v). The rank of node u with respect to clique X, denoted by r(X(u)), is the index of u's triple in the lexicographically sorted (ascending order) list of the triples of the nodes of X. Note that the rank r(X(u)) can change when the variable ct(u) is updated. Our algorithm is local, which can be defined as follows: an algorithm is called k-localized if each node is allowed to exchange messages with its neighbors at most k hops away and take decisions based on this information. We assume a synchronized message passing system as in [10,15,16], in which time is divided into equal slots. In each communication slot a node of the network graph is capable of receiving messages from its neighbors, performing local computations, and broadcasting a message to its neighbors. The time complexity of an algorithm is the number of time slots it needs to produce the desired result.
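The lexicographic rank can be sketched directly from the triple ordering (illustrative helper names; `ct` and `deg` are maps from node id to the values defined above):

```python
def rank(clique, node_id, ct, deg):
    """Rank r(X(u)) of `node_id` within `clique`: the index of its triple
    (ct(u), deg(u), id(u)) in the ascending lexicographic order of all
    member triples, relying on Python's native tuple comparison."""
    triples = sorted((ct[u], deg[u], u) for u in clique)
    return triples.index((ct[node_id], deg[node_id], node_id))
```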


In principle, the message size can be arbitrarily large. Although we use a synchronous system, it is well known that a synchronous message passing algorithm can be turned into an asynchronous algorithm with the same time complexity, but with a larger message complexity. Sensors know their geographic locations and obtain their neighbors' locations, degrees, ids and ct(.) values through exchanging "Hello" messages.

3.1 Problem Formulation

Given a connected UDG G = (V, E), we would like to find a family S of subsets S1, S2, · · · , Sm such that (i) every Si is a CDS, and (ii) α = m/k is maximized, where k = max_{u∈V} |{i : u ∈ Si}|.
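The objective can be stated as a one-liner (the example family in the usage note below is a hypothetical wheel-graph schedule in the spirit of Section 1.1, not the authors' exact sets):

```python
from collections import Counter

def alpha(family):
    """alpha = m / k for a family of CDSs: m is the number of sets and
    k the maximum number of sets any single node appears in."""
    counts = Counter(u for s in family for u in s)
    return len(family) / max(counts.values())
```

For the single-CDS scenario `alpha([{9}])` is 1; a four-set rotation in which no node appears more than three times yields 4/3.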

4 A 3-localized CDS Algorithm

We provide a distributed (localized in a strict sense) algorithm for generating a family of CDSs. The algorithm uses a simple coloring scheme: initially all sensors are white, and when a sensor becomes a member of the CDS Si it becomes black. The set Si, whose members are all black, becomes the virtual backbone and remains active for a round (consisting of a fixed number of time units) to perform broadcasting. After the round, all black nodes change their color back to white. Then part of the algorithm is executed again and a new set Si+1 is produced, which remains active for a round, and so on. The algorithm, which is based on coloring the nodes of G, consists of the two phases described in what follows.

Dominating Set Generation: Phase I

In Phase I, each node u computes a maximal clique in N[u] following the polynomial-time algorithm of [19], which computes maximal cliques in UDGs. Let C(u) denote the maximal clique computed by u. Henceforth maximal cliques are simply called cliques unless otherwise stated. Node u sends C(u) to each direct neighbor v with v ∈ C(u). Similarly, u receives a clique C(v) from v ∈ N(u) if u ∈ C(v). Denote the set of all such cliques received by u as C(N(u)), i.e., C(N(u)) = ⋃_{v∈N(u), u∈C(v)} C(v). For each clique C+ ∈ (C(u) ∪ C(N(u))), u computes its rank r(C+(u)), which gives u a total of |C(u) ∪ C(N(u))| ranks. For each C+, the smallest-ranked node of C+ becomes active by coloring itself black. This yields a set of black nodes B in G, and we prove in the next section that B forms a dominating set. We then need to add a subset C′ of the remaining white nodes in G and color them black so that B ∪ C′ becomes a CDS. In Phase II, we determine such a set C′. This phase of the algorithm, which is executed at each node u, is given in Figure 2.

348

K. Islam, S.G. Akl, and H. Meijer

PHASE I:
1: Broadcast id(u), deg(u) and location to v ∈ N(u)
2: For each v ∈ N(u), receive id(v), deg(v) and v's location
3: Compute C(u) and send C(u) to each v ∈ C(u), v ∈ N(u)
4: Receive C(v), form C(N(u)) = ⋃_{v∈N(u), u∈C(v)} C(v)
5: For each C+ ∈ (C(u) ∪ C(N(u))), compute r(C+(u))
6: Color u black if ∃C+ ∈ (C(u) ∪ C(N(u))) such that r(C+(u)) < r(C+(v)) ∀v ∈ C+ \ {u}

Fig. 2. Phase I
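From a global point of view, the coloring rule of Phase I can be sketched as follows (our own sketch, which takes the maximal cliques as given since the clique algorithm of [19] is not reproduced here): a node turns black iff it holds the smallest triple in at least one clique it belongs to.

```python
# Phase I coloring rule from a global view: in every maximal clique, the
# node with the lexicographically smallest (ct, deg, id) triple turns black.

def phase1_black_nodes(cliques, triples):
    black = set()
    for clique in cliques:
        # smallest triple wins rank 1 in this clique and colors itself black
        black.add(min(clique, key=lambda u: triples[u]))
    return black

# Hypothetical instance: triples[u] = (ct, deg, id); in clique {1,2,3}
# node 1 wins with (0,2,1); in clique {2,4} node 4 wins since (0,1,4) < (0,3,2).
triples = {u: (0, deg, u) for u, deg in {1: 2, 2: 3, 3: 2, 4: 1}.items()}
black = phase1_black_nodes([{1, 2, 3}, {2, 4}], triples)
```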

4.2 Connected Dominating Set Generation: Phase II

Recall that Phase I finds a dominating set B (consisting of black nodes) in G. In Phase II we compute a connected dominating set; that is, we select a set of connectors C′ to connect the nodes in B such that B ∪ C′ becomes a CDS. Since the distance between any two closest nodes in B can be at most three, we divide this phase into two cases: in Case 1 we discuss how two nodes in B that are two hops away from each other connect themselves via connectors; Case 2 is the same, but for nodes that are exactly three hops away from each other.

Case 1: Let B(u, i) (resp. W(u, i)) be the set of all black (resp. white) nodes which are exactly i hops away from u, including u. The i-hop distance between u and v is the minimum number of hops between them. If u is black then u sends B(u, 1) to all the white nodes W(u, 1). Node c ∈ W(u, 1) is called a connector if it connects two black nodes by turning itself black. Node c ∈ W(u, 1) receives B(u, 1) from each black member u ∈ B(c, 1) \ {c} and computes whether ⋃_{u∈B(c,1)\{c}} B(u, 1) forms a connected induced subgraph. Let B*(c) = ⋃_{u∈B(c,1)\{c}} B(u, 1). Node c sends a 'POS' (positive) message to B(c, 1) if B*(c) is connected. Otherwise, c sends B*(c) to W(c, 1) and receives B*(c′) from c′ ∈ W(c, 1). If there is no c′ such that B*(c) ⊂ B*(c′) (strict subset) then c issues a 'NEG' (negative) message to B(c, 1) \ {c}; otherwise c refrains from sending any messages. However, if B*(c) = B*(c′) for all c′ then c sends a 'NEG' message. If a black node u receives at least one 'NEG' message from its white neighbors W(u, 1), then u forms a list L(u) consisting of all received messages along with the corresponding ids of the senders. For example, if u receives 'POS' messages from nodes with ids 12 and 4 and a 'NEG' from nodes 2 and 9, then L(u) = {v12('POS'), v4('POS'), v2('NEG'), v9('NEG')}. Node u sends L(u) to its neighbors exactly 2 hops away. In the same way, u receives L(z) from z ∈ B(u, 2). If L(u) ∩ L(z) has at least one node with a 'POS' message then they are connected via some black nodes. Otherwise, u and z select the lexicographically smallest connector c by sending c a 'YES' message. After receiving 'YES' from u and z, c turns black and joins the current CDS. See Figure 3 (a) and (b) for an illustration.
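The connectivity test a white node c performs on B*(c) can be sketched as follows (our own encoding, not the paper's code): the lists B(u, 1) received from black neighbors tell c both the node set B*(c) and which black nodes see each other, and a plain BFS checks whether that induced subgraph is connected.

```python
# A white node c unions the received 1-hop black neighborhoods B(u,1)
# into B*(c) and checks connectivity of the induced subgraph by BFS.
from collections import deque

def b_star_connected(received):
    """received: {black neighbor u: B(u,1) as a frozenset of black nodes}.
    Returns (B*(c), is_connected)."""
    bstar = set().union(*received.values())
    adj = {u: set() for u in bstar | set(received)}
    for u, bu in received.items():
        for v in bu:
            if v != u:              # u's list B(u,1) reveals edge (u, v)
                adj[u].add(v)
                adj[v].add(u)
    start = next(iter(bstar))
    seen, queue = {start}, deque([start])
    while queue:
        x = queue.popleft()
        for y in adj[x] - seen:
            seen.add(y)
            queue.append(y)
    return bstar, seen == bstar

# Hypothetical examples: two black neighbors that see each other (-> 'POS'),
# versus two black neighbors with no black path between them.
pos = b_star_connected({'a': frozenset({'a', 'e'}), 'e': frozenset({'a', 'e'})})
neg = b_star_connected({'a': frozenset({'a'}), 'b': frozenset({'b'})})
```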

Fig. 3. (a) Case 1: B*(h) = B*(d) = B*(j) = {a, e, b, c}, B*(g) = {a, e}, and B*(f) = {a, e, i}. Only h, d, j, and f will each generate a 'NEG' message to their black neighbors, while g sends a 'POS' message to a and e. For a and e, L(a) ∩ L(e) has a common sender g with a 'POS' message, so they do nothing to find connectors. Nodes b and e choose a node from {d, j}, while e and c take h as a connector; the chosen connectors turn black on receiving 'YES' from b, c and e. (b) After execution of Case 1. (c) Case 2: a and b are 3 hops away. (d) They select c and d as their average ct(.) value is smaller than that of e and f.

Case 2: Let P_{uu′} denote a shortest path between u and u′ ∈ B, where u′ ∈ B(u, 3) \ {u} and u′ is lexicographically smaller than u. If there exists a path P_{uu′} consisting of exactly two intermediate black nodes then the two nodes do nothing, since u and u′ are already connected via two black nodes. Otherwise P_{uu′} has as intermediate nodes either (i) a connector c and a white node w, or (ii) two white nodes w′, w″. Nodes u and u′ choose w and c if there is only one type (i) path; if there is more than one such path, they select the one whose nodes have the smallest (ct(c) + ct(w))/2. Similarly, if there is no type (i) path, they select the type (ii) path whose nodes have the smallest (ct(w′) + ct(w″))/2. In both situations ties are broken arbitrarily. To do this, u and u′ both send a 'YES' message to w in a type (i) path, or to w′ and w″ in a type (ii) path. After receiving a 'YES' message, node w for a type (i) path, or nodes w′ and w″ for a type (ii) path, join the current CDS by coloring themselves black. If C′ is the set of all such nodes w or w′, w″ then B ∪ C′ forms the CDS we are looking for. See Figure 3 (c) and (d) for an example. Thus the execution of Phases I and II forms a CDS Si consisting of the black nodes, and these nodes remain active for a certain amount of time (a round) for broadcasting while the white nodes stay in the energy-efficient sleep mode. Phase II is given in Figure 4 and a subroutine (called from Phase II) to change color is provided in Figure 5. After serving for a round, all nodes in the CDS, i.e., black nodes, increment their ct(.) values by one, send the updated values to their neighbors and change color back to white. All nodes in G then start executing from Line 5 of Phase I and then Phase II with their updated triples. This generates a new CDS Si+1 consisting of black nodes, which remains active for another round. Then the ct(.) values are updated and a new set Si+2 is constructed, and so on.
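The Case 2 tie-break can be sketched as follows (our own encoding of candidate paths as (type, intermediate-pair) tuples; not from the paper): type (i) paths are preferred, and within a type the pair with the smallest average ct value wins.

```python
# Select the intermediate pair for Case 2: prefer type (i) paths
# (connector c plus white node w); within the chosen type, pick the pair
# minimizing the average ct value. Ties fall to min()'s first candidate,
# matching "ties are broken arbitrarily".

def choose_intermediates(paths, ct):
    """paths: list of (kind, (x, y)) with kind in {'i', 'ii'};
    ct: {node: ct value}. Returns the chosen intermediate pair."""
    type_i = [p for kind, p in paths if kind == 'i']
    pool = type_i if type_i else [p for _, p in paths]
    return min(pool, key=lambda p: (ct[p[0]] + ct[p[1]]) / 2)

# Hypothetical candidates: the single type (i) path wins regardless of
# the averages of the type (ii) alternative.
ct = {'c': 0, 'w': 1, 'w1': 2, 'w2': 3}
paths = [('ii', ('w1', 'w2')), ('i', ('c', 'w'))]
chosen = choose_intermediates(paths, ct)
```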


PHASE II:
1: If u is black Then Send B(u, 1) to 1-hop connectors C Endif
2: If u is white Then
3:   If B*(u) is connected Then Send 'POS' to B(u, 1) \ {u}
4:   Else //u sends B*(u) and receives B*(u′)
5:     If ∄u′ s.t. B*(u) ⊂ B*(u′), u′ ∈ W(u, 1) \ {u} Then Send 'NEG' to B(u, 1) Endif Endif
6: Endif
7: If u is black Then Receive msgs from C
8:   If msg = 'NEG' Then
9:     Send L(u) to u′ ∈ B(u, 2) and receive L(u′)
10:    Send 'YES' to the lexicographically smallest node v Endif
11: Endif
12: call color()
13: If P_{uu′} has 2 black nodes Then do nothing
14: Else //u′ ∈ B(u, 3) and lexicographically smaller than u
15:   If P_{uu′} is a type (i) path Then
16:     Select w, c ∈ P_{uu′} with the smallest (ct(c) + ct(w))/2
17:   Else //P_{uu′} is a type (ii) path
18:     Select w′, w″ ∈ P_{uu′} with the smallest (ct(w′) + ct(w″))/2 Endif
19:   Send w or w′, w″ 'YES' to color them black Endif
20: call color()
21: If u is black Then Remain active for a round
22:   Send ct(u) = ct(u) + 1 to N(u) Endif
23: If u is black Then Turn u white Endif

Fig. 4. Phase II

subroutine color()
1: If u is white Then
2:   If it receives 'YES' Then
3:     Turn into black Endif Endif

Fig. 5. Subroutine color

Figure 6 illustrates the construction of a CDS by coloring certain nodes black. The number labeling each node is its id, and the number in parentheses is the node's ct(.) value. Initially all ct(.) values are set to zero, and they are incremented by one each time the respective node participates in a CDS. In this example we can construct six different sets.

Fig. 6. Illustration of the algorithm to produce a family of CDSs

5 Theoretical Analysis

In this section we give an overview of the theoretical analysis of our algorithm. First we show that Phase I computes a dominating set, and then we prove that Phase II forms a CDS. We also analyze some other characteristics of our algorithm and provide a relationship between CDSs and minimum vertex-cuts of G.

Lemma 1. The set of black nodes B produced in Phase I forms a dominating set in G.

Proof. We prove the claim by contradiction. Consider a node u. If it is black or connected to a black node then we are done. Otherwise, all nodes in N[u] are white. Let the clique computed by u be C(u) ⊆ N[u]. Since each node in C(u) sorts the ordered triples lexicographically, each of them has a unique rank from {1, · · ·, |C(u)|}. The node with the least rank in C(u) must be black, a contradiction. □

Lemma 2. The set of black nodes B ∪ C′ produced in Phase II forms a CDS in G.

Proof. Assume the subgraph G′ induced by B ∪ C′ is not connected. Consider two neighboring components of G′, each containing a node from B, such that the shortest distance between them is at most three (since B is a dominating set in G). Let u and v be two such nodes. Each of them finds connectors to connect to the nodes in B (nodes at most three hops away from it) which are lexicographically smaller than it. Since each node is lexicographically unique, one of the nodes u and v must be smaller than the other. Then they must select at most two connectors to join them. If they do not find such connectors then either the shortest distance between them is not at most three or they are already connected by some connectors. In both cases it is a contradiction. □

5.1 Time and Message Complexities

Each node requires only the information of its three-hop neighborhood. Collecting information from a three-hop neighborhood can be done in constant time. Thus the time complexity of the entire algorithm is constant since local computation time (the time spent to compute the cliques in each node’s neighborhood) is negligible in synchronous systems [10].


For the message complexity analysis, observe that in Phase I each node sends a single message (containing its id and position) to its neighbors, generating O(n) messages in total for the network. After computing a clique in its 1-hop neighborhood, node u sends this information (the ids of the nodes in the clique) in a single message to its neighbors and similarly receives cliques from its neighbors. This again makes a total of O(n) messages. Moreover, O(n) messages are transmitted in Lines 1-6 of Phase II to learn about 1-hop connectors. For Lines 7-11 of Phase II, each node sends a single message and also relays at most one message (that is, it receives all messages from its neighbors, merges them into a single message and sends it), totaling O(n) messages. For the rest of the algorithm in Phase II, that is, Lines 12-23, O(n) messages are generated (a node relays at most two messages and generates its own message). These messages are required to explore black nodes which are 3 hops away from a given node. This complexity arises for one single round. Since there can be at most ℓ rounds, the message complexity becomes O(ℓn). Thus the message complexity of the entire algorithm is O(ℓn).

5.2 Relation between Minimum Vertex-Cuts and CDSs

We prove that the maximum value of m is upper-bounded by a number that results from a relation between minimum vertex-cuts and any CDS (including OPT) in a graph. A vertex-cut of G is a set of vertices whose removal renders G disconnected. A vertex-cut of minimum cardinality is called a minimum vertex-cut, κ(G). Let K(G) denote the family of all minimum vertex-cuts of G. We show that for any κ(G) ∈ K(G), ∃a ∈ κ(G) such that a ∈ Si, where Si denotes any CDS of G. In other words, any CDS must take at least one element from each of the minimum vertex-cuts. Instead of "vertex-cut" we use "node-cut" below in order to be consistent with the terminology of the paper.

Theorem 1. In a connected graph G, any CDS (including OPT) must contain at least one node from each κ(G) ∈ K(G), unless G is a complete graph, which has no cuts at all.

Proof. Consider any minimum node-cut κ(G) ∈ K(G) and suppose for the sake of contradiction that an arbitrary CDS S* of G (S* can be OPT) does not contain any node from κ(G). Let G1 and G2 be two connected induced subgraphs generated by the removal of κ(G). Since S* is a CDS and does not contain any node from κ(G), components G1 and G2 must each have its own CDS; name them S*(G1) ⊆ S* and S*(G2) ⊆ S*, respectively. However, S*(G1) ∪ S*(G2) is not connected, since the only nodes that could connect S*(G1) and S*(G2) belong to κ(G), thus contradicting the fact that S* is a CDS. □
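Theorem 1 can be checked by brute force on tiny graphs. The sketch below (our own illustrative code; the exhaustive search is only sensible for very small instances) enumerates minimum node-cuts and verifies that a CDS hits each of them.

```python
# Brute-force illustration of Theorem 1: enumerate all minimum node-cuts
# of a small graph and check that a given CDS intersects every one.
from itertools import combinations

def connected(nodes, edges):
    """DFS connectivity check on the subgraph induced by `nodes`."""
    nodes = set(nodes)
    if len(nodes) <= 1:
        return True
    adj = {u: set() for u in nodes}
    for a, b in edges:
        if a in nodes and b in nodes:
            adj[a].add(b)
            adj[b].add(a)
    seen, stack = set(), [next(iter(nodes))]
    while stack:
        u = stack.pop()
        if u not in seen:
            seen.add(u)
            stack.extend(adj[u])
    return seen == nodes

def min_node_cuts(nodes, edges):
    """All minimum node-cuts, found by trying subsets in increasing size."""
    for size in range(1, len(nodes)):
        cuts = [set(c) for c in combinations(nodes, size)
                if not connected(set(nodes) - set(c), edges)]
        if cuts:
            return cuts
    return []

# Path a-b-c-d: the minimum node-cuts are {b} and {c} (the internal
# nodes, as Corollary 2 below the theorem also predicts for trees), and
# the CDS {b, c} intersects both.
nodes = ['a', 'b', 'c', 'd']
edges = [('a', 'b'), ('b', 'c'), ('c', 'd')]
cds = {'b', 'c'}
cuts = min_node_cuts(nodes, edges)
hits_every_cut = all(cds & cut for cut in cuts)
```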

Here we show a relationship between the number of minimum node-cuts and the cardinality of OP T . The following corollaries are directly deducible from the above theorem. Corollary 1. In G, |OP T | ≥ |K(G)| if and only if κi (G) ∩ κj (G) = φ for all i, j, 1 ≤ i, j ≤ |K(G)|.




Corollary 2. If G is a tree, then every internal node is a minimum node-cut. Therefore, |OPT| = |K(G)| = the number of internal nodes in G.

Now we have the following lemma.

Lemma 3. The optimal value of α, α_opt, is bounded by the size of a minimum node-cut in K(G).

Proof. Let M be the set of CDSs the optimal algorithm finds in G and let κ(G) be a minimum node-cut in K(G). Recall that all κ(G)s in K(G) are of the same size, otherwise they would not be minimum node-cuts. Since every CDS in M must use at least one node from κ(G) (Theorem 1), the maximum value of k (i.e., the maximum number of occurrences of a node in M) will be at least |M|/|κ(G)|. Hence, α_opt ≤ |M|/(|M|/|κ(G)|) = |κ(G)| and the claim follows. □

Now we analyze the performance of our algorithm towards optimizing the value of α = m/k for a subclass of unit disk graphs. We consider a particular subclass of unit disk graphs in which each node u ∈ V belongs to exactly one clique (recall that clique means maximal clique). That is, each node computes a clique C(u) but does not receive any clique C(u′) ≠ C(u) from its neighbors (in other words, |C(u) ∪ C(N(u))| = 1). There are two constants c and d. For any node u, assume its neighbors belong to at most c different cliques; for d we define the following. Let N′(u) be the set of neighbors of u which are in cliques C(u′) ≠ C(u), that is, N′(u) = {u′ | u′ ∈ N(u), u′ ∈ C(u′), C(u′) ≠ C(u)}. We assume the number of such neighbors of u is bounded by d, that is, |N′(u)| ≤ d. Denote by α_alg the value of α obtained by our algorithm.

Lemma 4. For this subclass of unit disk graphs, α_alg ≥ (1/(c(d + 1))) α_opt.

Proof. Let there be s cliques computed by the nodes in V and, without loss of generality, assume all cliques have the same size p > 2. Note that the maximum number of disjoint CDSs possible in this setting can be at most p, implying that the optimal value α_opt can also be at most p. This is because generating the (p + 1)-th CDS must reuse some of the nodes already used in previous CDSs, making k > 1 (recall that k is the maximum number of occurrences of a node in the generated CDSs). Thus α_opt ≤ p. In order to find a CDS, our algorithm first selects a node from each of the s cliques in Phase I. Let u be the node selected from clique C(u). Since u's neighbors can be in at most c different cliques, u requires at most c − 1 additional nodes from C(u) to act as connectors to c − 1 other cliques. Thus the algorithm uses at most c nodes from C(u) for each CDS, and can therefore compute at least p/c CDSs. Now we determine the value of k, that is, an upper bound on how many times a node participates in different CDSs. Since |N′(u)| ≤ d, node u can be selected at most d + 1 times: once in Phase I, and at most d times (in Phase II) as a connector between two nodes in different cliques. Hence k ≤ d + 1. Therefore, α_alg ≥ (1/(c(d + 1))) α_opt. □

An example of this subclass is given in Figure 7 with c = 2 and d = 2.

Fig. 7. Illustration of Lemma 4. (a) Optimal algorithm chooses any column of nodes as a CDS each time (shown in black) whereas our algorithm (b) first chooses black nodes from each clique and then uses at most two additional (shown in brown) nodes from each clique to make them connectors between cliques.

Theorem 2. For a complete graph Kn, our algorithm achieves α_alg = α_opt = n.

Proof. For G = Kn, in Phase I the algorithm initially selects the node with the lowest id by coloring it black, since all nodes have the same ct(.) value and degree. This node increments its ct(.) value by one. In Phase II, the black node receives 'POS' messages from the other n − 1 nodes and ignores them, so no connector node is required. Thus we have exactly one node as the CDS, i.e., |S1| = 1. In the next round, the node with the second lowest id is selected and used as the only node in the CDS (|S2| = 1), and so on. Thus we have a family S1, S2, · · ·, Sm of CDSs where each set Si consists of exactly one node of Kn, all nodes of Kn are used in CDSs, and Si ≠ Sj for all i ≠ j, i, j ∈ [1, m]. This means m = n and k = 1. Therefore, α_alg = α_opt = n. □

6 Conclusions

In this paper we have presented a distributed (localized) algorithm for generating a family of connected dominating sets in sensor networks with a view to increasing the lifetime of nodes in the network. Using the same CDS (even an optimal one) in sensor networks is not a viable option, since the nodes in the set quickly deplete their energy due to their participation in every broadcast. Inspired by this observation, we developed a distributed algorithm which addresses this situation by generating further connected dominating sets so that the burden of broadcasting can be distributed over the nodes. We show an interesting and important relationship between minimum vertex-cuts and CDSs, where the cardinality of a minimum vertex-cut limits the number of disjoint CDSs. We believe this is the first distributed algorithm of its kind in the context of wireless sensor networks, since previous algorithms mainly focus on generating a single CDS of small size. Our algorithm obtains a constant factor approximation for a small subclass of unit disk graphs; however, it remains an interesting open problem to devise an algorithm (centralized or distributed) with a constant approximation factor for general unit disk graphs.

References

1. Alzoubi, K.M., Wan, P.J., Frieder, O.: New Distributed Algorithm for Connected Dominating Set in Wireless Ad Hoc Networks. In: Proc. IEEE HICSS (2002)
2. Alzoubi, K.M., Wan, P.J., Frieder, O.: Distributed Heuristics for Connected Dominating Sets in Wireless Ad Hoc Networks. Journal of Communications and Networks 4(1), 22–29 (2002)
3. Cheng, X., Huang, X., Li, D., Du, D.: Polynomial-time Approximation Scheme for Minimum Connected Dominating Set in Ad Hoc Wireless Networks. In: Proc. Networks (2006)
4. Chen, D., Mao, X., Fei, X., Xing, K., Liu, F., Song, M.: A Convex-Hull Based Algorithm to Connect the Maximal Independent Set in Unit Disk Graphs. In: Proc. Wireless Algorithms, Systems, and Applications, pp. 363–370 (October 2006)
5. Clark, B., Colbourn, C., Johnson, D.: Unit Disk Graphs. Discrete Mathematics 86, 165–177 (1990)
6. Das, B., Bharghavan, V.: Routing in Ad Hoc Networks Using Minimum Connected Dominating Sets. In: Proc. ICC, pp. 376–380 (1997)
7. Funke, S., Kesselman, A., Meyer, U., Segal, M.: A Simple Improved Distributed Algorithm for Minimum Connected Dominating Set in Unit Disk Graphs. ACM Transactions on Sensor Networks 2(3), 444–453 (2006)
8. Guha, S., Khuller, S.: Approximation Algorithms for Connected Dominating Sets. Algorithmica 20, 374–387 (1998)
9. Marathe, M., Breu, H., Hunt, H., Ravi, S., Rosenkrantz, D.: Simple Heuristics for Unit Disk Graphs. Networks 25, 59–68 (1995)
10. Peleg, D.: Distributed Computing: A Locality-Sensitive Approach. SIAM, Philadelphia (2000)
11. Sivakumar, R., Das, B., Bharghavan, V.: Spine Routing in Ad Hoc Networks Using Minimum Connected Dominating Sets. In: Proc. ICC, pp. 376–380 (1997)
12. Stojmenovic, I., Seddigh, M., Zunic, J.: Dominating Sets and Neighbor Elimination Based Broadcasting Algorithms in Wireless Networks. IEEE Transactions on Parallel and Distributed Systems 13(1) (January 2002)
13. Wan, P., Alzoubi, K., Frieder, O.: Distributed Construction of Connected Dominating Set in Wireless Ad Hoc Networks. In: Proc. Infocom (2002)
14. Wu, J., Li, H.: On Calculating Connected Dominating Set for Efficient Routing in Ad Hoc Wireless Networks. In: Proc. Discrete Algorithms and Methods for Mobile Computing, pp. 7–14 (1999)
15. Moscibroda, T., Wattenhofer, R.: Maximizing the Lifetime of Dominating Sets. In: Proc. WMAN (2005)
16. Pemmaraju, S., Pirwani, I.: Energy Conservation in Wireless Sensor Networks via Domatic Partitions. In: Proc. MobiHoc (2006)
17. Feige, U., Halldorsson, M., Kortsarz, G., Srinivasan, A.: Approximating the Domatic Number. SIAM Journal on Computing 32(1), 172–195 (2003)
18. Slijepcevic, S., Potkonjak, M.: Power Efficient Organization of Wireless Sensor Networks. In: Proc. Infocom (2001)
19. Gupta, R., Walrand, J., Goldschmidt, O.: Maximal Cliques in Unit Disk Graphs: Polynomial Approximation. In: Proc. INOC, Lisbon, Portugal (2005)

Performance of Bulk Data Dissemination in Wireless Sensor Networks

Wei Dong¹, Chun Chen¹, Xue Liu², Jiajun Bu¹, and Yunhao Liu³

¹ Zhejiang Key Laboratory of Service Robots, College of Computer Science, Zhejiang University
² School of Computer Science, McGill University
³ Dept. of Computer Science, Hong Kong University of Science and Technology
{dongw,chenc,bjj}@zju.edu.cn, [email protected], [email protected]

Abstract. Wireless sensor networks (WSNs) have recently gained a great deal of attention as a topic of research, with a wide range of applications being explored. Bulk data dissemination is a basic building block for sensor network applications. The problem of designing efficient bulk data dissemination protocols has been addressed in a number of recent studies. The problem of mathematically analyzing the performance of these protocols, however, has not been addressed sufficiently in the literature. In this work, we show a way of analyzing mathematically the performance of bulk data dissemination protocols in WSNs. Our model can be applied to general networks by use of the shortest propagation path. Our model is accurate by considering topological information, impact of contention, and impact of pipelining. We validate the analytical results through detailed simulations, and we find the analytical results fit well with the simulation results. Further, we demonstrate that the analytical results can be used to aid protocol design for performance optimizations, e.g., page size tuning for shortening the completion time.

1 Introduction

Wireless sensor networks (WSNs) have recently gained a great deal of attention as a topic of research, with a wide range of applications being explored. Bulk data dissemination is a basic building block for sensor network applications. Example protocols include Deluge [1] and MNP [2], which distribute new binaries into a network, enabling complete system reprogramming. The problem of designing efficient bulk data dissemination protocols has been addressed in a number of recent studies [1,2,3,4,5]. Although they demonstrated the efficiency of their approaches through testbeds or simulations, they failed to investigate rigorously the performance of these protocols because of evaluation limitations. For example, testbeds suffer from scalability issues, while simulations can be either inaccurate (e.g., packet-level simulation such as ns-2 cannot capture hidden terminal problems) or time-consuming (e.g., bit-level simulation such as TOSSIM [6] can be really slow). Both of these limitations call for efficient and accurate performance modeling and analysis, which could be useful for protocol design (e.g., parameter tuning), performance evaluation and optimization.

B. Krishnamachari et al. (Eds.): DCOSS 2009, LNCS 5516, pp. 356–369, 2009.
© Springer-Verlag Berlin Heidelberg 2009


The problem of mathematically analyzing the performance of these protocols, however, has not been addressed sufficiently in the literature. The analysis of [1] only applies to linear structures. The analysis of [4] is far from accurate because it does not consider protocol details. The analysis of [7] also suffers from inaccuracy because, for example, it does not model page pipelining (i.e., segmenting the data image into fixed-size pages to allow spatial multiplexing), which is a common technique in bulk data dissemination. In this work, we show a way of analyzing mathematically the performance of bulk data dissemination protocols in WSNs. In particular, we analyze the Deluge protocol, the de facto bulk data dissemination protocol based on TinyOS [8]. Our model can be applied to general networks by use of the shortest propagation path. Our model can also be applied to other dissemination protocols (e.g., MNP) by substituting the protocol-specific factors. Finally, our model is accurate because it considers topological information, the impact of contention, and the impact of pipelining. We validate the analytical results through detailed simulations, and we find that the analytical results fit well with the simulation results. Further, we demonstrate that the analytical results can be used to aid protocol design for performance optimizations, e.g., page size tuning for shortening the completion time. The contributions of our work are highlighted as follows:

– To the best of our knowledge, we are the first to propose a general and accurate model for bulk data dissemination in WSNs.
– We validate the analytical results through extensive simulations, and our analytical results fit well with the simulation results.
– We demonstrate how our approach can be used to aid protocol design for performance optimizations.

The rest of this paper is structured as follows: Sec 2 gives an overview of our approach. Sec 3 describes two common behaviors in bulk data dissemination protocols. Sec 4 presents the details of our analytical approach. Sec 5 shows the experiments and evaluation results. Sec 6 discusses the related work. Finally, Sec 7 concludes the paper.

2 Overview

In this section, we formally state the problem we are addressing and present our approach towards solving it.

2.1 Problem Formulation

We are interested in determining the completion time of a given page for any sensor node in the network. More formally, given a data image of P pages and a network of N sensor nodes, we are interested in estimating the completion time of a given page under a specific segmentation scheme (e.g., one page defaults to 1104 bytes in Deluge [1]).

Fig. 1. Approach overview

We will use the notation C_k^p to designate the completion time of the p-th page at node k. Thus, we are interested in determining the completion time C_k^p for all 1 ≤ k ≤ N and 1 ≤ p ≤ P. The completion time of a given page for a sensor node depends on the following factors: (1) the completion time of the page for the previous-hop node; this translates to finding the propagation path and determining the completion times for all the nodes along the path; (2) the single-hop propagation time of the page; this requires capturing protocol details and the effects of contention to avoid unrealistic assumptions. Our goal is to develop an efficient and accurate model that captures the propagation behavior of bulk data dissemination protocols in terms of completion time on a per-page basis.

2.2 Approach Overview

A high-level block diagram of our approach is shown in Fig 1, with pointers to the sections where the different parts are described in this paper. The centerpiece is a protocol model. The protocol model models two behaviors that bulk data dissemination depends on: (1) the multihop propagation behavior for all the pages; (2) the single-hop propagation behavior for one page. The multihop propagation behavior depends on single-hop propagation behaviors, among other factors such as the propagation path and the impact of pipelining. The single-hop propagation behavior in turn depends on a physical-level model such as packet reception rate (PRR). The physical-level model captures the PRR of a single link. We use the same approach as in [9] to model this dependency, i.e., via measurements in a one-time profiling experiment. The profiling is done once for each sensor node platform and can be reused. The physical-level model is seeded by link-wise measurements such as RSSI and LQI in the target WSN. This seeding makes the upper-layer models amenable to numeric solutions. We validate the entire approach by comparing the completion times estimated via this modeling approach with TOSSIM simulation results.

Fig. 2. Pipelining. Example four-hop network. A three-hop spacing is required for effective spatial multiplexing. In this example, a simultaneous broadcast from 1 and 3 would collide at 2.

3 Modeling Dissemination Behaviors

In this section, we describe the behaviors of typical bulk data dissemination protocols for WSNs. There are two behaviors that are common to most bulk data dissemination protocols: segmentation/pipelining and negotiation on a per-page basis.

3.1 Segmentation and Pipelining

Most bulk data dissemination protocols [1,2,4,5] take advantage of pipelining to allow parallel transfers of data in networks. Pipelining is done through segmentation: a program image is divided into several pages, each of which contains a fixed number of packets. Instead of completely receiving a whole program image before forwarding it, a node becomes a source node after it receives only one complete page. For transmission of a large program image, pipelining can increase overall throughput significantly [3]. In order to realize the full benefit of spatial multiplexing, the protocol needs to ensure that transfers of different pages do not interfere with each other. Fig 2 shows that concurrent transmissions of page 2 and page 1 must be at least three hops apart to avoid collisions.

3.2 Negotiation

Most bulk data dissemination protocols [1,2,4,5] use a negotiation-based approach [3] to avoid the broadcast storm problem. A simple negotiation protocol contains three types of messages [3]: (1) ADV: the source node advertises its received object (metadata); (2) REQ: its neighbors send back requests after receiving the ADV to notify the source node about which objects are needed; (3) DATA: the source node sends out only the requested objects. Fig 3 shows the state transition diagram of a typical negotiation-based dissemination protocol, Deluge. Negotiation helps Deluge minimize the set of senders advertising in a time interval. It further reduces the number of senders by giving higher priority to a node that transmits a page with a smaller page number [3].
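The ADV/REQ/DATA handshake described above can be sketched with a toy function (our own encoding, not any protocol implementation): a receiver compares the advertised metadata with the pages it already has, requests only the missing ones, and the source answers with exactly those.

```python
# Toy sketch of the three-way negotiation: ADV carries the set of pages
# the source holds; REQ names the pages the receiver is missing; DATA is
# exactly the requested pages, so nothing is broadcast redundantly.

def negotiate(advertised_pages, have_pages):
    """Return (REQ, DATA): the page numbers requested and the pages sent."""
    req = sorted(set(advertised_pages) - set(have_pages))
    data = list(req)       # the source sends out only the requested objects
    return req, data

# Hypothetical exchange: the receiver already has page 1, so only
# pages 2 and 3 cross the air.
req, data = negotiate(advertised_pages={1, 2, 3}, have_pages={1})
```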

4 Analytical Approach

In this section, we present the details of our analytical approach. We describe the multihop model and the single-hop model in Secs. 4.1 and 4.2, respectively; together they constitute the protocol model for the Deluge protocol. In Sec. 4.3, we show the extensibility of our approach by extending our model to another dissemination protocol, MNP.

360

W. Dong et al.

Fig. 3. State transition diagram of the Deluge protocol. [Figure: three states — MAINTAIN, RX (receiving DATA in a page), and TX (sending DATA in a page). Transitions: MAINTAIN → RX on receiving an ADV; MAINTAIN → TX on receiving a REQ; RX → MAINTAIN when the page is complete; TX → MAINTAIN when sending finishes; a node stays in MAINTAIN while it does not overhear an inconsistent ADV, and an overheard ADV for a smaller page returns it to MAINTAIN.]

4.1 Modeling Multihop Propagations

To determine the completion time of a given page p at a sensor node k, C^p_k, we need to find the propagation path originating from the source node, s, and determine the completion times of all the nodes along the path, taking the impact of pipelining into account. We use T^{pg}_{ij} to denote the propagation time of one page from node i to node j; its computation is described in the next subsection. Because a page transmission starts when any of the upstream nodes completes the page, we compute the propagation path as the shortest path from the source node, with T^{pg}_{ij} as the cost metric between a pair of neighboring nodes. Thus, we employ Dijkstra's algorithm [10] to determine the propagation path to any node, say node k, in the network. We denote this path as PT_k = (n_1, n_2, . . . , n_{|PT_k|}).

Next, we compute the completion times for all the nodes along the path by considering the impact of pipelining. We write C^p_q for C^p_{n_q} (i.e., the completion time of the p-th page at the q-th node along the path PT_k) to simplify notation when no ambiguity arises. The completion time C^p_q depends on two factors: (1) it is no smaller than C^p_{q-1} + T^{pg}_{(q-1)q}, i.e., the reception time of the current page from the previous hop; (2) it is no smaller than C^{p-1}_{q+3} when q + 3 ≤ |PT_k|, or C^{p-1}_{|PT_k|} when q + 3 > |PT_k|, because the concurrent transmissions of pages p − 1 and p must be at least three hops apart to avoid collisions. Thus,

    C^p_q = max(C^p_{q-1} + T^{pg}_{(q-1)q}, C^{p-1}_{q'})    (1)

where q' = min(q + 3, |PT_k|). We start with C^1_1 = C^0_q = C^p_0 = T^{pg}_{01} = 0, and recursively compute the completion times according to Eq. (1).

Finally, it should be noted that the computation of the completion time C^p_k may be involved in the computation of the completion times of other nodes, where node k acts as an intermediate node on the propagation path. We always select the maximum completion time, because any transmissions at the downstream nodes of node k will postpone the transmission to node k.
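The recursion of Eq. (1) is straightforward to implement. The sketch below is a minimal illustration under our own naming, assuming the single-hop page times T^{pg}_{ij} are already known: Dijkstra's algorithm extracts the propagation path, and the completion times are then filled in page by page.

```python
import heapq

def propagation_paths(n_nodes, t_page, source=0):
    """Dijkstra over single-hop page times t_page[i][j] (None = no link).
    Returns the shortest-path predecessor of every node."""
    dist = [float("inf")] * n_nodes
    prev = [None] * n_nodes
    dist[source] = 0.0
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry
        for v in range(n_nodes):
            t = t_page[u][v]
            if t is not None and d + t < dist[v]:
                dist[v] = d + t
                prev[v] = u
                heapq.heappush(heap, (d + t, v))
    return prev

def completion_times(path_times, n_pages):
    """Eq. (1): C[p][q] = max(C[p][q-1] + T_{(q-1)q}, C[p-1][min(q+3, L)]).
    path_times[q] is the page time on hop q-1 -> q along one path
    (index 0 is the source, so path_times[0] is unused)."""
    L = len(path_times) - 1  # index of the last node on the path
    # Page 0 and node 0 entries stay at 0, matching the initial conditions.
    C = [[0.0] * (L + 1) for _ in range(n_pages + 1)]
    for p in range(1, n_pages + 1):
        for q in range(1, L + 1):
            qbar = min(q + 3, L)
            C[p][q] = max(C[p][q - 1] + path_times[q], C[p - 1][qbar])
    return C
```

On a uniform 4-hop path with unit page times, two pages complete at the last node at time 7 and three pages at time 10, matching the closed form hops + 3(pages − 1).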

Performance of Bulk Data Dissemination in Wireless Sensor Networks

361

4.2 Modeling Single-Hop Propagations

In the previous subsection, we used the single-hop page transmission time to seed the multihop propagation model. In this subsection, we determine the single-hop transmission time, T^{pg}_{ij}, based on the pair-wise PRR value given by the physical-level model. The computation of T^{pg}_{ij} was preliminarily analyzed in [1] for linear structures. Compared to prior work, our model has three important extensions: (1) it considers pair-wise PRR values; (2) it calibrates the constant parameters in [1] by considering their relationship to PRR; (3) it considers transmission contention. All these extensions are necessary to make it applicable to general networks.

The page transmission time from node i to node j, T^{pg}_{ij}, is the sum of the time spent in advertisement, request, and data transmission, respectively. That is,

    T^{pg}_{ij} = T^{adv}_{ij} + T^{req}_{ij} + T^{data}_{ij}    (2)

The expected time for node j to receive an advertisement from node i is given by

    T^{adv}_{ij} = (1 / PRR_{ij}) · τ_l · (1 + N_supp)    (3)

where τ_l is the approximate time interval between two advertisements; PRR_{ij} is the packet reception rate from node i to node j; and N_supp is the expected number of advertisement suppressions. To determine PRR_{ij}, we need to create an empirical relationship to a specific measurement between the two nodes. We express this relationship as a function f(·), such that PRR_{ij} = f(M_{ij}), where M_{ij} denotes a certain measurable metric between two nodes, e.g., RSSI or LQI. We determine f(·) from a prior profiling study. Note that this function models hardware properties rather than wireless propagation in an actual deployment; thus, such a prior profiling study is possible. The profiling methodology to determine f(·) will be discussed in Sec. 5.1. We set N_supp = 1 as reported in [1].

The expected time for node j to send requests is given by

    T^{req}_{ij} = N_req · (τ_r / 2) + (N_req / λ − 1) · T^{adv}_{ij}    (4)

where τ_r/2 is the expected time between two requests and N_req is the expected number of requests. We note that in Deluge, when a node exceeds its limit of λ requests, it must transition to the MAINTAIN state and wait for another advertisement before making additional requests. Thus, (N_req/λ − 1) · T^{adv}_{ij} is the expected amount of time spent waiting for additional advertisements when the request limit of λ is exceeded. In the analysis of [1], the value of N_req is experimentally fixed at 5.4. We find experimentally, however, that the value of N_req is related to the weighted average PRR of a given network; the details are deferred to Sec. 5.2.

The expected time for node i to transmit a page to node j is given by

    T^{data}_{ij} = N_pkts · (1 / PRR_{ij}) · T_pkt · ρ_i    (5)


where N_pkts is the number of packets in a page; PRR_{ij} is the packet reception rate from node i to node j as mentioned earlier; T_pkt is the transmission time for a single packet; and finally, ρ_i is the number of contending nodes of i, which models the MAC delay. We define this parameter as the effective number of neighboring nodes of node i. More formally,

    ρ_i = Σ_k PRR_{ki}    (6)
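Eqs. (2)–(6) translate directly into code. The sketch below uses the parameter values reported later in Sec. 5.2 (τ_l = 2 s, τ_r = 0.5 s, λ = 2, T_pkt = 0.018 s, N_supp = 1); the function and variable names are ours, and N_req must be supplied from the profiled PRR–N_req relationship.

```python
TAU_L, TAU_R, LAMBDA, T_PKT, N_SUPP = 2.0, 0.5, 2, 0.018, 1

def t_adv(prr_ij):
    # Eq. (3): expected time for j to receive an ADV from i
    return (1.0 / prr_ij) * TAU_L * (1 + N_SUPP)

def t_req(prr_ij, n_req):
    # Eq. (4): request time, plus waits for further ADVs once the
    # per-ADV request limit lambda is exceeded
    return n_req * TAU_R / 2 + (n_req / LAMBDA - 1) * t_adv(prr_ij)

def rho(prrs_into_i):
    # Eq. (6): effective number of contending neighbors of node i
    return sum(prrs_into_i)

def t_data(prr_ij, rho_i, n_pkts=24):
    # Eq. (5): data time for one page, scaled by contention and loss
    return n_pkts * (1.0 / prr_ij) * T_PKT * rho_i

def t_page(prr_ij, rho_i, n_req, n_pkts=24):
    # Eq. (2): total single-hop page time
    return t_adv(prr_ij) + t_req(prr_ij, n_req) + t_data(prr_ij, rho_i, n_pkts)
```

For a perfect link (PRR = 1) with a single contender and N_req = λ = 2, the page time is dominated by the 4 s advertisement term, which is why lossy links and contention inflate T^{pg}_{ij} multiplicatively through 1/PRR_{ij} and ρ_i.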

4.3 Extension to MNP

It should be noted that our analytical model can easily be extended to bulk data dissemination protocols other than Deluge [1] by substituting protocol-specific factors. In this subsection, we extend our analytical method to a different dissemination protocol, MNP [2], by substituting the single-hop propagation time T^{pg}_{ij}. For MNP, the single-hop propagation time T^{pg}_{ij} is computed as follows [7]:

    T^{pg}_{ij} = T^{adv}_{ij} + T^{data}_{ij}    (7)

The time for requests is not included. The reason is that in MNP the advertising node sends a maximum number of κ advertisement messages to allow sender selection; thus, by the end of the waiting interval T^{adv}_{ij}, the advertising node has received at least one request message and is ready to serve a receiving node. The expected time for κ advertisement messages is given by

    T^{adv}_{ij} = κ · Δτ    (8)

where Δτ is the average duration between two advertisements. After T^{adv}_{ij}, the source sends the N_pkts packets of a page. It also sends a Start Download and an End Download message signifying the start and end of each page. Thus, the expected time for the data transmission of a page is given by

    T^{data}_{ij} = (N_pkts + 2) · (1 / PRR_{ij}) · T_pkt · ρ_i    (9)
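For MNP, Eqs. (7)–(9) merely replace the per-page terms. In the sketch below, κ and Δτ are protocol parameters whose concrete values are not given in this excerpt, so the defaults are illustrative placeholders of ours:

```python
def mnp_t_adv(kappa=5, delta_tau=1.0):
    # Eq. (8): kappa advertisement rounds, each delta_tau on average
    # (the default values here are placeholders, not calibrated ones)
    return kappa * delta_tau

def mnp_t_data(prr_ij, rho_i, n_pkts=24, t_pkt=0.018):
    # Eq. (9): page packets plus the Start/End Download messages
    return (n_pkts + 2) * (1.0 / prr_ij) * t_pkt * rho_i

def mnp_t_page(prr_ij, rho_i, n_pkts=24):
    # Eq. (7): advertisement plus data; no request term for MNP
    return mnp_t_adv() + mnp_t_data(prr_ij, rho_i, n_pkts)
```

Swapping this page-time function into the multihop recursion of Eq. (1) is all the model change MNP requires.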

5 Experiments and Evaluations

We use a bit-level simulator, TOSSIM [6], to conduct experiments and evaluations. In Sec. 5.1, we describe the profiling experiment to determine the PRR value per link. In Sec. 5.2, we present our parameter settings; specifically, we describe how we determine the number of request messages mentioned in Sec. 4. In Sec. 5.3, we validate the analytical results in both square structures and linear structures. In Sec. 5.4, we compare the computation time of our approach with the simulation time and the actual dissemination time. Finally, in Sec. 5.5, we demonstrate how our analytical model can be used to aid protocol design for performance optimization, e.g., page size tuning to shorten the completion time.




Fig. 4. The correlation between BER and PRR in TOSSIM, f3(·) : BER → PRR. [Figure: packet reception rate (PRR) plotted against bit error rate (BER).]

5.1 Profiling Experiments

In order to determine the profile f(·) : measurements → PRR, we need to conduct a set of measurement experiments. For a given radio, we can choose a specific measurable metric, e.g., RSSI or LQI; their suitability in correlating with PRR is discussed in [11], and the reader is referred to [11] for the correlation of RSSI and of LQI with PRR. The RSSI/LQI values can be measured by having each node take a turn sending a set of broadcast packets; for a given broadcasting sender, the rest of the nodes record the measurements. For an N-node network, this requires O(N) measurement steps and provides the metrics for all N(N − 1) links. In the following, we use TOSSIM for validation. As TOSSIM uses the bit error rate (BER) to describe a link, we determine the profile f3(·) : BER → PRR via the CountRadio benchmark in the TinyOS distribution [8]. Fig. 4 plots the correlation between BER and PRR in TOSSIM.

5.2 Parameter Settings

We set τ_l = 2 s, τ_r = 0.5 s, and λ = 2, as in [1]. We set T_pkt = 0.018 s, as derived from experiments in TOSSIM. Note that this value is close to the maximum packet transmission time of the CC1000 (the radio used in the Mica2 mote), i.e., (36 bytes/pkt × 8 bits/byte) / 19.2 kbps = 0.015 s/pkt.

We have also mentioned that the value of N_req is related to the weighted average PRR of a given network. In the following, we experimentally determine the value of N_req. Intuitively, N_req is related to the average PRR of a network: in a lossy network N_req will be large, and vice versa. In order to calculate the average PRR of a network, we first set a cutoff value to exclude poor links; this value is set to 0.1, the same as in [12]. However, we find that the average PRR cannot directly reflect the "connectivity" of the network.
For example, we find that the average PRR of a 10x2 linear structure with 10 feet spacing (hereafter, we use "d1"x"d2"-"spacing" to denote a grid topology) is even lower than that of a 10x2 linear structure with 15 feet spacing. The reason is that the network 10x2-10 also has a large number of intermediate-quality links, which lowers the average PRR.

Table 1. The weighted average PRR of a network

Topology   Weighted average PRR    Topology   Weighted average PRR
10x10-5    0.918                   10x2-5     0.98
10x10-10   0.825                   10x2-10    0.916
10x10-15   0.784                   10x2-15    0.86
10x10-20   0.428                   10x2-20    0.61

Fig. 5. Relationship of PRR_network and N_req. [Figure: N_req plotted against PRR.]

In order to address the above issue, we use the weighted average PRR of a network. We divide PRR into several discrete regions, e.g., (0.1, 0.2], . . . , (0.9, 1]. Each region is associated with a weight, defined as (the number of links whose PRR falls within that region) / (the total number of links). We associate each PRR with a weight function w(·) : PRR → weight; the weight of a PRR is the weight of the region the PRR falls within. Thus,

    PRR_network = Σ_{i,j ∈ network} PRR_{ij} · w(PRR_{ij})    (10)
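A possible implementation of the weighted average is sketched below. Eq. (10) as printed leaves the normalization implicit, so dividing by the total weight is our assumption to keep the result in [0, 1]; the 0.1 cutoff and the 0.1-wide regions follow the text.

```python
def weighted_average_prr(link_prrs, cutoff=0.1):
    # Keep only links above the cutoff (0.1, as in the text).
    links = [p for p in link_prrs if p > cutoff]

    # Regions (0.1, 0.2], ..., (0.9, 1.0]; the weight of a region is
    # the fraction of links whose PRR falls within it.
    def region(p):
        return min(int((p - cutoff) / 0.1), 8)

    counts = [0] * 9
    for p in links:
        counts[region(p)] += 1
    w = [c / len(links) for c in counts]

    # Weighted average; dividing by the total weight (our assumption)
    # normalizes the result into [0, 1].
    num = sum(p * w[region(p)] for p in links)
    den = sum(w[region(p)] for p in links)
    return num / den
```

Under this reading, link classes with many members dominate the result, so a handful of intermediate-quality links no longer drags the metric down the way it drags down the plain average; this matches the 10x2-10 versus 10x2-15 observation above.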

Table 1 shows the weighted average PRR of some typical networks. Finally, we run microbenchmarks to determine the relationship of PRR to N_req: we set up two nodes running the Deluge protocol under different PRR settings, and the result is plotted in Fig. 5.

5.3 Model Evaluation and Validation Experiments

In this section, we validate our analytical results against the simulation results of TOSSIM [6] in both square structures and linear structures, described below respectively.

Square Structures: In order to validate our analytical results in square structures, we first compare the completion times computed by our analytical approach with those of simulation. We have not included the analysis in [4] for comparison because that analysis does not consider protocol details (such as advertisement and request overheads) and is thus far from accurate. For example, according to [4] the completion time for the Blink application is nearly 4000 seconds even in a dense 10x10 square (i.e., PRR = 0.98), which does not hold in TOSSIM simulations. We use TOSSIM to generate the simulated completion times in a 10x10 square with 5, 10, 15, and 20 feet node spacing, respectively. We measure the completion time of disseminating the Blink application using Deluge; this application has 4 pages in total when the page size is 24 packets per page. Fig. 6 shows the comparison between the analytical results and the simulated results in a 10x10 square with different node spacings. We can see from Fig. 6 that the analytical results fit the simulated results very well.

Fig. 6. Overall completion time comparisons in a 10x10 square structure with different node spacings

Linear Structures: We compare our approach with Deluge's approach as described in [1] in linear structures. We disseminate the Blink application, segmented into 4 pages, in 10x2 linear structures with 5, 10, 15, and 20 feet spacing, respectively. Fig. 7 plots the completion times for our approach, Deluge's approach, and simulation. We can see that our approach fits better than Deluge's approach, especially in the sparse case, for two reasons. First, the number of request messages is fixed in Deluge's approach, which is not true in realistic settings. Second, Deluge's approach does not consider link quality per hop; note that a poor-quality link will dominate the propagation capacity along the path.

5.4 Computation Time

In order to compute the completion times of N nodes, we need to run Dijkstra's algorithm to determine the propagation path from the source to each of the nodes (O(N^2)); then we compute the completion times along the path for all the pages (O(NP)); we record the maximum completion time for a given node during this process. Thus, the overall complexity of our approach is

    N · O(N^2) · O(NP) = O(N^4 P)    (11)

We implement our approach in Perl on an IBM ThinkPad T43, and compare its computation time with the simulation time (in TOSSIM with the same environment) and the actual completion time for dissemination. Table 2 shows the results.

Fig. 7. Overall completion time comparisons in a 10x2 linear structure with different node spacings

Table 2. Computation time comparisons

Topology   Dissemination time (s)   Simulated time (s)   Analysis time (s)
2x2-5      24                       5                    0.001
5x5-5      46                       79                   0.016
8x8-5      102                      569                  0.14
10x10-5    266                      3188                 0.30
15x15-5    2098                     21600                1.05

We can see that our analytical approach is much faster than simulation, especially as the network scales. For example, for the 15x15-5 case, the simulation time is nearly 6 hours while our approach costs just over one second. This indicates that our approach can be applied to large-scale networks at very low time cost.

5.5 Performance Optimizations

Performance optimization is an important design goal for WSNs [13,14]. In this section, we show how our model can aid protocol design in shortening the dissemination time. Several factors affect the overall completion time of the Deluge protocol, e.g., the page size. There is a tradeoff in determining the page size for dissemination: a large page size (i.e., a small number of pages for a fixed-size image) limits pipelining for spatial multiplexing, while a small page size (i.e., a large number of pages for a fixed-size image) adds per-page negotiation overhead. With our analytical approach, we can easily evaluate the completion times under different page sizes. We illustrate the benefits of our approach on a 48x2-10 linear structure, disseminating the Blink application with 12 pkts/page, 24 pkts/page, and 48 pkts/page, respectively. Fig. 8 shows the completion times under different page sizes. Our analytical approach finds that the segmentation scheme with 24 pkts/page is optimal compared to the other two. The simulation results validate our analytical results, indicating that 24 pkts/page reduces the completion time by 16.6% compared to 48 pkts/page, the default page size in Deluge.
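The page-size tradeoff shows up even in a toy version of the model. The sketch below is a deliberate oversimplification of the analysis above (uniform per-hop page time and a single hypothetical per-page negotiation overhead chosen for illustration, not the paper's calibrated values); it still reproduces the qualitative result that an intermediate page size can win.

```python
import math

def toy_completion(pkts_total, pkts_per_page, hops,
                   t_pkt=0.018, overhead=1.5):
    # Idealized pipelined completion time on a linear path:
    # (hops + 3*(pages - 1)) page times, where one page time is a
    # fixed negotiation overhead plus its packet transmissions.
    # `overhead` is a hypothetical per-page ADV/REQ cost (ours).
    pages = math.ceil(pkts_total / pkts_per_page)
    t_page = overhead + pkts_per_page * t_pkt
    return (hops + 3 * (pages - 1)) * t_page

# A 96-packet image over a 47-hop line: the middle page size wins.
for size in (12, 24, 48):
    print(size, round(toy_completion(96, size, hops=47), 1))
```

Small pages multiply the fixed overhead across many pages, large pages stall the pipeline; the minimum falls in between, which is the effect the analytical model lets a designer find without simulation.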




Fig. 8. Shortening the completion time by page size tuning. [Figure: completion times from simulation and analysis for 12, 24, and 48 pkts/page.]

6 Related Work

Wireless sensor networks have recently gained a great deal of attention as a research topic, with a wide range of systems [8,15,16] and applications [17,18] being explored. Bulk data dissemination is a basic building block for sensor network applications. Deluge [1] is perhaps the most popular dissemination protocol used for reliable code updates in WSNs. It uses a three-way handshake and a NACK-based protocol for reliability, and employs segmentation (into pages) and pipelining for spatial multiplexing. It is highly optimized and can achieve one ninth of the maximum transmission rate of the radio supported under TinyOS. MNP [2] provides a detailed sender selection algorithm to choose a local source of the code that can satisfy the maximum number of nodes. There are other notable code dissemination protocols, including Sprinkler [19], Freshet [20], Stream [4], etc.

Generally, designers carry out simulations or testbed experiments to evaluate the performance of these protocols, and various tools have been proposed in the literature to aid system design and evaluation [21,22,6]. However, testbed experiments require at least a prototype system to be deployed; compared to prototyping, conducting offline analysis is inexpensive in terms of time and resources. Simulators allow designers to tune many different parameters and provide a fairly good resemblance of the real environment; compared to detailed simulation, theoretical analysis provides an alternative method of designing systems at lower cost. Although theoretical analysis of WSNs is a relatively new area, there is growing interest, and new types of analysis are continuously being developed [23,24,25].

With respect to bulk data dissemination, there are a few preliminary works on performance modeling and analysis. In [1], the authors analyze the performance of Deluge only in linear structures. In [4], the authors analyze the performance of code dissemination without consideration of protocol details, such as advertisement and request overheads. In [7], the authors propose a framework based on epidemic theory for vulnerability analysis of broadcast protocols in WSNs; they do not consider topological information or the impact of pipelining, so the result is not accurate. Compared to [1], our work is more general and can be applied to general network topologies. Compared to [4,7], our work is much more accurate because it considers topological information, the impact of contention, and the impact of pipelining. We validate the


analytical results with extensive simulations via TOSSIM. We believe that the attractive benefits of theoretical analysis towards system design will foster the development of a much needed theoretical foundation in the field of WSNs.

7 Conclusions

In this work, we show a way of mathematically analyzing the performance of bulk data dissemination protocols in WSNs. Our model can be applied to general networks through the use of the shortest propagation path, and it is accurate because it considers topological information, the impact of contention, and the impact of pipelining. We validate the analytical results through detailed simulations and find that they fit well with the simulation results. Further, we demonstrate that the analytical results can be used to aid protocol design for performance optimization, e.g., page size tuning to shorten the completion time.

Acknowledgement

This work is supported by the National Basic Research Program of China (973 Program) under grant No. 2006CB303000, and in part by an NSERC Discovery Grant under grant No. 341823-07.

References

1. Hui, J.W., Culler, D.: The dynamic behavior of a data dissemination protocol for network programming at scale. In: Proceedings of ACM SenSys (2004)
2. Kulkarni, S.S., Wang, L.: MNP: Multihop Network Reprogramming Service for Sensor Networks. In: Proceedings of IEEE ICDCS (2005)
3. Wang, Q., Zhu, Y., Cheng, L.: Reprogramming wireless sensor networks: Challenges and approaches. IEEE Network Magazine 20(3), 48–55 (2006)
4. Panta, R.K., Khalil, I., Bagchi, S.: Stream: Low Overhead Wireless Reprogramming for Sensor Networks. In: Proceedings of IEEE INFOCOM (2007)
5. Hagedorn, A., Starobinski, D., Trachtenberg, A.: Rateless Deluge: Over-the-Air Programming of Wireless Sensor Networks using Random Linear Codes. In: Proceedings of ACM/IEEE IPSN (2008)
6. Levis, P., Lee, N., Welsh, M., Culler, D.: TOSSIM: Accurate and Scalable Simulation of Entire TinyOS Applications. In: Proceedings of ACM SenSys (2003)
7. De, P., Liu, Y., Das, S.K.: An Epidemic Theoretic Framework for Evaluating Broadcast Protocols in Wireless Sensor Networks. In: Proceedings of IEEE MASS (2007)
8. TinyOS, http://www.tinyos.net
9. Kashyap, A., Ganguly, S., Das, S.R.: A Measurement-Based Approach to Modeling Link Capacity in 802.11-Based Wireless Networks. In: Proceedings of ACM MobiCom (2007)
10. Cormen, T.H., Leiserson, C.E., Rivest, R.L.: Introduction to Algorithms (1990)
11. Srinivasan, K., Levis, P.: RSSI is Under Appreciated. In: Proceedings of EmNets (2006)
12. Srinivasan, K., Kazandjieva, M.A., Agarwal, S., Levis, P.: The β-factor: Measuring Wireless Link Burstiness. In: Proceedings of ACM SenSys (2008)


13. Liu, X., Wang, Q., Sha, L., He, W.: Optimal QoS Sampling Frequency Assignment for Real-Time Wireless Sensor Networks. In: Proceedings of IEEE RTSS (2003)
14. Shu, W., Liu, X., Gu, Z., Gopalakrishnan, S.: Optimal Sampling Rate Assignment with Dynamic Route Selection for Real-Time Wireless Sensor Networks. In: Proceedings of IEEE RTSS (2008)
15. Bhatti, S., Carlson, J., Dai, H., Deng, J., Rose, J., Sheth, A., Shucker, B., Gruenwald, C., Torgerson, A., Han, R.: MANTIS OS: An Embedded Multithreaded Operating System for Wireless Micro Sensor Platforms. ACM/Kluwer Mobile Networks and Applications Journal (MONET), Special Issue on Wireless Sensor Networks 10, 563–579 (2005)
16. Dong, W., Chen, C., Liu, X., Zheng, K., Chu, R., Bu, J.: FIT: A Flexible, Light-Weight, and Real-Time Scheduling System for Wireless Sensor Platforms. In: Nikoletseas, S.E., Chlebus, B.S., Johnson, D.B., Krishnamachari, B. (eds.) DCOSS 2008. LNCS, vol. 5067, pp. 126–139. Springer, Heidelberg (2008)
17. Tolle, G., Polastre, J., Szewczyk, R., Culler, D., Turner, N., Tu, K., Burgess, S., Dawson, T., Buonadonna, P., Gay, D., Hong, W.: A Macroscope in the Redwoods. In: Proceedings of ACM SenSys (2005)
18. He, T., Krishnamurthy, S., Luo, L., Yan, T., Gu, L., Stoleru, R., Zhou, G., Cao, Q., Vicaire, P., Stankovic, J.A., Abdelzaher, T.F., Hui, J., Krogh, B.: VigilNet: An Integrated Sensor Network System for Energy-Efficient Surveillance. ACM Transactions on Sensor Networks (TOSN) 2(1), 1–38 (2006)
19. Naik, V., Arora, A., Sinha, P., Zhang, H.: Sprinkler: A Reliable and Energy Efficient Data Dissemination Service for Wireless Embedded Devices. In: Proceedings of IEEE RTSS (2005)
20. Krasniewski, M.D., Panta, R.K., Bagchi, S., Yang, C.L., Chappell, W.J.: Energy-efficient on-demand reprogramming of large-scale sensor networks. ACM Transactions on Sensor Networks (TOSN) 4(1), 1–38 (2008)
21. Werner-Allen, G., Swieskowski, P., Welsh, M.: MoteLab: A Wireless Sensor Network Testbed. In: Proceedings of ACM/IEEE IPSN/SPOTS (2005)
22. Titzer, B.L., Lee, D.K., Palsberg, J.: Avrora: Scalable Sensor Network Simulation with Precise Timing. In: Proceedings of ACM/IEEE IPSN (2005)
23. Prasad, V., Yan, T., Jayachandran, P., Li, Z., Son, S.H., Stankovic, J.A., Hansson, J., Abdelzaher, T.: ANDES: an ANalysis-based DEsign tool for wireless Sensor networks. In: Proceedings of IEEE RTSS (2007)
24. Cao, Q., Yan, T., Stankovic, J.A., Abdelzaher, T.F.: Analysis of Target Detection Performance for Wireless Sensor Networks. In: Proceedings of IEEE/ACM DCOSS (2005)
25. Ahn, J., Krishnamachari, B.: Performance of a Propagation Delay Tolerant ALOHA Protocol for Underwater Wireless Networks. In: Proceedings of IEEE/ACM DCOSS (2008)

Author Index

Abdelzaher, Tarek 44, 131
Agarwal, Pankaj K. 301
Ahmadi, Hossein 131
Akl, Selim G. 343
Alcock, Paul 329
Annavaram, Murali 273
Bachrach, Jonathan 15
Bar-Noy, Amotz 28, 245
Bathula, Manohar 216
Beal, Jacob 15
Bonnet, Philippe 72
Brown, Theodore 245
Bu, Jiajun 356
Chang, Marcus 72
Chen, Chun 356
Childers, Bruce 259
Dong, Wei 356
Ertin, Emre 287
Eswaran, Sharanya 87
Ezra, Esther 301
Fleury, Eric 201
Gallagher, Jonathan 287
Ganjugunte, Shashidhara K. 301
Ghasemzadeh, Hassan 145
Ghosh, Sabyasachi 273
Gotschall, Joe 216
Guenterberg, Eric 145
Gupta, Rajesh 103
Guru, Siddeswara Mayura 187
Halgamuge, Saman 187
Han, Jiawei 131
Hazas, Mike 329
Huguenin, Kévin 201
Islam, Kamrul 343
Jafari, Roozbeh 145
Jin, Zhong-Yi 103
Johnson, Matthew P. 28, 87, 245
Kanj, Iyad A. 315
Kanso, Mohammad A. 173
Kaplan, Lance 28
Kermarrec, Anne-Marie 201
Khan, Mohammad Maifi Hasan 131
La Porta, Thomas 28, 87
Le, Hieu Khac 44
Li, Ming 273
Li, Weijia 259
Lin, Chun Lung 58
Ling, Hui 117
Liu, Ou 245
Liu, Xue 356
Liu, Yunhao 356
Loseu, Vitali 145
Maheswararajah, Suhinthan 187
Meier, Dominic 1
Meijer, Henk 343
Misra, Archan 87
Mitra, Urbashi 273
Moses, Randolph 287
Narayanan, Shri 273
Noh, Dong Kun 44
Patel, Nilesh 216
Pignolet, Yvonne Anne 1
Pizzocaro, Diego 28
Pradhan, Ishu 216
Preece, Alun 28
Rabbat, Michael G. 173
Ramezanali, Mehrdad 216
Roedig, Utz 329
Rowaihy, Hosam 28
Rozgic, Viktor 273
Schmid, Stefan 1
Shu, Yanfeng 187
Silva, Agnelo R. 231
Spruijt-Metz, Donna 273
Sridhar, Nigamanth 216
Stankovic, John A. 159
Terzis, Andreas 72
Thatte, Gautam 273
Tobenkin, Mark 15
Vickery, Dan 15
Vuran, Mehmet C. 231
Wang, Jia Shung 58
Wang, Lili 44
Wattenhofer, Roger 1
Wood, Anthony D. 159
Xia, Ge 315
Yang, Yong 44
Zhang, Fenghui 315
Zhang, Youtao 259
Znati, Taieb 117