Electric Power System Applications of Optimization
POWER ENGINEERING Series Editor
H. Lee Willis ABB Electric Systems Technology Institute Raleigh, North Carolina
1. Power Distribution Planning Reference Book, H. Lee Willis
2. Transmission Network Protection: Theory and Practice, Y. G. Paithankar
3. Electrical Insulation in Power Systems, N. H. Malik, A. A. Al-Arainy, and M. I. Qureshi
4. Electrical Power Equipment Maintenance and Testing, Paul Gill
5. Protective Relaying: Principles and Applications, Second Edition, J. Lewis Blackburn
6. Understanding Electric Utilities and De-Regulation, Lorrin Philipson and H. Lee Willis
7. Electrical Power Cable Engineering, William A. Thue
8. Electric Systems, Dynamics, and Stability with Artificial Intelligence Applications, James A. Momoh and Mohamed E. El-Hawary
9. Insulation Coordination for Power Systems, Andrew R. Hileman
10. Distributed Power Generation: Planning and Evaluation, H. Lee Willis and Walter G. Scott
11. Electric Power System Applications of Optimization, James A. Momoh
ADDITIONAL VOLUMES IN PREPARATION
Aging Power Delivery Infrastructures, H. Lee Willis, Gregory V. Welch, and Randall R. Schrieber
Electric Power System Applications of Optimization
James A. Momoh
Howard University
Washington, D.C.
MARCEL DEKKER, INC.
NEW YORK • BASEL
ISBN: 0-8247-9105-3
This book is printed on acid-free paper.

Headquarters
Marcel Dekker, Inc.
270 Madison Avenue, New York, NY 10016
tel: 212-696-9000; fax: 212-685-4540

Eastern Hemisphere Distribution
Marcel Dekker AG
Hutgasse 4, Postfach 812, CH-4001 Basel, Switzerland
tel: 41-61-261-8482; fax: 41-61-261-8896

World Wide Web
http://www.dekker.com

The publisher offers discounts on this book when ordered in bulk quantities. For more information, write to Special Sales/Professional Marketing at the headquarters address above.

Copyright © 2001 by Marcel Dekker, Inc. All Rights Reserved.

Neither this book nor any part may be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopying, microfilming, and recording, or by any information storage and retrieval system, without permission in writing from the publisher.

Current printing (last digit): 10 9 8 7 6 5 4 3 2 1

PRINTED IN THE UNITED STATES OF AMERICA
Series Introduction
Power engineering is the oldest and most traditional of the various areas within electrical engineering, yet no other facet of modern technology is currently undergoing a more dramatic revolution in both technology and industry structure. One of the more impressive areas of technical improvement over the past twenty years has been the emergence of powerful and practical numerical optimization methods for power-system engineering and operation, methods that ensure that the very best electrical and financial performance can be attained. The value contributed by the use of optimization in power systems is considerable, both in terms of economics (it has saved, and will continue to save, literally hundreds of millions of dollars annually) and in terms of operational reliability and security. Systems run with optimization-based monitoring and control react better both to expected patterns in power demand and equipment availability and to unexpected events such as storm damage and sudden equipment failure.
But despite its potential for practical application and the tremendous value it can provide, optimization remains something of a “black art” to many power engineers, in part because there is no accessible reference that covers methods and their application. This newest addition to Marcel Dekker’s Power Engineering Series, Electric Power System Applications of Optimization, meets this need. Certainly no one is more qualified to write such a book than James A. Momoh, who has long been associated with both productive research and rigorous, practical application of optimization to power-system planning, engineering, and operations. This book is an excellent text both for the experienced power engineer who desires a single consolidated reference on optimization and for the beginner who wishes to “ramp up” quickly so that he or she can use optimization to its best advantage. Electric Power System Applications of Optimization provides a firm foundation and a uniform technical treatment of the different optimization methods available, their advantages and disadvantages, and the intricacies involved in their application to different power-system problems. The book is particularly easy to use, because it is thorough and uses a sound and consistent terminology and perspective throughout. This permits both novice and expert to understand and apply optimization on a dependable basis.
The Power Engineering Series includes books covering the entire field of power engineering, in all of its specialties and sub-genres, all aimed at providing practicing power engineers with the knowledge and techniques they need to meet the electric industry’s challenges in the 21st century. Like all the books in the Marcel Dekker Power Engineering Series, Momoh’s Electric Power System Applications of Optimization provides modern power technology in a context of proven, practical application. It is useful as a reference book as well as for self-study and advanced classroom use.
H. Lee Willis
Preface
Electric Power System Applications of Optimization is intended to introduce optimization, system theory, foundations of different mathematical programming techniques, and application to selected areas of electrical power engineering. The idea is to present theoretical background material from a practical power-system point of view and then proceed to explore applications of optimization techniques, new directions, and continuous application problems. The need for such a book stems from the extensive and diverse literature on optimization methods for solving different classes of utility operations and planning problems. Optimization concepts and algorithms were first introduced to power-system dispatching, resource allocation, and planning in the mid-sixties in order to mathematically formalize decision-making with regard to the myriad of objectives subject to technical and nontechnical constraints. There has been a phenomenal increase in research activities aimed at solving dispatch and resource allocation problems and at planning optimally. This increase has been facilitated by several research projects (theoretical papers usually aimed at operations research communities) that promote usage of commercial programs for power-system problems but do not provide any relevant information for power engineers working on the development of power-system optimization algorithms. Most recently, there has been a tremendous surge in research application, with articles on how to apply optimization in electric power engineering. However, there is currently no book that serves as a practical guide to the fundamental and application aspects of optimization for power-system work. This book is intended to meet the needs of a diverse range
of groups interested in optimization application. They include university faculty, researchers, students, and developers of power systems who are interested in or who plan to use optimization as a tool for planning and operation. The focus of the book is exclusively on the development of optimization methods, foundations, and algorithms and their application to power systems. This focus was based on the following factors. First, good references that survey optimization techniques for planning and operation are currently available, but they do not detail theoretical formulation in one complete environment. Second, optimization analysis has become so complex that the examples studied often deal only with non-power-system problems, and many issues are covered by only a few references for the utility industry. Finally, in the last decade, new optimization technologies such as interior point methods and genetic algorithms have been successfully introduced to deal with issues of computation and have been applied to new areas in power-system planning and operation.
The subject matter presented in this book provides both the analytical formulation of optimization and various algorithmic issues that arise in the application of various methods in power-system planning and operation. In addition, the book provides a blend of theoretical approach and application based on simulation.
In Chapter 2, we review electric power-system models, power-system component modeling, reactive capabilities, ATC, and AGC. The chapter concludes with illustrated examples. In Chapter 3, we introduce the theoretical concepts and algorithms for power-flow computation using different numerical methods, with illustrative examples and applications for practical simulation studies. To treat the problem of optimization in one easy, concise form, Chapter 4 deals with classical unconstrained and constrained techniques with simple applications to power systems. This chapter concludes with illustrative examples.
Chapter 5 is dedicated to linear programming theory and methods, their extension to integer programming and postoptimization (sensitivity analysis), and their application to power systems, with illustrative examples. Chapter 6 deals with new trends in optimization theory such as interior point optimization methods for both linear and quadratic formulation. The chapter includes examples and applications to power systems. In Chapter 7, we discuss nonlinear programming techniques and their extension to recent interior point methods such as barrier methods. The computational algorithm for each of the nonlinear programming variants is presented. Chapter 8 presents the dynamic programming optimization algorithm with illustrative examples. In Chapter 9, the Lagrangian relaxation concept and algorithm are discussed. Their applicability to unit commitment and
resource allocation is described. In Chapter 10, the decomposition method for solving large-scale optimization problems is presented. Illustrative examples are given following the procedure. In Chapter 11, optimal power flow, modeling, and selected programming techniques derived from earlier chapters are used for solving difficult objective functions with constraints in power-system operation and planning. Illustrative examples are included. Chapter 12 addresses unit commitment concepts, formulation, and algorithms. Examples and applications to power-system dispatching are presented. In Chapter 13, genetic algorithms (GA) are presented as tools for optimization. I discuss the definition of GA computation, approach, and algorithm. Application areas of genetic algorithms as a computational tool in power-system operation and planning are described.
It is hoped that the application areas discussed in this book will offer the reader an overview of classical optimization methods without sacrificing the rudiments of the theory. Those working in the field or willing to engage in optimal power flow will find the material useful and interesting as a reference or as a good starting point to engage in power-system optimization studies.
A significant portion of the material presented in the book is derived from sponsored projects, professional society meetings, panel sessions, and popular texts in operations research in which I have had personal involvement. These include research and development efforts, which were generally supported by funding agencies such as the Electric Power Research Institute (EPRI), the National Science Foundation (NSF), and Howard University. I wish to acknowledge the significant contribution made by the engineers of Bonneville Power Administration (BPA), Commonwealth Edison (ComEd), and the Department of Energy (DOE) in the development and testing of optimal power flow using variants of optimization techniques such as genetic algorithms and interior point methods.
This book would not have been possible without the help of the students in the optimization and power-system group at Howard University and the CESaC research staff, who provided dedicated support in optimal power-flow algorithm testing, in problem-solving, and in the task of preparing this book for publication. I remain in debt to my colleagues for their keen interest in the development of this book, specifically, Professor Kenneth Fegley of the University of Pennsylvania, Professor Bruce Wollenberg of the University of Minnesota, Professor Emeritus Hua Ting Chieh of Howard University, and Professor Mohamed El-Hawary of Dalhousie University, who offered valuable criticism of the book during the preparation stage. Finally, I wish to thank Mrs. Lee Mitchell of Howard University for proofreading and the admirable students in the CESaC family for helping to
type. Finally, I offer my deepest personal thanks to those closest to me who have provided support during the time-consuming process of writing this book.
James A. Momoh
Contents

Series Introduction (H. Lee Willis)
Preface

1. Introduction
    I. Structure of a Generic Electric Power System
    II. Power System Models
    III. Power System Control
    IV. Power System Security Assessment
    V. Power System Optimization as a Function of Time
    VI. Review of Optimization Techniques Applicable to Power Systems
    References

2. Electric Power System Models
    I. Introduction
    II. Complex Power Concepts
    III. Three-Phase Systems
    IV. Per Unit Representation
    V. Synchronous Machine Modeling
    VI. Reactive Capability Limits
    VII. Prime Movers and Governing Systems
    VIII. Automatic Gain Control
    IX. Transmission Subsystems
    X. Y-Bus Incorporating the Transformer Effect
    XI. Load Models
    XII. Available Transfer Capability (ATC)
    XIII. Illustrative Examples
    XIV. Conclusions
    XV. Problem Set
    References

3. Power-Flow Computations
    I. Introduction
    II. Types of Buses for Power-Flow Studies
    III. General Form of the Power-Flow Equations
    IV. Practical Modeling Considerations
    V. Iterative Techniques for Power-Flow Solution
    VI. Practical Applications of Power-Flow Studies
    VII. Illustrative Examples
    VIII. Conclusion
    IX. Problem Set
    References

4. Constrained Optimization and Applications
    I. Introduction
    II. Theorems on the Optimization of Constrained Functions
    III. Procedure for Optimizing Constrained Problems (Functions)
    IV. Illustrative Problems
    V. Power Systems Application Examples
    VI. Illustrative Examples
    VII. Conclusion
    VIII. Problem Set
    References

5. Linear Programming and Applications
    I. Introduction
    II. Mathematical Model and Nomenclature in Linear Programming
    III. Linear Programming Solution Techniques
    IV. Duality in Linear Programming
    V. Mixed Integer Programming
    VI. Sensitivity Methods for Postoptimization in Linear Programming
    VII. Power Systems Applications
    VIII. Illustrative Examples
    IX. Conclusion
    X. Problem Set
    References

6. Interior Point Methods
    I. Introduction
    II. Karmarkar’s Algorithm
    III. The Projection Scaling Method
    IV. The Dual Affine Algorithm
    V. The Primal Affine Algorithm
    VI. The Barrier Algorithm
    VII. Extended Interior Point Method for LP Problems
    VIII. Feasible Interior Sequence
    IX. Extended Quadratic Programming Using Interior Point (EQIP) Method
    X. Illustrative Examples
    XI. Conclusions
    XII. Problem Set
    References

7. Nonlinear Programming
    I. Introduction
    II. Classification of NLP Problems
    III. Sensitivity Method for Solving NLP Variables
    IV. Algorithm for Quadratic Optimization
    V. Barrier Method for Solving NLP
    VI. Illustrative Examples
    VII. Conclusion
    VIII. Problem Set
    References

8. Dynamic Programming
    I. Introduction
    II. Formulation of Multistage Decision Process
    III. Characteristics of Dynamic Programming
    IV. Concept of Suboptimization and the Principle of Optimality
    V. Formulation of Dynamic Programming
    VI. Backward and Forward Recursion
    VII. Computational Procedure in Dynamic Programming
    VIII. Computational Economy in DP
    IX. Systems with More than One Constraint
    X. Conversion of a Final Value Problem into an Initial Value Problem
    XI. Illustrative Examples
    XII. Conclusions
    XIII. Problem Set
    References

9. Lagrangian Relaxation
    I. Introduction
    II. Concepts
    III. The Subgradient Method for Setting the Dual Variables
    IV. Setting Tk
    V. Comparison with Linear Programming-Based Bounds
    VI. An Improved Relaxation
    VII. Summary of Concepts
    VIII. Past Applications
    IX. Summary
    X. Illustrative Examples
    XI. Conclusions
    XII. Problem Set
    References

10. Decomposition Method
    I. Introduction
    II. Formulation of the Decomposition Problem
    III. Decomposition Algorithm of Technique
    IV. Illustrative Example of the Decomposition Technique
    V. Conclusions
    VI. Problem Set
    References

11. Optimal Power Flow
    I. Introduction
    II. OPF-Fuel Cost Minimization
    III. OPF-Active Power Loss Minimization
    IV. OPF-VAr Planning
    V. OPF-Adding Environmental Constraints
    VI. Commonly Used Optimization Technique (LP)
    VII. Commonly Used Optimization Technique (NLP)
    VIII. Illustrative Examples
    IX. Conclusions
    X. Problem Set
    References

12. Unit Commitment
    I. Introduction
    II. Formulation of Unit Commitment
    III. Optimization Methods
    IV. Illustrative Example
    V. Updating λ_n(t) in the Unit Commitment Problem
    VI. Unit Commitment of Thermal Units Using Dynamic Programming
    VII. Illustrative Problems
    VIII. Conclusions
    IX. Problem Set
    References

13. Genetic Algorithms
    I. Introduction
    II. Definition and Concepts Used in Genetic Computation
    III. Genetic Algorithm Approach
    IV. Theory of Genetic Algorithms
    V. The Schemata Theorem
    VI. General Algorithm of Genetic Algorithms
    VII. Application of Genetic Algorithms
    VIII. Application to Power Systems
    IX. Illustrative Examples
    X. Conclusions
    XI. Problem Set
    References

Epilog

Index
Chapter 1 Introduction
I. STRUCTURE OF A GENERIC ELECTRIC POWER SYSTEM
While no two electric power systems are alike, all share some common fundamental characteristics, including the following.
1. Electric power is generated using synchronous machines that are driven by turbines (steam, hydraulic, diesel, or internal combustion).
2. Generated power is transmitted from the generating sites over long distances to load centers that are spread over wide areas.
3. Three-phase AC systems comprise the main means of generation, transmission, and distribution of electric power.
4. Voltage and frequency levels are required to remain within tight tolerance levels to ensure a high-quality product.
The basic elements of a generic electric power system are displayed in Figure 1.1. Electric power is produced at generating stations (GS) and transmitted to consumers through an intricate network of apparatus including transmission lines, transformers, and switching devices. A transmission network is classified as:
1. transmission system,
2. subtransmission system, and
3. distribution system.
FIGURE 1.1 Basic elements of a power system.
The transmission system interconnects all major generating stations and main load centers in the system. It forms the backbone of the integrated power system and operates at the highest voltage levels (typically, 230 kV and above). The generator voltages are usually in the range of 11 kV to 35 kV. These are stepped up to the transmission voltage level, and power is transmitted to transmission substations where the voltages are stepped down to the subtransmission level (typically, 69 kV to 138 kV). The generation and transmission subsystems are often referred to as the bulk power system.
The subtransmission system transmits power at a lower voltage and in smaller quantities from the transmission substation to the distribution substations. Large industrial customers are commonly supplied directly from the subtransmission system. In some systems, as they expand and higher voltage levels become necessary for transmission, the older transmission lines are often relegated to subtransmission function. The distribution system is the final stage in the transfer of power to the individual customers. The primary distribution voltage is typically between 4.0 and 34.5 kV. Primary feeders at this voltage level supply small industrial customers. The secondary distribution feeders supply residential and commercial customers at 120/240 V.
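The typical voltage ranges quoted in this section can be collected into a small lookup. The following Python sketch is only an illustration: the function name and constants are hypothetical, the bounds are the "typical" figures given above rather than fixed standards, and voltages that fall between the quoted ranges are simply reported as unclassified.

```python
import math

# Typical ranges quoted in the text, in kV. Real systems vary, so these
# bounds are illustrative assumptions, not standards.
TRANSMISSION_MIN_KV = 230.0
SUBTRANSMISSION_RANGE_KV = (69.0, 138.0)
PRIMARY_DISTRIBUTION_RANGE_KV = (4.0, 34.5)
SECONDARY_DISTRIBUTION_KV = (0.120, 0.240)


def classify_voltage_level(kv: float) -> str:
    """Map a nominal voltage in kV to the subsystem that typically uses it."""
    if kv >= TRANSMISSION_MIN_KV:
        return "transmission"
    low, high = SUBTRANSMISSION_RANGE_KV
    if low <= kv <= high:
        return "subtransmission"
    low, high = PRIMARY_DISTRIBUTION_RANGE_KV
    if low <= kv <= high:
        return "primary distribution"
    if any(math.isclose(kv, v) for v in SECONDARY_DISTRIBUTION_KV):
        return "secondary distribution"
    # Falls in a gap between the typical ranges quoted in the text.
    return "unclassified"


for kv in (500.0, 115.0, 13.8, 0.240, 50.0):
    print(f"{kv:8.3f} kV -> {classify_voltage_level(kv)}")
```

Voltage alone does not identify the role of a circuit: generator terminal voltages (11 kV to 35 kV) overlap the primary distribution range, so a lookup of this kind is a rough guide only.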
II. POWER SYSTEM MODELS
In order to be able to control the power system from the point of view of security, one is required to know two basic things: the mathematical models of the system and the variety of control functions and associated objectives that are used. In this section some general remarks about power system models are given.
In Figure 1.2 we show the basic decomposition of the system into a set of generators that deliver electrical power to the distributed load by means of a network. In our subsequent discussion we start by describing the load, then the network, and finally the generators.
In assessing load behavior as seen from a substation, one is interested typically in items such as the following.
    Present value of real power consumed in MW and the associated power factor (or reactive power).
    Forecast values of real and reactive power over a range of future times, from the next few minutes to days and years.
    Load response characteristics (e.g., lumped circuit representation or transfer function) for fluctuations in substation voltage and frequency.
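The first item says that a load can be described either by its real power and power factor or by its real and reactive power. The two descriptions are linked by the standard relations Q = P tan(φ) and S = P / cos(φ), where cos(φ) is the power factor. The Python sketch below illustrates the conversion; the function name is hypothetical, and the sign convention (a lagging load absorbs positive reactive power) is an assumption stated here rather than taken from the text.

```python
import math


def load_components(p_mw: float, power_factor: float, lagging: bool = True):
    """Return (Q in MVAr, S in MVA) for a load of p_mw at the given power factor.

    Assumed convention: a lagging (inductive) load absorbs positive Q,
    a leading (capacitive) load absorbs negative Q.
    """
    if not 0.0 < power_factor <= 1.0:
        raise ValueError("power factor must lie in (0, 1]")
    phi = math.acos(power_factor)      # power-factor angle in radians
    q_mvar = p_mw * math.tan(phi)      # reactive power magnitude
    if not lagging:
        q_mvar = -q_mvar
    s_mva = p_mw / power_factor        # apparent power
    return q_mvar, s_mva


q, s = load_components(80.0, 0.8)
print(f"P = 80.0 MW, Q = {q:.1f} MVAr, S = {s:.1f} MVA")
```

For an 80 MW load at 0.8 power factor lagging, the relations give Q = 60 MVAr and S = 100 MVA, the familiar 3-4-5 power triangle.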
By knowing the real and reactive components of substation loads (present or forecast), one can establish a complete picture for a steady-state bulk power demand in the system. Furthermore, by identifying load response characteristics one can adequately evaluate the dynamic behavior of the demand in the presence of disturbances and/or feedback controllers. The network portion of the system consists of the transmission, subtransmission, and distribution networks. This division is based on voltage levels, and, consequently, on the ratings of various circuits. Typically, power
FIGURE 1.2 Subsystems of a power system and associated controls.
transmission is done at voltages that can range from 115 to 765 kV. The transmission network is not necessarily radial. In fact, it has many closed loops as required for reliable supply purposes. Subtransmission (65-40 kV) and distribution (20-115 kV) systems are primarily radial when operated. Because of this arrangement, analysis at the bulk power level considers only the transmission portion of the network. From a transmission substation, real and reactive power will flow radially to the load through a sequence of step-down transformers and power lines. In modeling the network elements, one should identify the type of problem being analyzed. Under normal conditions, the load fluctuates very slowly compared to transient time constants associated with transmission lines. And since system frequency is maintained at its nominal value
quite accurately, the lumped circuit representation of transmission lines is quite adequate. On the other hand, if electromagnetic transients resulting, for example, from lightning strikes are being investigated, then wave equations should be considered. In our present context, the lumped circuit representation is adequate.
In a block diagram of a typical generator, the blocks correspond to the main components of the power plant. The significant outputs of the generator as measured at the terminals are its:
    Voltage magnitude (in kV),
    Real power produced (in MW),
    Reactive power produced (in MVAr), and
    Speed (in radians per second, denoted by ω = 2πf, where f is the frequency).
Under normal conditions, three of these quantities are continuously controlled by the power plant. These are the terminal voltage, the frequency (speed), and the real power output. The output voltage V is subtracted from the specified voltage V0, and the difference is an error signal that drives the exciter system. The exciter, in turn, modifies the field voltage of the turboalternator in such a way that V becomes closer in value to V0. The same feedback concept applies to the control of frequency and real power. In this case, however, the corresponding error signal drives the governor system. The governor controls valve openings of the turbine, which in turn control its speed and mechanical power output. The power error signal can also go back to the prime mover (boiler, in the case of steam generation), so that more, or less, steam is generated. The exciter normally has a fast response (0.01 to 0.1 sec). The governor-turbine system is slower (0.1 to 1 sec) in its response. However, since the load is much slower in its changes than these response times, it is safe to assume that perfect control is always present; that is, normal operation is, to a high degree, sinusoidal steady-state operation.
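The error-driven feedback idea described above can be illustrated with a toy first-order model, dV/dt = (V0 - V) / τ, in which the controlled quantity moves toward its set-point at a rate proportional to the error. This is only a sketch under stated assumptions: the function name, the per-unit values, and the single time constant τ are hypothetical stand-ins, not a model of an actual exciter or governor. The two time constants are picked from within the response-time ranges quoted in the text.

```python
def simulate_setpoint_tracking(v_set: float, v_init: float, tau: float,
                               t_end: float, dt: float = 1e-3) -> float:
    """Forward-Euler integration of dV/dt = (v_set - V) / tau.

    tau is an assumed closed-loop time constant in seconds; the return
    value is V at time t_end.
    """
    v = v_init
    steps = int(round(t_end / dt))
    for _ in range(steps):
        error = v_set - v          # error signal, e.g. V0 - V
        v += dt * error / tau      # controller drives V toward v_set
    return v


# Per-unit set-point 1.0, starting 5% low; compare two loop speeds.
fast = simulate_setpoint_tracking(1.0, 0.95, tau=0.05, t_end=0.5)  # exciter-like
slow = simulate_setpoint_tracking(1.0, 0.95, tau=0.5, t_end=0.5)   # governor-like
print(f"remaining error after 0.5 s: fast loop {1.0 - fast:.2e}, slow loop {1.0 - slow:.2e}")
```

After half a second the fast loop has essentially reached its set-point while the slow loop is still about a third of the way from it. If the load changes on a scale of minutes, both loops settle long before the load moves appreciably, which is the reasoning behind the steady-state assumption.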
Only when events occur that are fast relative to governor-turbine or exciter response times need one question the steady-state assumption. Thus, in the presence of network faults, or immediately following switching operations, one needs to consider a transient or dynamic representation of the system.
III. POWER SYSTEM CONTROL
Before discussing how the system is controlled, one needs to briefly summarize the means by which control action is obtained. First, let us understand the meaning of control as it applies in the power system case. The system is normally designed so that certain quantities can be manipulated by means of devices. Some of these are so-called
status quantities. By means of power circuit breakers (PCBs), a transmission line is open (status = OFF) or closed (status = ON). Some of them are integer variables (tap-settings on power transformers). And the rest are continuous variables such as the real power output of a generator. The control devices can be simple, like fuses, or highly complex dynamic systems, like exciters and governors. Control action is attained by the manipulation of all control devices that exist on the system. This is achieved in order to meet different, but consistent, objectives, and through a variety of means. The general objectives of system control are listed in order as follows.
1. Protection of major pieces of equipment and of system integrity
2. Continuity of high-quality service
3. System secure operation
4. System economic and environmentally acceptable operation
5. Emergency state control
6. Restorative control in minimum time
As a rule, control action is based on information derived from direct measurements and/or inferred data. Each control device will require certain kinds of information based on the following considerations.
    Speed-of-response requirements
    Impact of control action (i.e., global vs. local)
    Relative importance of different pieces of information (e.g., local vs. distant information)
Some examples of this are now in order. For a short circuit fault on a transmission line, the main objective is to keep the system from going unstable (losing synchronism) and to protect the line from serious damage. This is achieved by correct breaker action, which will open that line and isolate it from the system. Normally, however, other neighboring lines and transformers feel the effect of the short circuit. Hence, it is important to open the faulted line first. By means of offline short circuit analysis, relay settings are established so that the faulted line will open first. Hence, the only online information needed for that purpose is line current. This is strictly local information.
In a more complicated situation, we can look at the problem of maintaining a satisfactory voltage profile in the system. Scheduled generator terminal voltage is attained by means of local feedback control to the exciter. The values of scheduled voltages, which are the set-points in the feedback loop, are established from an analysis of the entire system’s operating conditions. In most cases, offline analysis of the system yields values of scheduled voltages. Modern energy control centers (ECC) have the capability of processing global online information and updating voltage profile set-points.
The function of an electric power system is to convert energy from one of the naturally available forms to the electrical form and to transport it to the points of consumption. Energy is seldom consumed in the electrical form but is rather converted to other forms such as heat, light, and mechanical energy. The advantage of the electrical form of energy is that it can be transported and controlled with relative ease and with a high degree of efficiency and reliability. A properly designed and operated power system should, therefore, meet the following fundamental requirements.
1. The system must be able to meet the continually changing load demand for active and reactive power. Unlike other types of energy, electricity cannot be conveniently stored in sufficient quantities. Therefore, adequate “spinning” reserve of active and reactive power should be maintained and appropriately controlled at all times.
2. The system should supply energy at minimum cost and with minimum ecological impact.
3. The “quality” of the power supply must meet certain minimum standards with regard to the following factors:
    a. constancy of frequency,
    b. constancy of voltage, and
    c. level of reliability.
Several levels of controls involving a complex array of devices are used to meet the above requirements. These are depicted in Figure 1.2, which identifies the various subsystems of a power system and the associated controls. In this overall structure, there are controllers operating directly on individual system elements. In a generating unit these consist of prime mover controls and excitation controls. The prime mover controls are concerned with speed regulation and control of energy supply system variables such as boiler pressures, temperatures, and flows. The function of the excitation control is to regulate generator voltage and reactive power output.
The desired MW outputs of the individual generating units are determined by the system generation control. The primary purpose of the system generation control is to balance the total system generation against system load and losses so that the desired frequency and power interchange with neighboring systems (tie flows) are maintained. The transmission controls include power and voltage control devices, such as static VAr compensators, synchronous condensers, switched capacitors and reactors, tap-changing transformers, phase-shifting transformers, and HVDC transmission controls.
The controls described above contribute to the satisfactory operation of the power system by maintaining system voltages, frequency, and other system variables within their acceptable limits. They also have a profound effect on the dynamic performance of the power system and on its ability to cope with disturbances. The control objectives are dependent on the operating state of the power system. Under normal conditions, the control objective is to operate as efficiently as possible with voltages and frequency close to nominal values. When an abnormal condition develops, new objectives must be met to restore the system to normal operation.
Major system failures are rarely the result of a single catastrophic disturbance causing collapse of an apparently secure system. Such failures are usually brought about by a combination of circumstances that stress the network beyond its capability. Severe natural disturbances (such as a tornado, severe storm, or freezing rain), equipment malfunction, human error, and inadequate design combine to weaken the power system and eventually lead to its breakdown. This may result in cascading outages that must be contained within a small part of the system if a major blackout is to be prevented.
Protecting isolated systems has been a relatively simple task, which is carried out using overcurrent directional relays with selectivity being obtained by time grading. High-speed relays have been developed to meet the increased short-circuit currents due to the larger size units and the complex interconnections. For reliable service, an electric power system must remain intact and be capable of withstanding a wide variety of disturbances.
It is essential that the system be operated so that the more probable contingencies can be sustained without loss of load (except that connected to the faulted element) and so that the most adverse possible contingencies do not result in widespread and cascading power interruptions.
The November 1965 blackout in the northeastern part of the United States and Ontario had a profound impact on the electric utility industry. Many questions were raised and led to the formation of the National Electric Reliability Council in 1968. The name was later changed to the North American Electric Reliability Council (NERC). Its purpose is to augment the reliability and adequacy of bulk power supply in the electricity systems of North America. The NERC is composed of nine regional reliability councils and encompasses virtually all the power systems in the United States and Canada. Each regional council has established reliability criteria for system design and operation. Since differences exist in geography, load pattern, and power sources, criteria for the various regions differ to some extent.
Design and operating criteria play an essential role in preventing major system disturbances following severe contingencies. The use of criteria ensures that, for all frequently occurring contingencies, the system will, at worst, move from the normal state to the alert state, rather than to a more severe state such as the emergency state or the in extremis state. When the alert state is entered following a contingency, operators can take actions to return the system to the normal state.
IV. POWER SYSTEM SECURITY ASSESSMENT
The term power system security is used to mean the ability of the bulk electric power system to withstand sudden disturbances such as electric short circuits or unanticipated loss of system components. In terms of the requirements for the proper planning and operation of the power system, it means that following the occurrence of a sudden disturbance, the power system will:
1. survive the ensuing transient and move into an acceptable steady-state condition, and
2. operate, in this new steady-state condition, with all power system components within established limits.
Electric utilities require security analysis to ensure that, for a defined set of contingencies, the above requirements are met. The analysis required to survive a transient is complex because of increased system size, greater dependence on controls, and more interconnections. Additional complicating factors include the operation of the interconnected system with greater interdependence among its member systems, heavier transmission loading, and concentration of the generation among a few large units at light loads.
After the 1965 blackout, various efforts went into improving reliable system operation. Several reliability criteria and emergency guidelines were introduced by the Federal Power Commission (FPC) and the North American Power System Interconnection Committee (NAPSIC). Summaries of these guidelines are given in [2, Appendix]. These guidelines and criteria represent efforts by the government and the utilities to improve control and operational practices. More important, however, were the efforts by various researchers and specialists in what has come to be known as the secure control of the power system. In DyLiacco's pioneering work [2], the power system is judged to reside at any instant of time in any of three operating states: normal, emergency, and restorative.
Under normal steady-state operating conditions all customer demands are met and all equipment is operating below its rated capacity. Theoretically speaking, the requirement of meeting customer demands is expressed mathematically by means of a set of equations (equality constraints) of the type:
\[ h_i(x_1, \ldots, x_n;\ u_1, \ldots, u_m) = 0, \tag{1.1} \]
where \(x_1, \ldots, x_n\) are a set of dependent (state) variables and \(u_1, \ldots, u_m\) a set of independent (input, demand, or control) variables. Typically these equality constraints correspond to the so-called load-flow equations. The constraints relative to equipment can be written, in general, in the following form.
\[ g_j(x_1, \ldots, x_n;\ u_1, \ldots, u_m) \le 0. \tag{1.2} \]
They correspond to items such as upper and lower limits on power generation by a given unit, current limits on transmission lines and transformers, and so on. Mathematically, the normal operating state is defined whenever the utility system considered satisfies relations 1.1 and 1.2.
Following certain disturbance events (short circuits due to faults, loss of generation, loss of load, and others) some of the inequality constraints may be violated. For example, a line may become overloaded, or system frequency may drop below a certain limit. In these cases the system is in the emergency operating state.
Finally, the system may exist in a situation where only a fraction of the customers are satisfied without overloading any equipment. In this case only a portion of the system is in the normal state. As a result, not all the equality constraints are satisfied, but the inequality constraints are. Such a state is called the restorative operating state. Symbolically, we can rewrite equations 1.1 and 1.2 in the following form.
With this notation, we summarize our definition of the three operating states as follows: Normal State:
Emergency State:
Restorative State:
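As a sketch of how the three definitions can be written compactly, assume the shorthand \(H(x,u) = 0\) for the equality constraints of equation 1.1 and \(G(x,u) \le 0\) for the inequality constraints of equation 1.2. The symbols H and G are chosen here for illustration only; the conditions themselves follow the verbal definitions given in the text.

```latex
\begin{align*}
\text{Normal state:}      &\quad H(x,u) = 0 \ \text{and}\ G(x,u) \le 0 \\
\text{Emergency state:}   &\quad \text{at least one component of } G(x,u) \le 0 \text{ is violated} \\
\text{Restorative state:} &\quad H(x,u) \ne 0 \ \text{(not all equalities met) and}\ G(x,u) \le 0
\end{align*}
```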
The security of the system is defined relative to its ability to withstand a number of postulated disturbances. A normal state is said to be secure if, following any one of the postulated disturbances, the system remains in the normal state. Otherwise, it is insecure. In the online operation of the system, one monitors the different variables that correspond to its operating conditions. This monitoring process is called security monitoring. The process of determining whether the system is in the secure normal state is called security assessment. In the process of security assessment it may be concluded that the system is in the insecure normal state. In that case, the system operator will attempt to manipulate system variables so that the system is secure. This action is called preventive control. If, on the other hand, the system is in an emergency state, then two types of control action are possible. In the first type, called corrective control, action is possible whereby the system is sent back to the normal state. If corrective control is not possible, then emergency control is applied. This control can be due to relay-initiated action, automatic control, or operator control. In any case, the system will drift to the restorative state as a result of emergency control. Finally, in the restorative state, control action is initiated to restore all services by means of restorative control. This should put the system back in the normal state. Figure 1.3 illustrates the various transitions due to disturbances as well as various control actions.
FIGURE 1.3 Operating states of a power system.
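The state definitions and control actions described above can be sketched in a few lines of Python. This is a minimal illustration, not an operational tool: the function names, the tolerance, and the convention that any violated inequality constraint means the emergency state are assumptions made for this example.

```python
from enum import Enum


class State(Enum):
    NORMAL = "normal"
    EMERGENCY = "emergency"
    RESTORATIVE = "restorative"


def classify_state(equality_residuals, inequality_values, tol=1e-6):
    """Classify the operating state from evaluated constraints.

    equality_residuals: values of the equality constraints (zero when met).
    inequality_values:  values of the inequality constraints (<= 0 when met).
    """
    equalities_met = all(abs(r) <= tol for r in equality_residuals)
    limits_met = all(g <= tol for g in inequality_values)
    if not limits_met:
        # Assumption: any violated operating limit puts the system in emergency.
        return State.EMERGENCY
    if equalities_met:
        return State.NORMAL
    # Limits respected, but not all customer demand is served.
    return State.RESTORATIVE


def control_action(state, secure=True, corrective_possible=True):
    """Map a state to the control action named in the text."""
    if state is State.NORMAL:
        return "none" if secure else "preventive control"
    if state is State.EMERGENCY:
        return "corrective control" if corrective_possible else "emergency control"
    return "restorative control"
```

For example, `classify_state([0.0], [2.5])` reports the emergency state because one limit is exceeded, and `control_action(State.NORMAL, secure=False)` returns preventive control for an insecure normal state.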
V. POWER SYSTEM OPTIMIZATION AS A FUNCTION OF TIME
The hourly commitment of units, that is, the decision whether a unit is on or off at a given hour, is referred to as unit commitment. Hourly production of hydroelectric plants, based on the flexibility of being able to manage water reserve levels to improve system performance, is referred to as the hydrothermal problem, and hourly production of coal generation or a dual-purpose plant is called the dual-purpose problem. Scheduling of unit maintenance without violating reserve capacity while minimizing the production cost is referred to as a maintenance scheduling problem. The interdependence among the various control optimization problems as the time horizon expands from seconds to years is shown in Figure 1.4.
FIGURE 1.4 Time horizon of the power system optimization problem.
  Time horizon: control process; optimization function
  Seconds: automatic generation control (AGC); minimize area control error, subject to machine and system dynamics constraints.
  Minutes: optimal load flow (OLF); minimize instantaneous cost of operation or other indices, e.g., pollution.
  Hours: unit commitment, hydrothermal, dual problem; minimize expected cost of operation or other indices.
  Days: hydrothermal, dual problem; minimize expected cost of operation.
  Weeks: hydrothermal, interchange coordination; minimize expected cost of operation with reliability constraints.
  Months: maintenance scheduling, interchange coordination; minimize expected cost of operation with reliability constraints.
  Years: maintenance scheduling, generation planning; minimize expected investment and operational costs with reliability constraints.
In power system operation and planning, there are many optimization problems that require real-time solutions such that one can determine the
optimal resources required at minimum cost within a given set of constraints. This scheduling is done over time (minutes, hours, days, etc.). In this regard, we classify the problem as either operational or planning. Notably, in the operations scheduling problem, we usually extend the studies up to 24 hours. On the other hand, planning problems are solved in the time frame of years.
In analyzing the optimization problem, there are many controllable parameters of interest. There are many objective functions and constraints that must be satisfied for economic operation. (These objectives and constraints are quantified later.) The economic dispatch problem typically arises when the constraints of the system are incorporated, and methods exist for solving it as a function of time. These methods use mathematical techniques such as linear programming (LP), unconstrained optimization techniques (using Lagrange multipliers), and nonlinear programming (NLP) to accommodate the constraints. The availability of these techniques in addressing this problem has been noted. Other variations on the economic dispatch problem are the hydrothermal and unit commitment problems. Dynamic programming (DP), the Lagrange relaxation technique, and the Benders decomposition algorithm are used to solve this class of optimization problem.
Another problem in power system operation and control is the optimal maintenance of units and generators. Finally, in the same realm is optimal power flow (OPF), which holds the promise of extending economic dispatch to include the optimal setting of under-load tap-changers (ULTCs), generator real and reactive powers, phase-shifter taps, and the like. Optimal power flow has been expanded as new problems arise to include new objective functions and constraints, and it has attracted researchers to the development and routine testing of new optimization algorithms. Other applications extending the work to optimization of the network include VAr planning, network expansion, and available transfer capability.
At the distribution end, loss minimization, data estimation, and network reconfiguration have demanded optimum decision making as a planning problem as well as an operations problem. There are mathematical optimization techniques ranging from linear programming to evolutionary search techniques that can be employed to obtain optimum distribution networks.
There is a need to summarize the essential mathematical methods that have been fully developed, tested, and utilized on a routine basis for security analysis of the power system. The selection of the appropriate optimization technique depends on the system as defined by the objective functions and the constraints. The constraints are divided into two classes, namely, technical and nontechnical. The class of technical constraints includes network, equipment, and device constraints. The class of nontechnical constraints includes social, environmental, and economic limitations.
VI. REVIEW OF OPTIMIZATION TECHNIQUES APPLICABLE TO POWER SYSTEMS
In the early days of power system operation, the OPF tool was defined in terms of the conventional economic dispatch problem aimed at determining the optimal settings for control variables in a power system with respect to various constraints. However, the capability of power system optimization tools has been broadened to provide solutions for a wide range of utility-dependent problems. Today, the optimal power flow tool is used to solve a static constrained nonlinear optimization problem whose development has closely followed the advances in numerical optimization techniques and computer technology. Commonly available optimal power flow packages can solve very large and complex power system formulations in a relatively short time.
Generally, optimal power flow requires solving a set of nonlinear equations describing optimal and/or secure operation of a power system, expressed as:
\[ \text{Minimize } F(x, u) \quad \text{while satisfying} \quad g(x, u) = 0, \quad h(x, u) \le 0, \]
where
  g(x, u): set of nonlinear equality constraints (power flow equations);
  h(x, u): set of inequality constraints of vector arguments x and u;
  x: vector of dependent variables consisting of bus voltage magnitudes and phase angles, as well as MVAr loads, fixed bus voltages, line parameters, and so on;
  u: vector of control variables.
The vector u includes the following.
  Real and reactive power generation;
  Phase-shifter angles;
  Net interchange;
  Load MW and MVAr (load shedding);
  DC transmission line flows;
  Control voltage settings;
  Load tap changer (LTC) transformer tap settings.
Common objectives in a power system include:
  Active power cost minimization;
  Active power loss minimization;
  Minimum control shift;
  Minimum number of controls scheduled.
And examples of the associated equality and inequality constraints are:
  Limits on all control variables;
  Power flow equations;
  Generation/load balance;
  Branch flow limits;
  Bus voltage limits;
  Active and reactive reserve limits;
  Generator MVAr limits;
  Corridor (transmission interface) limits.
The optimization methods that are incorporated in the optimal power flow tools can be classified based on optimization techniques such as:
  Linear programming (LP) based methods;
  Nonlinear programming (NLP) based methods;
  Integer programming (IP) based methods;
  Separable programming (SP) methods.
Notably, linear programming is recognized as a reliable and robust technique for solving a wide range of specialized optimization problems characterized by linear objectives and linear constraints. Many commercially available power system optimization packages contain powerful linear programming algorithms for solving power system problems for both planning and operations engineers. Linear programming solution approaches include the simplex method, the revised simplex method, and interior point techniques. Interior point techniques are based on the Karmarkar algorithm and encompass variants such as the projection scaling method, dual affine method, primal affine method, and barrier algorithm.
In the case of the nonlinear programming optimization methods, the following techniques are introduced.
  Sequential quadratic programming (SQP);
  Augmented Lagrangian methods;
  Generalized reduced gradient method;
  Projected augmented Lagrangian;
  Successive linear programming;
  Interior point methods.
The basic formulation is then extended to include security and environmental constraints, which have become very important factors in power system operation in the past few decades. Special decomposition strategies are also applied in solving large-scale system problems. These include Benders decomposition, Lagrangian relaxation, and Talukdar-Giras optimization techniques.
In recent years, the advancement of computer engineering and the increased complexity of the power system optimization problem have led to greater need for and application of specialized programming techniques for large-scale problems. These include dynamic programming, Lagrange multiplier methods, and evolutionary computation methods such as genetic algorithms. These techniques are often hybridized with many other techniques of intelligent systems, including artificial neural networks (ANN), expert systems (ES), tabu-search algorithms, and fuzzy logic (FL).
REFERENCES
1. Bergen, A. R. and Vittal, V. Power Systems Analysis, Prentice-Hall, 2nd ed., 1999.
2. Elgerd, O. I. Electrical Energy Systems Theory: An Introduction, McGraw-Hill, New York, 1982.
3. Gonen, T. Electric Power Distribution System Engineering, McGraw-Hill, New York, 1986.
4. Luenberger, D. G. Introduction to Linear and Nonlinear Programming, Addison-Wesley, Reading, MA, 1975.
5. Wood, A. J. and Wollenberg, B. F. Power Generation, Operation, and Control, Wiley, New York, ed., 1996.
Chapter 2
Electric Power System Models
I. INTRODUCTION
The power industry in the United States has been engaged in a changing business environment for quite some time now, moving away from a centrally planned system to one in which players operate in a decentralized fashion with little knowledge of the full state of the network, and where decision-making is likely to be market-driven rather than based on technical considerations. The new environment differs markedly from the one in which the system previously has been operated. This leads to the requirement of some new techniques and analysis methods for system operation, operational and long-term planning, and the like.
Electrical power systems vary in size, topography, and structural components. However, the overall system can be divided into three subsystems: generation, transmission, and distribution. System behavior is affected by the characteristics of every major element of the system. The representation of these elements by means of appropriate mathematical models is critical to the successful analysis of system behavior. For each different problem, the system is modeled in a different way. This chapter describes some system models for analysis purposes and introduces concepts of power expressed as active, reactive, and apparent, followed by a brief review of three-phase systems (Section III). Section V deals with modeling the synchronous machine from an electric network standpoint. Reactive capability curves are examined in Section VI, followed by discussion of static and dynamic load models in Section XI.
II. COMPLEX POWER CONCEPTS
The electrical power systems specialist is, in many instances, more concerned with electrical power in the circuit than with current. As the power into an element is basically the product of the voltage across it and the current through it, it seems reasonable to swap the current for power without losing any information. In treating sinusoidal steady-state behavior of an electric circuit, some further definitions are necessary. To illustrate, a cosine representation of the sinusoidal waveforms involved is used. Consider an impedance element \(Z = |Z|\angle\phi\) connected to a sinusoidal voltage source v(t) that is given by \(v(t) = V_m \cos \omega t\). Figure 2.1 shows the typical load circuit.
FIGURE 2.1 Load circuit.
The instantaneous current in the circuit shown in Figure 2.1 is
\[ i(t) = I_m \cos(\omega t - \phi), \]
where the current magnitude is
\[ I_m = V_m / |Z|. \]
The instantaneous power is given by
\[ p(t) = v(t)\,i(t) = V_m I_m [\cos(\omega t)\cos(\omega t - \phi)]. \]
Using the trigonometric identity
\[ \cos\alpha \cos\beta = \tfrac{1}{2}[\cos(\alpha - \beta) + \cos(\alpha + \beta)], \]
we can write the instantaneous power as
\[ p(t) = \frac{V_m I_m}{2}[\cos\phi + \cos(2\omega t - \phi)]. \]
The average power \(p_{av}\) is seen to be
\[ p_{av} = \frac{V_m I_m}{2}\cos\phi. \]
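As a quick numerical check of the average-power expression, the sketch below samples p(t) = v(t) i(t) over one cycle and compares the mean with (V_m I_m / 2) cos φ. The function names and the sample values (a 170 V peak source, a 10 Ω impedance magnitude, a 30° angle) are illustrative assumptions, not values from the text.

```python
import math


def average_power_sampled(v_m, z_mag, phi, samples=1000):
    """Average p(t) = v(t) * i(t) over one cycle by uniform sampling."""
    i_m = v_m / z_mag  # current magnitude I_m = V_m / |Z|
    total = 0.0
    for k in range(samples):
        theta = 2.0 * math.pi * k / samples  # theta stands for omega * t
        total += v_m * math.cos(theta) * i_m * math.cos(theta - phi)
    return total / samples


def average_power_formula(v_m, z_mag, phi):
    """Closed form p_av = (V_m * I_m / 2) * cos(phi)."""
    return 0.5 * v_m * (v_m / z_mag) * math.cos(phi)


sampled = average_power_sampled(170.0, 10.0, math.pi / 6)
closed_form = average_power_formula(170.0, 10.0, math.pi / 6)
```

The two values agree to within floating-point error because the double-frequency term cos(2ωt − φ) sums to zero over a full cycle of uniformly spaced samples.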
Since, through one cycle, the average of \(\cos(2\omega t - \phi)\) is zero, this term contributes nothing to the average of p. It is more convenient to use the effective (or root mean square) values of voltage and current than the maximum values. Substituting \(V_m = \sqrt{2}\,V_{rms}\) and \(I_m = \sqrt{2}\) [...]
[...] > 0 and, hence, it cannot be positive definite. Let us try for a maximum by replacing \(L_{xx}\) by \(-L_{xx}\). That is,
It is positive definitefor p =- A4/6 and, therefore, the solution of Step 2 is a maximum. 4. The maximum function is
where \(2 \le K < \infty\). Further maximization yields K = 2, for which f(x) = -3, \(x_1 = -2\), and \(x_2 = -1\).
Problem 4.2
Optimize \(f(y) = y^T P y + C^T y\) subject to linear constraints \(Ay = d\), where P is a symmetric n-square and nonsingular matrix. The matrix A is m x n with full rank, and C and d are n- and m-vectors.
Solution
1. The constraints are \(Ay = d\) and, hence,
\[ L = y^T P y + C^T y + \lambda (Ay - d). \]
2. \(L_x = 2x^T P + C^T + \lambda A = 0\). The solution of x and \(\lambda\) is
\[ x = -\tfrac{1}{2} P^{-1} (A^T \lambda^T + C), \]
where \(\lambda^T = -(A P^{-1} A^T)^{-1} (A P^{-1} C + 2d)\).
3. \(L_{xx} + \beta F_x^T F_x = 2P + \beta A^T A\). The solution of Step 2 is a minimum if \(2P + \beta A^T A > 0\) and is a maximum if \(2P + \beta A^T A < 0\) for some \(\beta \ge 0\).
4. \(f(x) = \tfrac{1}{4} (\lambda A - C^T) P^{-1} (A^T \lambda^T + C)\).
a. Minimum norm of \(Ay = d\) with \(m \le n\). The problem is a special case of P = I and C = 0. It follows from the result of Step 2 that
\[ \lambda^T = -2(AA^T)^{-1} d \quad \text{and} \quad x = A^T (AA^T)^{-1} d. \]
The sufficiency for the minimum is assured by choosing \(\beta = 0\). The minimized norm is \(f(x) = x^T x = d^T (AA^T)^{-1} d\).
b. Least square approximation for \(Ax = d\) with \(m > n\). The problem is to minimize, without constraint,
\[ f(x) = e^T e = (Ax - d)^T (Ax - d) = x^T A^T A x - 2 d^T A x + d^T d. \]
Let \(P = A^T A\) and \(C = -2A^T d\); then the result of Step 2 yields \(x = (A^T A)^{-1} A^T d\) and \(e^T e = d^T [I - A (A^T A)^{-1} A^T] d\).
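The least-squares result x = (AᵀA)⁻¹Aᵀd can be illustrated with a small pure-Python sketch for the special case of a two-column matrix A, where the normal equations are 2 x 2 and can be solved by Cramer's rule. The function name and the sample data (fitting a straight line through three points) are assumptions made for this illustration.

```python
def least_squares_two_columns(a_rows, d):
    """Minimize ||A x - d||^2 for an m x 2 matrix A of rank 2.

    Uses x = (A^T A)^{-1} A^T d, solving the 2 x 2 normal equations directly.
    Returns the solution (x1, x2) and the minimized squared error e^T e.
    """
    # Entries of the symmetric matrix P = A^T A and of the vector A^T d.
    p11 = sum(r[0] * r[0] for r in a_rows)
    p12 = sum(r[0] * r[1] for r in a_rows)
    p22 = sum(r[1] * r[1] for r in a_rows)
    b1 = sum(r[0] * di for r, di in zip(a_rows, d))
    b2 = sum(r[1] * di for r, di in zip(a_rows, d))

    det = p11 * p22 - p12 * p12
    if det == 0:
        raise ValueError("A must have rank 2 so that A^T A is invertible")

    x1 = (p22 * b1 - p12 * b2) / det
    x2 = (p11 * b2 - p12 * b1) / det
    squared_error = sum((r[0] * x1 + r[1] * x2 - di) ** 2
                        for r, di in zip(a_rows, d))
    return (x1, x2), squared_error


# Fit d = x1 + x2 * t through the points (0, 1), (1, 2), (2, 4).
A = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0]]
d = [1.0, 2.0, 4.0]
solution, err = least_squares_two_columns(A, d)
```

For this data the fitted intercept is 5/6 and the slope is 3/2, and the squared error equals dᵀd − dᵀAx, which matches the closed-form expression for eᵀe.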
The norm is a minimum because A is of rank n and hence \(A^T A > 0\). The sufficiency is assured again by choosing \(\beta = 0\).
A. Non-Power Systems Application Examples
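Problem 4.3
Consider the function
\[ f(x_1, x_2, x_3) = x_1 + 2x_3 + x_2 x_3 - x_1^2 - x_2^2 - x_3^2. \]
Solution
Applying the necessary condition \(\nabla f(x_0) = 0\) gives
\[ \frac{\partial f}{\partial x_1} = 1 - 2x_1 = 0, \]
\[ \frac{\partial f}{\partial x_2} = x_3 - 2x_2 = 0, \]
\[ \frac{\partial f}{\partial x_3} = 2 + x_2 - 2x_3 = 0. \]
The solution of these simultaneous equations is given by
\[ x_0 = \left(\tfrac{1}{2}, \tfrac{2}{3}, \tfrac{4}{3}\right). \]
Another way to check the sufficiency condition is to check the Hessian matrix H for positive or negative definiteness. Thus,
\[ H|_{x_0} = \begin{bmatrix} -2 & 0 & 0 \\ 0 & -2 & 1 \\ 0 & 1 & -2 \end{bmatrix}. \]
The principal minor determinants of \(H|_{x_0}\) have the values -2, 4, and -6, respectively. Thus, \(H|_{x_0}\) is negative definite and
\[ x_0 = \left(\tfrac{1}{2}, \tfrac{2}{3}, \tfrac{4}{3}\right) \]
represents a maximum point.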
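The stationary point and the definiteness test of Problem 4.3 can be verified with exact rational arithmetic. The sketch below is only a check of the worked solution; the helper names are chosen for this illustration.

```python
from fractions import Fraction


def gradient(x1, x2, x3):
    """Gradient of f = x1 + 2*x3 + x2*x3 - x1**2 - x2**2 - x3**2."""
    return (1 - 2 * x1, x3 - 2 * x2, 2 + x2 - 2 * x3)


# Hessian of f; it is constant because f is quadratic.
HESSIAN = [[-2, 0, 0],
           [0, -2, 1],
           [0, 1, -2]]


def leading_principal_minors(h):
    """Determinants of the upper-left 1x1, 2x2, and 3x3 blocks of h."""
    d1 = h[0][0]
    d2 = h[0][0] * h[1][1] - h[0][1] * h[1][0]
    d3 = (h[0][0] * (h[1][1] * h[2][2] - h[1][2] * h[2][1])
          - h[0][1] * (h[1][0] * h[2][2] - h[1][2] * h[2][0])
          + h[0][2] * (h[1][0] * h[2][1] - h[1][1] * h[2][0]))
    return [d1, d2, d3]


x0 = (Fraction(1, 2), Fraction(2, 3), Fraction(4, 3))
grad_at_x0 = gradient(*x0)
minors = leading_principal_minors(HESSIAN)
# Negative definite: minors alternate in sign, starting with a negative one.
is_negative_definite = minors[0] < 0 and minors[1] > 0 and minors[2] < 0
```

The gradient vanishes at x0 and the minors alternate in sign as -2, 4, -6, which confirms the maximum.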
V. POWER SYSTEMS APPLICATION EXAMPLES
A. Optimal Operation of an All-Thermal System: Equal Incremental Cost-Loading
A simple, yet extremely useful problem in optimum economic operation of electric power systems is treated here. Consider the operation of m thermal generating units on the same bus as shown in Figure 4.1. Assume that the variation of the fuel cost of each generator (\(F_i\)) with the active-power output (\(P_i\)) is given by a quadratic polynomial. The total fuel cost of the plant is the sum of the individual unit costs converted to $/h:
\[ F = \sum_{i=1}^{m} \left( \alpha_i + \beta_i P_i + \gamma_i P_i^2 \right), \tag{4.6} \]
where \(\alpha_i\), \(\beta_i\), and \(\gamma_i\) are assumed available. We wish to determine generation levels such that F is minimized while simultaneously satisfying an active-power balance equation. This utilizes the principle of power-flow continuity. Here, the network is viewed as a medium of active-power transfer from the generating nodes to the load node. Only one equation is needed. The first active-power balance equation model neglects transmission losses and, hence, we can write
\[ P_D = \sum_{i=1}^{m} P_i, \tag{4.7} \]
with \(P_D\) being a given active-power demand for the system. The demand \(P_D\) is the sum of all demands at load nodes in the system. The model is useful in the treatment of parallel generating units at the same plant since in this case the negligible transmission losses assumption is valid. We write the constraint equation, equation 4.7, as
\[ P_D - \sum_{i=1}^{m} P_i = 0. \tag{4.8} \]
FIGURE 4.1 Units on the same bus.
The technique is based on including equation 4.8 in the original cost function by use of a Lagrange multiplier, say \(\lambda\), which is unknown at the outset. Thus,
\[ \mathcal{L} = F + \lambda \left( P_D - \sum_{i=1}^{m} P_i \right), \tag{4.9} \]
where
\[ F = \sum_{i=1}^{m} F_i(P_i). \]
Note that \(\lambda\) is to be obtained such that equation 4.8 is satisfied. The idea here is to penalize any violation of the constraint by adding a term corresponding to the resulting error. The Lagrange multiplier is, in effect, a conversion factor that accounts for the dimensional incompatibilities of the cost function ($/h) and constraints (MW). The resulting problem is unconstrained, and we have increased the number of unknowns by one. The optimality conditions are obtained by setting the partial derivatives of \(\mathcal{L}\) with respect to \(P_i\) to 0. Thus,
\[ \frac{\partial F_i}{\partial P_i} - \lambda = 0. \tag{4.10} \]
Note that each unit's cost is independent of the generations of other units. The expression obtained in equation 4.10 leads to the conclusion that
\[ \frac{\partial F_1}{\partial P_1} = \frac{\partial F_2}{\partial P_2} = \cdots = \frac{\partial F_m}{\partial P_m} = \lambda. \tag{4.11} \]
The implication of this result is that for optimality, individual units should share the load such that their incremental costs are equal. We can see that \(\lambda\) is simply the optimal value of incremental costs at the operating point. Equation 4.10 is frequently referred to as the equal incremental cost-loading principle. Implementing the optimal solution is straightforward for the quadratic cost case where we have
\[ F_i = \alpha_i + \beta_i P_i + \gamma_i P_i^2. \]
Our optimality conditions from equation 4.10 reduce to
\[ \beta_i + 2\gamma_i P_i - \lambda = 0. \tag{4.12} \]
The value of \(\lambda\) is determined such that equation 4.8 is satisfied. This turns out to give
\[ \lambda = \frac{2P_D + \sum_{i=1}^{m} \beta_i/\gamma_i}{\sum_{i=1}^{m} 1/\gamma_i}. \tag{4.13} \]
Finally, using equation 4.12 the optimal generations are obtained as
\[ P_i = \frac{\lambda - \beta_i}{2\gamma_i}. \tag{4.14} \]
B. Optimal Operation of an All-Thermal System, Including Losses
We are interested in minimizing the total cost given by equation 4.6 while satisfying the active-power balance equation including losses. Thus,
\[ P_D + P_L - \sum_{i=1}^{m} P_i = 0. \tag{4.15} \]
Here \(P_L\) is the active-power loss considered as a function of the active-power generation alone as outlined in the previous section. Following our treatment for the loss-free case, we form the augmented cost function:
\[ \mathcal{L} = F + \lambda \left( P_D + P_L - \sum_{i=1}^{m} P_i \right). \tag{4.16} \]
The optimality conditions are obtained using the same arguments as before and are
\[ \frac{\partial F_i}{\partial P_i} + \lambda \left( \frac{\partial P_L}{\partial P_i} - 1 \right) = 0. \tag{4.17} \]
Note that with negligible transmission losses, the above expression reduces to equation 4.10. It is convenient to transform the obtained optimality expression into an equivalent form. This is done by defining the factors \(L_i\):
\[ L_i = \frac{1}{1 - \partial P_L / \partial P_i}. \tag{4.18} \]
We can write equation 4.17 as
\[ L_i \frac{\partial F_i}{\partial P_i} = \lambda \quad (i = 1, \ldots, m). \tag{4.19} \]
This is of the form of equation 4.11 except for the introduction of the new factors \(L_i\), which account for the modifications necessitated by including the transmission loss. These are traditionally called the penalty factors to indicate that plant costs (\(F_i\)) are penalized by the corresponding incremental transmission losses (\(\partial P_L / \partial P_i\)). Examination of equation 4.19 reveals that the optimal generations are obtained when each plant is operated such that the penalized incremental costs are equal. Flowcharts for the two cases of economic dispatch are shown in Figure 4.2.
FIGURE 4.2 Economic dispatch flow chart. (a) Neglecting the contributions of transmission losses. (b) Accounting for the effects of transmission losses.
Both charts read the constraint boundaries \(P_{i(min)}\) and \(P_{i(max)}\), the total load \(P_D\), the cost coefficients \(\alpha_i\), \(\beta_i\), \(\gamma_i\), the initial incremental fuel cost \(\lambda^{(0)}\), the tolerance value \(\varepsilon\), the total number of thermal units m, and the maximum number of iterations N; chart (b) also reads the B-coefficients of the network. If the iteration limit is reached, each chart displays "Nonconvergence of Solution."
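An iterative dispatch of the kind outlined for the loss-free case can be sketched as follows. This is a minimal illustration under stated assumptions: the update rule is a simple bisection on λ (the exact update used in the flow chart is not reproduced here), units have quadratic costs, and each trial output from equation 4.12 is clipped to the unit limits. The function name and argument layout are hypothetical.

```python
def lambda_iteration(units, demand, tol=1e-6, max_iter=200):
    """Loss-free economic dispatch by bisection on the incremental cost.

    units:  list of (beta, gamma, p_min, p_max) for F = alpha + beta*P + gamma*P**2
    demand: total load P_D in MW
    Returns (lam, outputs) with sum(outputs) within tol of demand.
    """
    def output_at(lam, unit):
        beta, gamma, p_min, p_max = unit
        p = (lam - beta) / (2.0 * gamma)   # from beta + 2*gamma*P = lam
        return min(max(p, p_min), p_max)   # enforce the unit limits

    if not (sum(u[2] for u in units) <= demand <= sum(u[3] for u in units)):
        raise ValueError("demand lies outside the combined unit limits")

    # Bracket lam between the lowest and highest incremental costs at the limits.
    lo = min(beta + 2.0 * gamma * p_min for beta, gamma, p_min, _ in units)
    hi = max(beta + 2.0 * gamma * p_max for beta, gamma, _, p_max in units)

    for _ in range(max_iter):
        lam = 0.5 * (lo + hi)
        outputs = [output_at(lam, u) for u in units]
        mismatch = sum(outputs) - demand
        if abs(mismatch) < tol:
            return lam, outputs
        if mismatch > 0:
            hi = lam   # too much generation: lower the incremental cost
        else:
            lo = lam   # too little generation: raise the incremental cost
    raise RuntimeError("nonconvergence of solution")


# Two units with quadratic costs and limits, serving a 1000 MW load.
example_units = [(7.74, 0.00107, 100.0, 600.0), (7.72, 0.00072, 100.0, 800.0)]
lam, outputs = lambda_iteration(example_units, 1000.0)
```

Bisection is used because the total clipped output is a nondecreasing function of λ, so a bracket that starts with all units at their minimum and ends with all units at their maximum always contains the solution.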
Problem 4.4
A certain power system consists of two generating plants and a load. The transmission losses in the system can be approximated by the following quadratic equation.
\[ P_{Loss} = 0.606 \times 10^{-3} P_1^2 + 0.496 \times 10^{-3} P_2^2. \]
The cost functions of the units are modeled such that:
\[ \beta_1 = 10.63, \quad \gamma_1 = 3.46 \times 10^{-3}, \]
\[ \beta_2 = 12.07, \quad \gamma_2 = 3.78 \times 10^{-3}. \]
Given that the output of plant 2 is 445 MW while it is being operated under optimal conditions, determine the:
1. incremental cost of operating the units, assuming equal cost sharing;
2. output of plant 1;
3. total power losses; and
4. efficiency of the system.
Solution
1. For economic dispatch (optimal power flow, see Figure 4.2) at plant 2,
\[ \lambda = \frac{12.07 + 2(3.78 \times 10^{-3})(445)}{1 - 2(0.496 \times 10^{-3})(445)} = 27.63\ \$/\text{MWh}. \]
2. Similarly, for the optimal operation of plant 1:
\[ P_1 = \frac{27.63 - 10.63}{2\left(0.00346 + (27.63)(0.606 \times 10^{-3})\right)} = 421\ \text{MW}. \]
3. Hence, the total power losses due to transmission are:
\[ P_{Loss} = 0.606 \times 10^{-3} (421)^2 + 0.496 \times 10^{-3} (445)^2 = 107.4 + 98.2 = 205.6 \approx 206\ \text{MW}. \]
4. The efficiency of the system is given as:
\[ \eta = \frac{P_{output}}{P_{input}} = 1 - \frac{P_{Loss}}{P_{input}} = 1 - \frac{205.6\ \text{MW}}{(421 + 445)\ \text{MW}} = 0.763 \text{ or } 76.3\%. \]
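The arithmetic of Problem 4.4 can be reproduced with a short script. The sketch assumes, as in the worked solution, a loss expression with only squared terms, so that the incremental loss of plant i is 2 B_i P_i; the function and variable names are chosen for this illustration.

```python
def incremental_cost_with_loss(beta, gamma, b_loss, p):
    """Penalized incremental cost: (dF/dP) / (1 - dP_L/dP).

    Cost F = alpha + beta*P + gamma*P**2; loss contribution b_loss*P**2.
    """
    return (beta + 2.0 * gamma * p) / (1.0 - 2.0 * b_loss * p)


def output_for_lambda(lam, beta, gamma, b_loss):
    """Solve beta + 2*gamma*P = lam * (1 - 2*b_loss*P) for P."""
    return (lam - beta) / (2.0 * (gamma + lam * b_loss))


B1, B2 = 0.606e-3, 0.496e-3            # loss coefficients of plants 1 and 2
P2 = 445.0                             # given output of plant 2, MW

lam = incremental_cost_with_loss(12.07, 3.78e-3, B2, P2)   # $/MWh
p1 = output_for_lambda(lam, 10.63, 3.46e-3, B1)            # MW
p_loss = B1 * p1 ** 2 + B2 * P2 ** 2                       # MW
efficiency = 1.0 - p_loss / (p1 + P2)
```

Carrying full precision through the calculation gives values that round to the figures quoted in the solution.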
VI. ILLUSTRATIVE EXAMPLES
Example 1
Find the local and global minima of the function
\[ f(X) = f(x_1, x_2) = x_1^2 + x_2^2 - 2x_1 - 4x_2 + 5, \quad |x_1| \le 3,\ |x_2| \le 3. \]
Solution
The constraints are
\[ |x_1| \le 3 \rightarrow -3 \le x_1 \le 3, \qquad |x_2| \le 3 \rightarrow -3 \le x_2 \le 3. \]
We can change the constraint to be an equality constraint
\[ x_1 = k_1,\ -3 \le k_1 \le 3; \qquad x_2 = k_2,\ -3 \le k_2 \le 3. \]
1. Form the Lagrange function.
\[ L = x_1^2 + x_2^2 - 2x_1 - 4x_2 + 5 + f_1 x_1 + f_2 x_2. \]
2. Determine optimum candidates.
\[ \frac{\partial L}{\partial x_1} = 2x_1 - 2 + f_1 = 0 \rightarrow x_1 = 1 - 0.5 f_1, \]
\[ \frac{\partial L}{\partial x_2} = 2x_2 - 4 + f_2 = 0 \rightarrow x_2 = 2 - 0.5 f_2. \]
3. Sufficiency test.
\[ L_{xx} + \beta F_x^T F_x > 0 \quad \text{for some positive } \beta. \]
Then it is at a minimum.
4. Further optimization. \(K_1 = K_2 = K\), with \(-3 \le K \le 3\).
F(K)=K2-2K-4K+5=2K2-2K+5.
For minimization FK = 4K - 2 + K = 0.5.
Example 2
Two thermal units at the same station have the following cost models.
\[ F_1 = 793.22 + 7.74 P_1 + 0.00107 P_1^2, \]
\[ F_2 = 1194.6 + 7.72 P_2 + 0.00072 P_2^2, \]
\[ 100 \le P_1 \le 600\ \text{MW} \quad \text{and} \quad 100 \le P_2 \le 800\ \text{MW}. \]
Find the optimal power generated \(P_1\) and \(P_2\) and the incremental cost of power delivered for demands of 400, 600, and 1000 MW, respectively.
Solution
\[ \frac{\partial F_1}{\partial P_1} = 7.74 + 0.00214 P_1, \qquad \frac{\partial F_2}{\partial P_2} = 7.72 + 0.00144 P_2. \]
For optimality,
\[ \frac{\partial F_1}{\partial P_1} = \frac{\partial F_2}{\partial P_2} = \lambda, \]
\[ 7.74 + 0.00214 P_1 = \lambda \rightarrow P_1 = 467.29\lambda - 3616.8, \]
\[ 7.72 + 0.00144 P_2 = \lambda \rightarrow P_2 = 694.44\lambda - 5361.1, \]
\[ P_D = P_1 + P_2. \]
1. At \(P_D = 400\) MW, \(P_1 + P_2 = 400\):
\[ 400 = 467.29\lambda - 3616.8 + (694.44\lambda - 5361.1), \]
\[ 9377.9 = 1161.73\lambda \rightarrow \lambda = 8.072\ \$/\text{MWh}, \]
\(P_1 = 155.3\) MW, \(P_2 = 244.7\) MW; both of them are within the desired limits.
2. At \(P_D = 600\) MW, \(P_1 + P_2 = 600\) MW:
\[ 600 = 467.29\lambda - 3616.8 + (694.44\lambda - 5361.1); \]
then \(\lambda = 8.245\), \(P_1 = 235.78\) MW, \(P_2 = 364.22\) MW; both of them are within the desired limits.
3. At \(P_D = 1000\) MW, \(P_1 + P_2 = 1000\) MW:
\[ 1000 = 467.29\lambda - 3616.8 + (694.44\lambda - 5361.1); \]
then \(\lambda = 8.589\), \(P_1 = 396.7\) MW, \(P_2 = 603.3\) MW; both of them are within the desired limits.
Example 3
The following fuel cost equations model a plant consisting of two thermal units.
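\[ F_1 = 0.00381 P_1^2 + 9.2 P_1 + 70.5 \text{ in \$}, \]
\[ F_2 = 0.00455 P_2^2 + 6.0 P_2 + 82.7 \text{ in \$}. \]
1. For a load of 600 MW, determine the optimal power generated by each unit and the equal incremental fuel cost \(\lambda\) at which it operates.
2. For the same load as in Part 1, given that the generation is constrained as
\[ 80.0 \le P_1 \le 250\ \text{MW}, \qquad 120 \le P_2 \le 400\ \text{MW}, \]
at what values of \(\lambda\) should the units be operated?
Solution
1. From the given data, \(\beta_1 = 9.2\), \(\gamma_1 = 0.00381\), \(\beta_2 = 6.0\), and \(\gamma_2 = 0.00455\). By the formula for the incremental fuel cost,
\[ \lambda = \frac{2P_D + \sum_i \beta_i/\gamma_i}{\sum_i 1/\gamma_i}, \]
where
\[ \sum_i \frac{\beta_i}{\gamma_i} = \frac{9.2}{0.00381} + \frac{6.0}{0.00455} = 3.733 \times 10^3, \]
\[ \sum_i \frac{1}{\gamma_i} = \frac{1}{0.00381} + \frac{1}{0.00455} = 4.822 \times 10^2, \]
\[ \therefore\ \lambda = \frac{2(600) + 3.733 \times 10^3}{4.822 \times 10^2} = 10.23\ \$/\text{MWh}, \]
at which point \(\partial F_i / \partial P_i = \lambda\) for all i, such that
\[ 2(0.00381) P_1 + 9.2 = 10.23 \Rightarrow P_1 = 135\ \text{MW}. \]
Similarly,
\[ 2(0.00455) P_2 + 6.0 = 10.23 \Rightarrow P_2 = 465\ \text{MW}. \]
Notably, \(P_D = P_1 + P_2\) (the power balance equation).
2. For equal incremental fuel costing, there is a violation of the given constraint. Since \(P_2\) is greater than \(P_{2max}\), let \(P_2\) assume its upper bound of 400 MW. Thus,
\[ P_1 = P_D - P_2 = (600 - 400) = 200\ \text{MW}, \]
which is within the desired limits. The incremental cost for each unit is calculated from \(\partial F_i / \partial P_i = \lambda_i\) for all i:
\[ \lambda_1 = 0.00762 P_1 + 9.2 = (0.00762 \times 200) + 9.2 = 10.72\ \$/\text{MWh} \]
and
\[ \lambda_2 = 0.00910 P_2 + 6.0 = (0.00910 \times 400) + 6.0 = 9.64\ \$/\text{MWh}. \]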
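The two parts of Example 3 follow a pattern that can be sketched in code: compute the equal-λ solution in closed form, and if a unit violates a limit, fix it at that limit and redistribute the remaining load. The routine below is an illustrative sketch, not a general-purpose dispatcher; in particular, fixing the first violating unit on each pass is a simplifying assumption, and the function name is hypothetical.

```python
def equal_incremental_cost_dispatch(units, demand):
    """Loss-free dispatch with simple limit handling.

    units: list of (beta, gamma, p_min, p_max) for F = alpha + beta*P + gamma*P**2
    Returns (outputs, incremental_costs), one entry per unit.
    """
    fixed = {}                        # unit index -> output pinned at a limit
    free = list(range(len(units)))    # units still sharing load at equal lambda
    while free:
        remaining = demand - sum(fixed.values())
        sum_ratio = sum(units[i][0] / units[i][1] for i in free)
        sum_inverse = sum(1.0 / units[i][1] for i in free)
        lam = (2.0 * remaining + sum_ratio) / sum_inverse
        trial = {i: (lam - units[i][0]) / (2.0 * units[i][1]) for i in free}
        violators = [i for i in free
                     if not units[i][2] <= trial[i] <= units[i][3]]
        if not violators:
            fixed.update(trial)       # all free units are within limits
            break
        # Simplifying assumption: pin the first violator at its bound.
        i = violators[0]
        fixed[i] = min(max(trial[i], units[i][2]), units[i][3])
        free.remove(i)
    outputs = [fixed[i] for i in range(len(units))]
    costs = [units[i][0] + 2.0 * units[i][1] * outputs[i]
             for i in range(len(units))]
    return outputs, costs


# Part 1: limits wide enough to be inactive. Part 2: limits as given.
part1 = equal_incremental_cost_dispatch(
    [(9.2, 0.00381, 0.0, 1000.0), (6.0, 0.00455, 0.0, 1000.0)], 600.0)
part2 = equal_incremental_cost_dispatch(
    [(9.2, 0.00381, 80.0, 250.0), (6.0, 0.00455, 120.0, 400.0)], 600.0)
```

With the limits applied, unit 2 is held at 400 MW and unit 1 picks up the remaining 200 MW, so the two units no longer operate at the same incremental cost.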
Example 4
Consider a power system that is modeled by two generating plants such that their cost function parameters are:
\[ \beta_1 = 5.93, \quad \gamma_1 = 4.306 \times 10^{-3}, \qquad \beta_2 = 6.01, \quad \gamma_2 = 4.812 \times 10^{-3}. \]
Given also that the network has the following B-coefficients,
\[ B_{11} = 3.95 \times 10^{-4}, \qquad B_{22} = 4.63 \times 10^{-4}, \]
the system load is 700 MW and the constraints on the generation are:
\[ 100 \le P_{g1} \le 500\ \text{MW}, \qquad 80 \le P_{g2} \le 500\ \text{MW}. \]
1. Calculate the incremental cost and the optimal value for the plants' output.
2. Neglecting the transmission losses, repeat Part 1 while considering the generation constraints given above.

Solution
1. Generally, the quadratic approximation of the cost function modeling the ith plant is:
\[ F_i(P_i) = \alpha_i + \beta_i P_i + \gamma_i P_i^2, \]
and the transmission losses can be expressed as
\[ P_{\text{loss}} = \sum_{i=1}^{m} \sum_{j=1}^{m} P_i B_{ij} P_j \]
where m = number of units and \(B_{ij}\) = loss coefficients. For economic dispatch of power from the units with equal cost sharing:
\[ \frac{\partial F_i}{\partial P_i} = \lambda \left( 1 - \frac{\partial P_{\text{loss}}}{\partial P_i} \right). \]
From the data given, we observe that
\[ P_{\text{loss}} = B_{11}P_1^2 + B_{22}P_2^2, \qquad B_{12} = 0. \]
Therefore, for plant 1: \(\beta_1 + 2\gamma_1 P_1 = \lambda(1 - 2B_{11}P_1)\) and for plant 2: \(\beta_2 + 2\gamma_2 P_2 = \lambda(1 - 2B_{22}P_2)\).
Using \(P_2 = P_D - P_1\), further manipulation of the loss equation with substitution of the constants gives:
\[ P_1^2 + 7.5897 \times 10^{4} P_1 - 2.86767 \times 10^{7} = 0 \;\Rightarrow\; P_1 = 376\ \text{MW} \]
\[ \therefore\ P_2 = P_D - P_1 = 700 - 376 = 324\ \text{MW}. \]
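Now, from the equation for plant 1,
\[ \lambda = \frac{\beta_1 + 2\gamma_1 P_1}{1 - 2B_{11}P_1} = \frac{5.93 + (2 \times 4.306 \times 10^{-3} \times 376)}{1 - (2 \times 3.95 \times 10^{-4} \times 376)} = \frac{9.1681}{0.70296} = 13.042\ \text{\$/MWh}. \]
The incremental cost for the plants is 13.042 $/MWh.

2. Neglecting the transmission losses implies that \(\partial P_{\text{loss}}/\partial P_i = 0\). Hence, for optimal power flow (economic dispatch) at each plant, the condition \(\partial F_T/\partial P_i - \lambda_i = 0\) gives
\[ \beta_i + 2\gamma_i P_i = \lambda_i \quad \text{for all } i \]
\[ \therefore\ \beta_1 + 2\gamma_1 P_1 = \lambda_1, \qquad \beta_2 + 2\gamma_2 P_2 = \lambda_2, \qquad \text{and} \qquad P_2 = P_D - P_1. \]
Assuming that \(\lambda_1 = \lambda_2 = \lambda\), then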
81 -k 2ylp1 = 82 + 2y(pD 6.01 - 5.93 ==+A= 82 -81 2(y1 = 655 MW.
+ ~ 2 -) 2(4.306 - 4.812)~10-~
But \(P_{1\max} = 500\) MW; therefore we must set \(P_1\) to 500 MW. It follows that the incremental cost for each unit must then be recalculated. Now, \(P_2 = P_D - P_1 = 700 - 500 = 200\) MW, a value within the desired limits of \(P_2\). Therefore, \(\lambda_1 = \beta_1 + 2\gamma_1 P_1 = 5.93 + (2)(4.306 \times 10^{-3})(500) = 10.236\) $/MWh and \(\lambda_2 = \beta_2 + 2\gamma_2 P_2 = 6.01 + (2)(4.812 \times 10^{-3})(200) = 7.935\) $/MWh. This is an example of a much-simplified iterative process whereby the constraints impose limitations on the desired values of \(\lambda_i\) for each unit.
VII. CONCLUSION
This chapter handled practical system problems formulated as constrained optimization problems. The definition of admissible points was presented as those vectors that satisfy the constraints of the problems. Constrained problems were assumed to have at least one admissible point. The so-called Lagrange multipliers were used as an effective way of dealing with constraints. In Section II theorems on the optimization of constrained functions were presented and necessary and sufficient conditions defined. In Section III a procedure for optimizing constrained problems was stated in the form of sequential steps. Section IV presented some solved problems for illustration and validation purposes. Finally, Section V presented power system application examples such as optimal operation of an all-thermal system using incremental cost loading.

VIII. PROBLEM SET

Problem 4.1
Consider the following problem.
Maximize Subject to:
+
xi - 4x1~2 x22 2 x1 x2 = 1.
+
1. Using the Kuhn-Tucker conditions, find an optimal solution to the problem. 2. Test for the second-order optimality condition. Does the problem have a unique optimal solution? Problem 4.2 Consider the following problem: Maximize Subject to:
+ + + +
3x1 - ;2 x$ x1 x2 x3 5 0 -x1 - 2x2 x32 = 0.
1. Write the Kuhn-Tucker (K-T) optimal conditions.
2. Test for the second-order optimality conditions.
3. Argue why the problem is unbounded.

Problem 4.3
Find all local minimum and maximum points of the following functions for the four indicated domains. Be sure to list each of the specified points along with the corresponding functional value at the particular point. Functions:
+ +
+ + +
1. F(x,,v) = x4 6x2y2 y4 - 2x2 - 2y2. 2. F ( x , y ) = x; - 3x2 4x y 2 . 3. F(x, y ) = x y4.
Domains:
1. {(x,y) : x; + y ; 5 1). 2. { ( x , y ) :x + y 1 2 ) .
Problem 4.4 Find the local and global maxima and minima of the function
I
Problem 4.5
Two electrical generators are interconnected to provide total power to meet the load. Each generator's cost is a function of the power output, as shown in Figure 4.3. All power costs are expressed on a per unit basis. The total power need is at least 60 units. Formulate a minimum cost design problem and solve it graphically. Verify K-T conditions at the solution points and show gradients of cost and constraint functions on the graph.

Problem 4.6
Repeat Example 3 for a transmission loss equation given by \(P_L = 0.08P_2\). All other data are unchanged.

FIGURE 4.3 [Two panels of cost versus generator output: \(F_1 = 1.21 - 1.00P_{g1} + 1.00P_{g1}^2\) and \(F_2 = 0.34 + 0.51P_{g2} + 0.48P_{g2}^2\).]
Chapter 5 Linear Programming and Applications
I. INTRODUCTION

The most general description of the linear programming (LP) problem is given as the problem of allocating a number m of resources among 1, 2, ..., n activities in such a way as to maximize the worth from all the activities. The term “linear” refers to the fact that all the mathematical relationships among the decisions (variables) to allocate resources to activities and the various restrictions applicable therein (constraints), as well as the criterion (objective function), are devoid of any nonlinearity. The objective function is some measure of the overall performance of the activities (e.g., cost, profit, net worth, system efficiency, etc.). Standard notation for linear programming is summarized in Table 5.1. For activity j, \(c_j\) (j = 1, ..., n) is the increase in P (i.e., \(\Delta P\)) that would result from each unit of increase in \(x_j\) (the level of activity j). For resource i, i = 1, ..., m, \(b_i\) is the total amount available for allocation to all the activities. The coefficient \(a_{ij}\) denotes the amount of resource i consumed by each unit of activity j. The set of inputs \((a_{ij}, b_i, c_j)\) constitutes the parameters of the LP model.
II. MATHEMATICAL MODEL AND NOMENCLATURE IN LINEAR PROGRAMMING

The conventional linear programming model reduces to the following problem.
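TABLE 5.1 Notations Commonly Used in Linear Programming

                                  Activity
Resources                1      2      ...    n       Total Resources
1                        a11    a12    ...    a1n     b1
2                        a21    a22    ...    a2n     b2
...                      ...    ...    ...    ...     ...
m                        am1    am2    ...    amn     bm
ΔP/unit of activity      c1     c2     ...    cn

Maximize
\[ P = c^T x \tag{5.1} \]
Subject to:
\[ Ax \le b \tag{5.2} \]
\[ x_j \ge 0, \quad \forall\, j \in \{1, \ldots, n\}, \tag{5.3} \]
where the following vectors are defined.
Decision vector: \( x = [x_1, x_2, \ldots, x_n]^T \).
Cost coefficient array: \( c^T = [c_1, c_2, \ldots, c_n] \).
Constant array: \( b = [b_1, b_2, \ldots, b_m]^T \).
System or state matrix:
\[ A = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{bmatrix}. \]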
The following are important terminologies used in linear programming.
Objective Functions and Constraints. The function P(x) being maximized is called the objective or goal function, subject to the restriction sets of constraints given by equations 5.2 and 5.3. Equation 5.2 represents a restriction set that is often referred to as the functional constraints and equation 5.3 is termed the nonnegativity constraints.

Feasible Solution and Region. Any specification of the variables \(x_j\) is called a solution. A feasible solution is one that satisfies all constraints. The feasible region is the collection of all feasible solutions. If the problem does not have any feasible solution, it is called an infeasible problem.

Optimal Solution. An optimal solution corresponds to the minimum or maximum value of the objective function. The problem is either one of minimization or maximization depending on the nature of the objective function under consideration, that is, cost and profit, respectively.

Multiplicity in Solution. There can be multiple optimal solutions in cases where a number of combinations of the decision variables give the same maximum (or minimum) value.

Unbounded Solutions. There may also be unbounded solutions in that the linear programming problem objective function could be infinitely low or high depending on minimizing or maximizing cases, respectively.
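The terminology above maps directly onto a few lines of code. The following is a minimal sketch, assuming the model of equations 5.1 to 5.3 stored as plain Python lists; the helper names `objective` and `is_feasible` are hypothetical, and the data are those of the two-variable problem used for the graphical method in this chapter.

```python
def objective(c, x):
    """Value of P = c^T x for a given solution x."""
    return sum(cj * xj for cj, xj in zip(c, x))


def is_feasible(A, b, x, tol=1e-9):
    """True if x satisfies the functional constraints A x <= b
    and the nonnegativity constraints x >= 0."""
    if any(xj < -tol for xj in x):
        return False
    for row, bi in zip(A, b):
        if sum(aij * xj for aij, xj in zip(row, x)) > bi + tol:
            return False
    return True


# Two-variable problem: maximize 3*x1 + 5*x2
c = [3, 5]
A = [[1, 0], [0, 2], [3, 2]]
b = [4, 12, 18]

inside = is_feasible(A, b, [2, 6])    # a feasible solution
outside = is_feasible(A, b, [4, 6])   # violates 3*x1 + 2*x2 <= 18
value = objective(c, [2, 6])
```

Any point for which `is_feasible` returns True belongs to the feasible region; an optimal solution is a feasible point at which `objective` is largest.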
A. Implicit Assumptions in Linear Programming

There are certain assumptions utilized in linear programming models that are intended to be only idealized representations of real problems. These approximations simplify the real-life problem to make it tractable. Adding too much detail and precision can make the model too unwieldy for useful analysis. All that is needed is that there be a reasonably high correlation between the prediction of the model and what would actually happen in reality. The four assumptions are as follows.

(a) Proportionality. This relates to the assumption of constant cost coefficients \(c_j\) irrespective of the level of \(x_j\). Stated alternatively, there can be no correlation between the coefficients \(c_j\) and \(x_j\). For example, the model cannot represent an incremental cost that becomes lower with additional units of activity, which is referred to as “increasing marginal return” in the theory of economics. The proportionality assumption is needed essentially to avoid nonlinearity.
(b) Additivity. Apart from the proportionality assumption, the additivity assumption is needed to avoid cross-product terms among decision variables. This assumption amounts to stating that the total contribution from all activities can be obtained by adding individual contributions from respective activities.
(c) Divisibility. The decision variables can take any fractional value between specified limits. In other words, the variables can not be restricted to take some discrete values or integer values.
(d) Certainty. All the parameters in the model are assumed to be known constants with no uncertainty about the values that these parameters may assume.

It is important in most cases to study and analyze the disparities due to these assumptions by checking with more complex models. The more complex alternative models result from relaxing the assumptions: nonlinear programming models from relaxing Assumption (a) and/or Assumption (b), integer programming models from relaxing Assumption (c), and stochastic programming models from relaxing Assumption (d). More complex models are obtained by dropping various combinations of Assumptions (a) to (d).

III. LINEAR PROGRAMMING SOLUTION TECHNIQUES
Various approaches have been developed over the years to solve linear programming problems. The commonly encountered techniques that have gained wide attention from engineers, mathematicians, and economists are the graphical approach, the simplex method, the revised simplex method, and the tableau approach. In the text, we omit the treatment of the tableau approach as its flexibility in the development of programs for more complex algorithms outdoes its ability to solve large-scale problems.

A. The Graphical Method

In the linear programming problem, the optimal solution (if it exists) lies at one of the corner points of the polytope formed by the boundary conditions of the functional and nonnegativity constraints of the problem. In the graphical method, each corner-point solution, which is also a feasible solution, is checked using the objective function. The solution that yields the greatest improvement to the objective value is the optimal solution to the problem. The graphical technique for solving LP problems is demonstrated by the following example. Consider the following two-variable linear programming problem.

Maximize
\[ P = 3x_1 + 5x_2 \]
Subject to:
\[ x_1 \le 4, \qquad 2x_2 \le 12, \qquad 3x_1 + 2x_2 \le 18, \qquad x_1 \ge 0,\ x_2 \ge 0. \]

A two-dimensional Figure 5.1 can be constructed corresponding to the two variables \(x_1\) and \(x_2\). The nonnegativity constraints automatically imply that the search is restricted to the positive side of the \(x_1\) and \(x_2\) axes. Next, it should be observed that the feasible solution can not lie to the right of the line \(x_1 = 4\), because of the restriction \(x_1 \le 4\). In the same way, the restrictions \(2x_2 \le 12\) and \(3x_1 + 2x_2 \le 18\) provide the other two cuts to generate the feasible region in Figure 5.1 as indicated by the shaded area OABCD.

FIGURE 5.1 Graphical representation of the two-dimensional problem. [The feasible region OABCD is shown together with the objective lines P = 0, P = 15, P = 30, and P = 36.]
The vertices A, B, C, D and the origin of the polytope are the corner-point solutions to this problem and at least one of such points represents the optimal solution set for \(x_1\) and \(x_2\). We must now select the point in this region that maximizes the value of \(P = 3x_1 + 5x_2\). Different lines corresponding to different values of P are drawn to check the point beyond which there is no point of the feasible region that lies on the line \(P = 3x_1 + 5x_2\). This is achieved by drawing a line for P = 0 and then progressively increasing it to 15, 30, and so on. It is found that the optimal solution is:
\[ x_1 = 2, \qquad x_2 = 6 \]
and the maximum value of P is:
\[ P = 36. \]
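The corner-point search that the graphical method performs by eye can also be carried out by enumeration. The following is a minimal sketch for the two-variable example only, assuming each constraint is written as \(a_1 x_1 + a_2 x_2 \le r\) (nonnegativity included as \(-x_j \le 0\)); the helper names are hypothetical and exact fractions are used to avoid rounding.

```python
from fractions import Fraction
from itertools import combinations

# Each constraint is (a1, a2, r), meaning a1*x1 + a2*x2 <= r.
constraints = [
    (1, 0, 4),     # x1 <= 4
    (0, 2, 12),    # 2*x2 <= 12
    (3, 2, 18),    # 3*x1 + 2*x2 <= 18
    (-1, 0, 0),    # x1 >= 0
    (0, -1, 0),    # x2 >= 0
]


def intersection(c1, c2):
    """Intersection of the two boundary lines, or None if parallel."""
    a1, b1, r1 = c1
    a2, b2, r2 = c2
    det = a1 * b2 - a2 * b1
    if det == 0:
        return None
    # Cramer's rule for the 2x2 system
    x = Fraction(r1 * b2 - r2 * b1, det)
    y = Fraction(a1 * r2 - a2 * r1, det)
    return (x, y)


def feasible(point):
    """True if the point satisfies every constraint."""
    return all(a * point[0] + b * point[1] <= r for a, b, r in constraints)


corners = set()
for c1, c2 in combinations(constraints, 2):
    point = intersection(c1, c2)
    if point is not None and feasible(point):
        corners.add(point)

best = max(corners, key=lambda p: 3 * p[0] + 5 * p[1])
best_value = 3 * best[0] + 5 * best[1]
```

Enumeration of this kind grows combinatorially with the number of constraints, which is why it serves only as an illustration for small problems.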
This is shown in Figure 5.1 as vertex B of the polytope.

B. Matrix Approach to Linear Programming

The matrix approach is quite effective for the computer-based solution of LP problems. It generally first requires a matrix inversion. The simplex method equipped with an artificially constructed identity matrix for initialization is called the revised simplex method. The revised simplex method and a flowchart for computer programming are discussed with illustrative problems. The integer programming (IP) problem is to be solved as an LP in conjunction with the Gomory cut, which eliminates the nonintegral optimal solution step by step. The purpose of linear programming is to solve the problem formulated in the following standard form.

Maximize
\[ P = c^T x \tag{5.4} \]
Subject to:
\[ Ax = b \tag{5.5} \]
where
\[ b_i > 0 \ \text{for all } i = 1, 2, \ldots, m, \qquad x_j \ge 0 \ \text{for all } j = 1, 2, \ldots, n. \tag{5.6} \]
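We now define the following arrays in preparation for formulating the linear programming problem in this chapter.

(a) m-vectors
\[ b = [b_1, b_2, \ldots, b_m]^T, \qquad Y = [y_1, y_2, \ldots, y_m]^T \]
(b) n-vectors
\[ U = [u_1, u_2, \ldots, u_n]^T, \qquad x = [x_1, x_2, \ldots, x_n]^T, \qquad c = [c_1, c_2, \ldots, c_n]. \]
(c) (n - m)-vectors for the case when n > m,
\[ Z = [z_1, z_2, \ldots, z_{n-m}]. \]
(d) Matrices
\[ A = [a_{ij}]_{m \times n} = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{bmatrix}. \]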
Now let \(B\) represent an m x m matrix and \(\bar{B}\) represent an m x (n - m) matrix such that \(A = [B, \bar{B}]\). Matrix A is now said to be augmented into two components. The state vector x is a feasible solution (or simply called a solution) if it satisfies the constraint equations 5.5 and 5.6. It is a nondegenerate basic feasible solution (or simply called a basic solution) if it contains exactly m positive components with all others equal to zero. It is a maximum feasible solution (or simply called a maximum solution) if it is a solution and also maximizes the objective function P. In practical applications, a given problem may not appear the same as the standard form. For example, the state variable \(x_i\) may be bounded, the constraints may be inequalities, or the objective function may require minimization. In most cases, the following rules can be useful in converting a practical problem to the standard form.
Rule 1. Change of Constraints on State Variables

The bounded state variable of the following forms can be converted to the inequality form, equation 5.6.

1. If \(x_i \le d_i\), then add a slack variable \(x_s\) to make \(x_i + x_s = d_i\), or replace \(x_i\) by \(d_i - x_s\), where \(x_s\) is nonnegative.
2. If \(x_i \ge d_i\), then add a slack variable \(x_s\) to make \(x_i - x_s = d_i\), or replace \(x_i\) by \(x_s + d_i\).
3. If \(-\infty < x_i < \infty\), then replace \(x_i\) by \(x_s - x_{s+1}\).
4. If \(d_i \le x_i \le h_i\), then the constraint is equivalent to \(0 \le x_i - d_i \le h_i - d_i\) and hence \(x_i\) is to be replaced by \(x_s + d_i\), and \(x_s + x_{s+1} = h_i - d_i\) is a new constraint.

Rule 2. Conversion from Inequality to Equality Constraint

We use \((Ax)_i\) to denote the ith row of the vector \((Ax)\) in the following.

1. If \(d_i \le (Ax)_i \le h_i\), then two slack variables are needed to make \((Ax)_i - x_s = d_i\) and \((Ax)_i + x_{s+1} = h_i\).
2. If \((Ax)_i = b_i \le 0\), then multiply both sides by -1 when \(b_i < 0\), and create a new constraint by adding it to another constraint when \(b_i = 0\).

Rule 3. Modification of Objective Function

1. Minimization of P. Any solution for max(-P) is the same as that for min(P), but min(P) = -max(-P).
2. Maximization of |P|. Find both max(P) and min(P) subject to the same constraint and then select the larger of |max(P)| or |min(P)|.

C. The Simplex Method
The geometric method can not be extended to large-scale problems for which the number of decision variables can run into several thousands. The simplex method, developed by Dantzig in 1947, and its variants have been widely used to date and a number of commercial solvers have been developed based on it. The first step in setting up the simplex method converts the inequality constraints into equivalent equality constraints by adding slack variables. Recall the following example.

Maximize
\[ P = 3x_1 + 5x_2 \]
Subject to:
\[ x_1 \le 4, \qquad 2x_2 \le 12, \qquad 3x_1 + 2x_2 \le 18, \qquad x_1 \ge 0,\ x_2 \ge 0. \]

This illustrative problem can be converted into the following equivalent form by adding three slack variables \(x_3\), \(x_4\), and \(x_5\). Thus, we obtain:

Maximize
\[ P = 3x_1 + 5x_2 \]
Subject to:
\[ x_1 + x_3 = 4 \]
\[ 2x_2 + x_4 = 12 \]
\[ 3x_1 + 2x_2 + x_5 = 18 \]
\[ x_j \ge 0, \quad \forall\, j \in \{1, \ldots, 5\}. \]
This is called the augmented form of the problem and the following definitions apply to the newly formulated linear programming problem. An augmented solution is a solution for the original variables that has been augmented by the corresponding values of the slack variables. A basic solution is an augmented corner-point solution. A basic feasible solution is an augmented corner-point feasible solution. For example, augmenting the solution (3, 2) in the example yields the augmented solution (3, 2, 1, 8, 5); and the corner-point solution (4, 6) is translated as (4, 6, 0, 0, -6). The main difference between a basic solution and a corner-point solution is in whether the values of the slack variables are included. Because the terms basic solution and basic feasible solution are integral parts of linear programming techniques, their algebraic properties require further clarification. It should be noted that there are five variables and only three equations in the present problem. This implies that there are two degrees of freedom in solving the system since any two variables can be chosen to be set equal to any arbitrary value in order to solve the three equations for the three remaining variables. The variables that are currently set to zero by the simplex method are called nonbasic variables, and the others are called basic variables.
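The augmentation of a solution by its slack values can be reproduced in a few lines. This is a minimal sketch, assuming constraints of the form \(Ax \le b\) so that each slack equals \(b_i - (Ax)_i\); the function name `augment` is a hypothetical helper.

```python
def augment(A, b, x):
    """Append the slack values s_i = b_i - (A x)_i to the point x."""
    slacks = [bi - sum(aij * xj for aij, xj in zip(row, x))
              for row, bi in zip(A, b)]
    return tuple(x) + tuple(slacks)


A = [[1, 0], [0, 2], [3, 2]]
b = [4, 12, 18]

interior = augment(A, b, (3, 2))   # feasible, all slacks positive
corner = augment(A, b, (4, 6))     # infeasible corner point, one negative slack

# A point is feasible exactly when every component is nonnegative.
interior_ok = all(v >= 0 for v in interior)
corner_ok = all(v >= 0 for v in corner)
```

The negative last component of the second result is what marks the corner point (4, 6) as infeasible.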
The resulting solution is a basic solution. If all the basic variables are nonnegative, the solution is called a basic feasible solution. Two basic feasible solutions are adjacent if all but one of their nonbasic variables are the same. The final modification needed to use the simplex method is to convert the objective function itself into the form of a constraint to get:
\[ P - 3x_1 - 5x_2 = 0. \]
Obviously, no slack variables are needed since it is already in the form of an equality. Also, the goal function P can be viewed as a permanent additional basic variable. The feasibility of achieving a solution for the linear programming problem as described by equations 5.4 through 5.6 relies on a set of optimality conditions. In solving the problem, the optimality conditions are forced to be satisfied by means of the simplex method, involving a systematic process of selecting a basis to increase the objective function P until it can no longer be increased. We assume that the problem is nondegenerate; that is, all solutions are basic. The following notations are used in the statement and the theorem in this chapter.

\(B\): an m-square and nonsingular matrix formed by m columns of A.
\(x_B\): an m-vector of which the components are the positive components of a basic solution x corresponding to B; that is, \(x_B = B^{-1}b\).
\(C_B\): an m-row vector formed by m components of C that correspond to \(x_B\); that is, \(P = Cx = C_B x_B\).
\(C_{\bar{B}}\): an (n - m)-row vector that makes \(C = (C_B, C_{\bar{B}})\).
\(c_{\bar{B}k}\): the kth component of \(C_{\bar{B}}\).
\(\bar{a}_k\): the kth column of the matrix \(\bar{B}\).
\(Z\): an (n - m)-row vector defined by \(Z = C_B B^{-1} \bar{B}\).
\(z_k\): the kth component of Z.

Theorem 5.1: The objective function P can be increased if \(z_k - c_{\bar{B}k} < 0\) for some k, and the components of \(x_B\) are the positive components of a maximum solution if \(Z - C_{\bar{B}} \ge 0\).

The theorem gives a condition on optimality that suggests a systematic approach to a maximum solution of the linear programming problem. It is assumed in the simplex method that initially there exists a basis, which yields a basic solution. In practical problems, such a basis is hard if not impossible to find, especially for large m. To overcome this difficulty, an identity matrix is created here for the initialization of the simplex method. The simplex method requires that the basis matrix be inverted at each iteration. We note that there is only one column change between two consecutive bases. This enables us to apply a lemma of matrix inversion to obtain the inverse matrix for the new basis from that of the old one. Consider the following example.
Introduce artificial variables x5 and x6 such that
Then, the matrix A of equation 5.5 has the form:
1
d=[l
1 1 1 0 0 2 1 0 1 01 2 - 1 -1 0 0 1
and, hence, an identity matrix is created by the last three columns. The basic solution corresponding to the basis of the identity matrix is, therefore, \(x_4 = b_1\), \(x_5 = b_2\), and \(x_6 = b_3\). There are no more than m artificial variables because some state or slack variables can be utilized as artificial variables, such as \(x_4\) in this example.

D. A Lemma of Matrix Inversion

The equation,
\[ (P^{-1} + H_1 Q^{-1} H_2)^{-1} = P - P H_1 (Q + H_2 P H_1)^{-1} H_2 P \tag{5.7} \]
is true if the inverses exist. The lemma can be applied generally to the n-square matrix P, m-square matrix Q, n x m matrix \(H_1\), and m x n matrix \(H_2\). The integer m is practically less than n. We make use of the identity to generate a recursive formula of matrix inversion that is useful to linear programming. To this end, we assume that \(P = B^{-1}\), \(Q = 1\), \(H_1 = \bar{a}_k - a_r\), and \(H_2 = J_r^T\), where \(J_r\) is the column matrix with rth component equal to one but all others equal to zero. Thus, the following result can be obtained for nonsingular matrix B.
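(a)
\[ 1 + H_2 B^{-1} H_1 = 1 + J_r^T B^{-1}(\bar{a}_k - a_r) = 1 + J_r^T y - J_r^T B^{-1} a_r = 1 + y_r - J_r^T J_r = y_r \]
where \(B^{-1}\bar{a}_k = y = [y_1, y_2, \ldots, y_m]^T\) and \(a_r\) is the rth column of the B matrix.

(b)
\[ B^{-1} H_1 = B^{-1}(\bar{a}_k - a_r) = y - J_r. \]

Substitutions of (a) and (b) into the lemma give
\[ (B + H_1 H_2)^{-1} = B^{-1} - \frac{1}{y_r}\,(y - J_r)\, J_r^T B^{-1}. \]
Let \(b_{ij}\) and \(b'_{ij}\) be the general elements of the matrices \(B^{-1}\) and \((B + H_1 H_2)^{-1}\), respectively. Then, it follows from the above equations that
\[ b'_{ij} = b_{ij} - \frac{y_i}{y_r}\, b_{rj} \quad \text{for } i \ne r \tag{5.8} \]
\[ b'_{rj} = \frac{b_{rj}}{y_r} \quad \text{for } i = r. \tag{5.9} \]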
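The element-wise update of the basis inverse can be sketched in code as follows. This is a minimal illustration, assuming zero-based indices, exact fractions, and a hypothetical helper name `update_inverse`; the numerical data (an identity basis whose second column is replaced by the column (3, 4)) are chosen only for illustration.

```python
from fractions import Fraction


def update_inverse(B_inv, y, r):
    """Inverse of the basis obtained by replacing column r of B with the
    entering column a_k, given B_inv = B^{-1} and y = B^{-1} a_k."""
    if y[r] == 0:
        raise ValueError("pivot element y_r must be nonzero")
    m = len(B_inv)
    new_inv = []
    for i in range(m):
        if i == r:
            # b'_rj = b_rj / y_r
            row = [B_inv[r][j] / y[r] for j in range(m)]
        else:
            # b'_ij = b_ij - (y_i / y_r) * b_rj
            row = [B_inv[i][j] - (y[i] / y[r]) * B_inv[r][j] for j in range(m)]
        new_inv.append(row)
    return new_inv


def matmul(X, Y):
    """Plain matrix product of two lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]


identity = [[Fraction(1), Fraction(0)], [Fraction(0), Fraction(1)]]
a_k = [Fraction(3), Fraction(4)]           # entering column (illustrative)
y = [row[0] * a_k[0] + row[1] * a_k[1] for row in identity]  # y = B^{-1} a_k
new_inv = update_inverse(identity, y, r=1)

# New basis: first column unchanged, second column replaced by a_k.
new_B = [[Fraction(1), Fraction(3)], [Fraction(0), Fraction(4)]]
check = matmul(new_inv, new_B)             # should reproduce the identity
```

The update costs on the order of m squared operations, compared with a full inversion of the new basis at every iteration.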
As such, the matrix inversion for the new basis can be derived from that of the old one according to equations 5.8 and 5.9. Now, the simplex method can be broken down into three stages, namely,
1. initialization,
2. iterative procedure, and
3. optimality checking.

1. Initialization Step

The simplex method can start at any corner-point feasible solution; thus a convenient one such as (0, 0) is chosen for the original variables. Therefore the slack variables become the basic variables and the initial basic feasible solution is obtained at (0, 0, 4, 12, 18).

2. Iterative Procedure

The iterative procedure involves moving from one corner-point solution to an adjacent basic feasible solution. This movement involves converting one nonbasic variable into a basic variable and simultaneously converting a basic variable into a nonbasic one. The former is referred to as the entering variable and the latter as the leaving variable. The selection of the nonbasic variable to be converted into a basic variable is based on improvement in the objective function P. This can be calculated from the new representation of the objective function in the form of a constraint. The nonbasic variable that contributes the largest increment in the P value is the entering variable. For the problem at hand, the two nonbasic variables \(x_1\) and \(x_2\) add per-unit contributions to the objective function of 3 and 5, respectively (the coefficients \(c_j\)), and, hence, \(x_2\) is selected as the entering variable. The adjacent basic feasible solution is reached when the first of the basic variables reaches a value of zero. Thus, once the entering variable is selected, the leaving variable is not a matter of choice. It has to be the current basic variable whose nonnegativity constraint imposes the smallest upper bound on how much the entering basic variable can be increased. The three candidate leaving variables are \(x_3\), \(x_4\), and \(x_5\). The calculation of the upper bound is illustrated in Table 5.2. Since \(x_1\) remains a nonbasic variable, \(x_1 = 0\), which implies \(x_3\) remains nonnegative irrespective of the value of \(x_2\). That \(x_4 = 0\) when \(x_2 = 6\), and \(x_5 = 0\) when \(x_2 = 9\), indicates that the lowest upper bound on \(x_2\) is 6, and is determined when \(x_4 = 0\). Thus, \(x_4\) is the current leaving variable.

3. Optimality Test
To determine whether the current basic feasible solution is optimal, the objective function can be rewritten further in terms of the nonbasic variables:
\[ P = 30 + 3x_1 - \tfrac{5}{2}x_4. \]
Increasing the nonbasic variables from zero would result in moving towards one of the two adjacent basic feasible solutions. Because \(x_1\) has a positive coefficient, increasing \(x_1\) would lead to an adjacent basic feasible solution that is better than the current basic feasible solution; thus, the current solution is not optimal. Speaking generally, the current feasible solution is optimal if and only if all of the nonbasic variables have nonpositive coefficients in the current form of the objective function. The flowchart of the simplex method is shown in Figure 5.2.

TABLE 5.2 Calculation of Upper Bound for x2

Basic Variable    Equation                  Upper Bound for x2
x3                x3 = 4 - x1               No limit
x4                x4 = 12 - 2x2             x2 <= 12/2 = 6  (minimum)
x5                x5 = 18 - 3x1 - 2x2       x2 <= 18/2 = 9

E. The Revised Simplex Method

The so-called revised simplex method consists of two phases.
FIGURE 5.2 Flowchart showing the three stages of the simplex method. [Legible flowchart boxes: choose a basis B and determine \(B^{-1}\); calculate \(B^{-1}b\); calculate \(C_B B^{-1}\) and \(Z - C_{\bar{B}} = C_B B^{-1}\bar{B} - C_{\bar{B}}\); determine the entering and leaving vectors of the basis; replace \(x_B\); answer P.]
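The three stages in the flowchart (initialization, iteration, optimality test) can be sketched in code. The following is a minimal illustration, not the procedure of the text: for brevity it keeps the whole system in a compact tableau instead of updating the basis inverse, it assumes constraints of the form \(Ax \le b\) with \(b \ge 0\) so that the slack variables give the starting basis, and the function name `simplex_max` is hypothetical.

```python
from fractions import Fraction


def simplex_max(c, A, b):
    """Maximize c.x subject to A x <= b, x >= 0, assuming b >= 0.

    Returns (x, P). Raises ValueError if the objective is unbounded.
    """
    m, n = len(A), len(c)
    # Initialization: rows are [A | I | b]; slack variables form the basis.
    rows = [[Fraction(v) for v in A[i]]
            + [Fraction(int(i == k)) for k in range(m)]
            + [Fraction(b[i])] for i in range(m)]
    # Objective row holds z_j - c_j; its last entry is the current P.
    obj = [Fraction(-v) for v in c] + [Fraction(0)] * (m + 1)
    basis = list(range(n, n + m))

    while True:
        # Optimality test: stop when every z_j - c_j is nonnegative.
        k = min(range(n + m), key=lambda j: obj[j])
        if obj[k] >= 0:
            break
        # Ratio test: the leaving row gives the smallest upper bound.
        ratios = [(rows[i][-1] / rows[i][k], i)
                  for i in range(m) if rows[i][k] > 0]
        if not ratios:
            raise ValueError("objective is unbounded")
        _, r = min(ratios)
        # Pivot on element (r, k).
        pivot = rows[r][k]
        rows[r] = [v / pivot for v in rows[r]]
        for i in range(m):
            if i != r and rows[i][k] != 0:
                factor = rows[i][k]
                rows[i] = [v - factor * w for v, w in zip(rows[i], rows[r])]
        factor = obj[k]
        obj = [v - factor * w for v, w in zip(obj, rows[r])]
        basis[r] = k

    x = [Fraction(0)] * (n + m)
    for i, j in enumerate(basis):
        x[j] = rows[i][-1]
    return x[:n], obj[-1]


x_opt, p_opt = simplex_max([3, 5], [[1, 0], [0, 2], [3, 2]], [4, 12, 18])
```

For the two-variable example the loop performs two pivots: \(x_2\) enters and \(x_4\) leaves, then \(x_1\) enters and \(x_5\) leaves.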
Phase I
This first phase starts with an identity matrix for the basis (an artificial basis) and the maximization of \(x_0\), where
\[ x_0 = -(\text{sum of all the artificial variables}). \]
The coefficients of the objective function \(x_0\) are all equal to -1. If the maximized \(x_0\) is less than zero, then at least one of the artificial variables is different from zero and, therefore, no feasible solution exists for the problem. We enter Phase II if all the artificial variables are eliminated or \(\max(x_0) = 0\).

Phase II
The aim of Phase II is to maximize the goal function P of the problem with a matrix inversion for the basis inherited from Phase I. It should be noted that the constraints of both phases are the same although the objective functions are different. Both Phase I and Phase II can be implemented either by digital computers or by hand calculations in accordance with the flowchart.

Illustrative Example
Minimize
\[ Q = 2x_1 + x_2 + x_3 \]
Subject to the inequality constraints given by:
\[ 3x_1 + 5x_2 + 2x_3 \ge 16 \]
\[ 4x_1 - 2x_2 + x_3 \ge 3 \quad \text{and} \quad x_i \ge 0 \ \text{for all } i \in \{1, 2, 3\} \]
using
1. The simplex method
2. The revised simplex method.

1. Solution Using the Simplex Method

We convert the minimization problem to a maximization one by changing the sign such that:
\[ P = -2x_1 - x_2 - x_3. \]
Utilizing two slack variables \(x_4\) and \(x_5\) and two artificial variables \(x_6\) and \(x_7\) we obtain the following equations.
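\[ 3x_1 + 5x_2 + 2x_3 - x_4 + x_6 = 16 \]
\[ 4x_1 - 2x_2 + x_3 - x_5 + x_7 = 3. \]
In matrix notation:
\[ A = [\bar{B} : B] = \left[ \begin{array}{rrrrrcrr} 3 & 5 & 2 & -1 & 0 & : & 1 & 0 \\ 4 & -2 & 1 & 0 & -1 & : & 0 & 1 \end{array} \right] \quad \text{and} \quad b = [16,\ 3]^T. \]
Notably, \(B = B^{-1} = I\). Thus, \(x_B = B^{-1}b = Ib = [16,\ 3]^T\).

First Iteration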
=[2
1
1
which is positive definite.
= [O
0 4
:"I
=
and
\[ x_1 = x_2 = x_3 = 0, \qquad \max P = 0 \;\rightarrow\; \min Q = 0. \]
Note: This solution is infeasible, as the second constraint \(4x_1 - 2x_2 + x_3 \ge 3\) is not satisfied here (its left-hand side is zero). The revised simplex method shows this result.
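2. Solution Using the Revised Simplex Method

Phase I

First Iteration
Step 1
\[ Z - C_{\bar{B}} = C_B B^{-1} \bar{B} - C_{\bar{B}} = [-1 \ \ {-1}] \begin{bmatrix} 3 & 5 & 2 & -1 & 0 \\ 4 & -2 & 1 & 0 & -1 \end{bmatrix} = [-7 \ \ {-3} \ \ {-3} \ \ 1 \ \ 1]. \]
The minimum of the previous row is -7; then the first vector of \(\bar{B}\), \(x_1\), enters the basis (k = 1).
Step 2
Calculate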
Now, r = 2 implies that the second vector of B is to leave the basis. Step 3 The new basic solution is:
XB==
I
3 =;T x7 = 3 - (4(3) =
{i}.
Step 4 Now we calculate the inverse of the new basic
i#2
fori#r+
bh=bv--b, Yi Yr
bzl = b21 - p ) b 2 2 = 0 - (!)(1)
4
= -3
Y2
[BI =
[ -25
[C,]=[O
2 - 1 1 0
-21
0 -1
1
xh=[xh
x+
[i01.
Second Iteration Step I
=$[O
-21
5 -2
which is positive definite; then
2 - 1 1 0 I]-[-1
0
]
-CB
I f 0 01
and the artificial variables are not all eliminated from the basis. Therefore we can not enter Phase II of the calculations as the function has an infeasible solution, as shown in the simplex method.

IV. DUALITY IN LINEAR PROGRAMMING

Linear programming problems exhibit an important property known as duality. The original problem is therefore referred to as the primal, and there exists a relationship between the primal of the linear programming problem and its dual. Now, from equations 5.1 to 5.3, the general linear programming problem in its primal form may be expressed as

Maximize
\[ P(x) = c^T x \]
Subject to the constraints:
\[ Ax \le b \quad \text{and} \quad x \ge 0 \]
where A is an m x n matrix, x is a column n-vector, \(c^T\) is a row n-vector, and b is a column m-vector. The linear programming model above may also be written as

Maximize
\[ P(x) = \sum_{j=1}^{n} c_j x_j \tag{5.10} \]
Subject to the constraints:
\[ \sum_{j=1}^{n} a_{ij} x_j \le b_i \tag{5.11} \]
and
\[ x_j \ge 0, \quad \forall\, i \in \{1, \ldots, m\} \ \text{and} \ \forall\, j \in \{1, \ldots, n\}. \]
Notably, the slack variables have not yet been introduced in the inequality constraints. By the definition of the dual of the primal linear programming problem, we obtain the following model, expressed as equations 5.12 to 5.14.

Minimize
\[ Q(y) = b^D y \tag{5.12} \]
Subject to the constraints:
\[ A^D y \ge c^D \tag{5.13} \]
and
\[ y \ge 0 \tag{5.14} \]
where \(A^D = A^T\) is an n x m matrix, y is a column m-vector, \(b^D = b^T\) is a row m-vector, and \(c^D = c\) is a column n-vector. Similar to the model for the primal case, the dual linear programming model above may also be written as

Minimize
\[ Q(y) = \sum_{i=1}^{m} b^D_i y_i \tag{5.15} \]
Subject to the constraints:
\[ \sum_{i=1}^{m} a^D_{ji} y_i \ge c^D_j \tag{5.16} \]
and
\[ y_i \ge 0, \quad \forall\, i \in \{1, \ldots, m\} \ \text{and} \ \forall\, j \in \{1, \ldots, n\}. \]

The dual problem can now be solved using any of the previously discussed solution techniques such as the revised simplex method. In addition, the duality model exhibits special properties that can be summarized using the following well-known duality theorems.

Theorem 5.1. The dual of the dual linear programming model is the primal.

Theorem 5.2. The value of the objective function P(x) for any feasible solution of the primal is less than or equal to the value of the objective function Q(y) for any feasible solution of the dual.

Theorem 5.3. The optimum value of the objective function P(x) of the primal, if it exists, is equal to the optimum value of the objective function Q(y) of the dual.

Theorem 5.4. The primal has an unbounded optimum if and only if the dual has no feasible solution. The converse is also true.

Finally, we make note of the fact that the duality principle, when applied correctly, can reduce the computation needed to solve multidimensional problems. This is attributed to the fact that the large number of constraints are now modeled as the objective function and the old objective with smaller dimension has become the new constraints. Also, the duality concept can be applied to problems that are associated with the inverse of the A matrix in the primal model of the linear programming problem.
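The relationship between primal and dual objective values can be checked numerically on the two-variable example of Section III. The following is a minimal sketch, assuming the primal "maximize \(c^T x\) subject to \(Ax \le b\), \(x \ge 0\)" and the dual "minimize \(b^T y\) subject to \(A^T y \ge c\), \(y \ge 0\)"; the helper names and the particular trial points are illustrative assumptions.

```python
from fractions import Fraction

A = [[1, 0], [0, 2], [3, 2]]
b = [4, 12, 18]
c = [3, 5]


def primal_feasible(x):
    """A x <= b and x >= 0."""
    return (all(v >= 0 for v in x) and
            all(sum(a * v for a, v in zip(row, x)) <= bi
                for row, bi in zip(A, b)))


def dual_feasible(y):
    """A^T y >= c and y >= 0."""
    return (all(v >= 0 for v in y) and
            all(sum(A[i][j] * y[i] for i in range(len(A))) >= c[j]
                for j in range(len(c))))


def P(x):
    return sum(cj * xj for cj, xj in zip(c, x))


def Q(y):
    return sum(bi * yi for bi, yi in zip(b, y))


# Some feasible points of each problem (chosen for illustration)
primal_points = [[0, 0], [4, 3], [2, 6]]
dual_points = [[3, 0, Fraction(5, 2)], [0, Fraction(3, 2), 1]]

# Every primal value is bounded above by every dual value.
weak_duality_holds = all(P(x) <= Q(y)
                         for x in primal_points for y in dual_points)

# The bound is attained by the pair x = (2, 6), y = (0, 3/2, 1).
x_opt, y_opt = [2, 6], [0, Fraction(3, 2), 1]
```

Because the primal value at `x_opt` equals the dual value at `y_opt`, both points are optimal for their respective problems.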
V. MIXED-INTEGER PROGRAMMING

Integer and mixed-integer programming problems are special classes of linear programming where all or some of the decision variables are restricted to integer values. There are many practical examples where the "divisibility" assumption in linear programming needs to be dropped and some of the variables can take only discrete values. However, even greater importance can be attributed to problems where the discrete values are restricted to zero and one only, that is, "yes" or "no" decisions or binary decision variables. In fact, in many instances, mixed-integer programming problems can be reformulated to have only binary decision variables, which are easier to handle. The occurrence of binary variables may be due to a variety of decision requirements, the most common among which are the following.

1. ON/OFF decisions. The most common type of binary decision falls into this category for engineering optimization problems. This decision variable can also have the alternative representation of GO/NO GO, BUILD/NOT BUILD, or SCHEDULE/NOT SCHEDULE, and so on, depending on the specific application under consideration in short-, medium-, and long-term planning contexts.

2. Logical EITHER-OR/AND constraints. Binary variables can also indirectly handle mutual inclusion or exclusion restrictions. For example, there might be cases where a choice can be made between two constraints, so that only one must hold. Also, there could be cases where process B must be selected if process A has already been selected.

3. K out of N constraints must hold. Consider the case where the overall model includes a set of N possible constraints such that only some K of these constraints must hold (assuming K < N). Part of the optimization task is to choose which combination of K constraints permits the objective function to reach its best possible value. In fact, this is nothing but a generalization of the either-or constraints, and it can handle a variety of problems.

4. Function with N possible values. In many real-life problems, the functions do not have smooth continuous properties, but can take only a few discrete values. For example, consider the following case.
$$f(x_1, \ldots, x_n) = d_1 \text{ or } d_2, \ldots, d_N. \tag{5.17}$$
The equivalent integer programming formulation would be:
$$f(x_1, \ldots, x_n) = \sum_{i=1}^{N} d_i v_i \tag{5.18}$$
$$\sum_{i=1}^{N} v_i = 1 \tag{5.19}$$
and $v_i$ = binary (0 or 1) for $i = 1, \ldots, N$.

5. The fixed-charge problem. In most problems, it is common to incur a fixed-cost/set-up charge when undertaking a new activity. In a process-engineering context, it might be related to the set-up cost for the production facility to initiate a run. A typical power system example is the start-up cost of a thermal-generating unit. This fixed charge is often independent of the length or level of the activity and, hence, cannot be approximated by allocating it to the (continuous) level of activity variables.
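The either-or and K-out-of-N requirements in items 2 and 3 are commonly written with a large constant. The following is a sketch of that standard formulation; the symbols $M$ (a sufficiently large positive constant), $y$, $y_i$, $a_i$, and $b_i$ are introduced here for illustration and do not appear in the text.

$$
\begin{aligned}
\text{Either-or:}\quad & a_1^T x \le b_1 + M y, \qquad a_2^T x \le b_2 + M(1 - y), \qquad y \in \{0, 1\},\\
K \text{ out of } N:\quad & a_i^T x \le b_i + M y_i \;\; (i = 1, \ldots, N), \qquad \sum_{i=1}^{N} y_i = N - K, \qquad y_i \in \{0, 1\}.
\end{aligned}
$$

Setting $y = 0$ enforces the first constraint and relaxes the second; setting $y = 1$ does the opposite. In the general case, the $N - K$ constraints with $y_i = 1$ are relaxed and the remaining $K$ must hold.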
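Mathematically, the total cost comprising fixed and variable charges can be expressed as
$$f_j(x_j) = \begin{cases} K_j + C_j x_j & \text{if } x_j > 0 \\ 0 & \text{if } x_j = 0. \end{cases} \tag{5.20}$$
The mixed-integer programming (MIP) transformation would look like

Minimize
$$Z = \sum_{j=1}^{n} (C_j x_j + K_j y_j), \tag{5.21}$$
where
$$y_j = \begin{cases} 1 & \text{if } x_j > 0 \\ 0 & \text{if } x_j = 0. \end{cases}$$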
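The fixed-charge formulation can be tried on a toy unit-commitment problem. The sketch below uses hypothetical data (three units, one demand level) and plain enumeration of the binary variables; the capacity limits, the demand constraint, and all names are assumptions added for illustration, not part of the text.

```python
from itertools import product

# Hypothetical data (not from the text): three thermal units.
FIXED_COST = [100.0, 60.0, 20.0]     # K_j, charge incurred if unit j runs
VARIABLE_COST = [2.0, 3.0, 5.0]      # C_j, cost per MW produced
CAPACITY = [50.0, 40.0, 30.0]        # assumed upper limit on x_j when y_j = 1
DEMAND = 60.0                        # assumed total output to be supplied


def dispatch(on):
    """Cheapest continuous dispatch x for a fixed on/off pattern, or None."""
    if sum(cap for cap, y in zip(CAPACITY, on) if y) < DEMAND:
        return None  # the committed units cannot cover the demand
    x = [0.0] * len(on)
    remaining = DEMAND
    # With linear variable costs, loading the cheapest committed units first is optimal.
    for j in sorted(range(len(on)), key=lambda j: VARIABLE_COST[j]):
        if on[j]:
            x[j] = min(CAPACITY[j], remaining)
            remaining -= x[j]
    return x


def solve_fixed_charge():
    """Enumerate all 2**n on/off patterns and keep the cheapest feasible one."""
    best = None
    for on in product((0, 1), repeat=len(FIXED_COST)):
        x = dispatch(on)
        if x is None:
            continue
        # Objective of equation 5.21: sum of C_j x_j + K_j y_j.
        cost = sum(c * xj + k * y
                   for c, xj, k, y in zip(VARIABLE_COST, x, FIXED_COST, on))
        if best is None or cost < best[0]:
            best = (cost, on, x)
    return best


if __name__ == "__main__":
    print(solve_fixed_charge())
```

With these numbers the cheapest plan commits units 1 and 3 rather than the two units with the lowest variable cost, because the second unit's fixed charge outweighs its cheaper energy. Enumeration is only practical for a handful of binary variables.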
Pure integer or mixed-integer programming problems pose a great computational challenge. While there exist highly efficient linear programming techniques to solve the basic LP problem at each possible combination of the discrete variables (nodes), the difficulty lies in the astronomically large number of combinations to be enumerated. If there are N binary variables, the total number of combinations is $2^N$. The simplest procedure one can think of for solving an integer or mixed-integer programming problem is to solve the linear programming relaxation of the problem (i.e., allowing the discrete variables to take continuous values so that the mixed-integer program reduces to an ordinary linear program) and then to round the noninteger values to the closest integer solution. There are, however, major pitfalls:

1. The resulting integer solution may not be feasible in the first place; and
2. even if the rounding leads to a feasible solution, it may, in fact, be far from the optimal solution.
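The first pitfall can be seen on a two-variable problem. The sketch below uses a hypothetical example (maximize $x_2$ subject to $-x_1 + x_2 \le 1/2$ and $x_1 + x_2 \le 7/2$, with nonnegative integer variables); the problem data and function names are assumptions chosen for illustration. Adding the two constraints gives $2x_2 \le 4$, so the LP relaxation attains its optimum where the two constraint lines intersect.

```python
from fractions import Fraction as F
from itertools import product
from math import ceil, floor

# Hypothetical problem: maximize x2
# subject to -x1 + x2 <= 1/2,  x1 + x2 <= 7/2,  x1, x2 >= 0 and integer.
CONSTRAINTS = [((-1, 1), F(1, 2)), ((1, 1), F(7, 2))]


def feasible(x):
    """True if x satisfies nonnegativity and both inequality constraints."""
    return all(v >= 0 for v in x) and all(
        a[0] * x[0] + a[1] * x[1] <= rhs for a, rhs in CONSTRAINTS
    )


def lp_relaxation_optimum():
    """Intersection of the two constraint lines (Cramer's rule)."""
    (a, b1), (c, b2) = CONSTRAINTS
    det = a[0] * c[1] - a[1] * c[0]
    x1 = (b1 * c[1] - a[1] * b2) / det
    x2 = (a[0] * b2 - b1 * c[0]) / det
    return (x1, x2)


def roundings(x):
    """All ways of rounding each coordinate down or up."""
    return set(product(*[(floor(v), ceil(v)) for v in x]))


def best_integer_solution(limit=4):
    """Brute-force search over the small integer grid 0..limit."""
    candidates = [x for x in product(range(limit + 1), repeat=2) if feasible(x)]
    return max(candidates, key=lambda x: x[1])


if __name__ == "__main__":
    lp = lp_relaxation_optimum()
    print("LP relaxation optimum:", lp)
    print("feasible roundings:", [x for x in roundings(lp) if feasible(x)])
    print("best integer solution:", best_integer_solution())
```

Here the relaxation optimum is $(3/2, 2)$, neither rounding of it is feasible, and the best integer solution has $x_2 = 1$, only half of the relaxed objective value.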
Algorithmic development for handling large-scale integer or mixed-integer programming problems continues to be an area of active research. There were exciting algorithmic advances during the middle and late 1980s. The most popular method to date has been the branch-and-bound technique and related ideas to implicitly enumerate the feasible integer solutions.

A. The Branch-and-Bound Technique for Binary Integer Programming Problems

The basic philosophy in the branch-and-bound procedure is to divide the overall problem into smaller and smaller subproblems and enumerate them in a logical sequence. The division procedure is called branching, and the subsequent enumeration is done by bounding, to check how good the best solution in the subset can be, and then discarding the subset if its bound indicates that it cannot possibly contain an optimal solution for the original problem. The general structure of the mixed-integer programming problem is:

Maximize
$$P(x) = \sum_{j=1}^{n} c_j x_j \tag{5.22}$$
Subject to the constraints:
$$\sum_{j=1}^{n} a_{ij} x_j \le b_i \quad \forall i \in \{1, \ldots, m\} \tag{5.23}$$
and
$$x_j \ge 0 \quad \forall j \in \{1, \ldots, n\} \tag{5.24}$$
and $x_j$ is an integer $\forall j \in \{1, \ldots, I\}$. Assume for simplicity of notation that the first I variables are the integer decision variables. Before a formal description of the branch-and-bound procedure is given, the basic procedures of branching, bounding, and fathoming are illustrated using a simple numerical example.

Illustrative Example

Consider the following integer programming problem.

Maximize
$$P = 9x_1 + 5x_2 + 6x_3 + 4x_4$$
Subject to:
$$
\begin{aligned}
6x_1 + 3x_2 + 5x_3 + 2x_4 &\le 10\\
x_3 + x_4 &\le 1\\
-x_1 + x_3 &\le 0\\
-x_2 + x_4 &\le 0
\end{aligned}
$$
and $x_j$ binary for $j = 1, \ldots, 4$.

This is a pure integer problem and, except for its small size, is typical of many practical decision-making problems.

Branching

Branching involves developing subproblems by fixing the binary variables at 0 or 1. For example, branching on $x_1$ for the example problem gives:

Subproblem 1. ($x_1 = 0$)

Maximize
$$P = 5x_2 + 6x_3 + 4x_4$$
Subject to:
$$
\begin{aligned}
3x_2 + 5x_3 + 2x_4 &\le 10\\
x_3 + x_4 &\le 1\\
x_3 &\le 0\\
-x_2 + x_4 &\le 0.
\end{aligned}
$$

Subproblem 2. ($x_1 = 1$)

Maximize
$$P = 9 + 5x_2 + 6x_3 + 4x_4$$
Subject to:
$$
\begin{aligned}
3x_2 + 5x_3 + 2x_4 &\le 4\\
x_3 + x_4 &\le 1\\
x_3 &\le 1\\
-x_2 + x_4 &\le 0.
\end{aligned}
$$

The procedure may be repeated at each of the two subproblem nodes by fixing additional variables such as $x_2$, $x_3$, $x_4$. Thus, a tree structure can be formed by adding branches at each iteration; this is referred to as the solution tree. The variable that is assigned values to perform the branching at any iteration is called the branching variable.

Bounding
For each of the subproblems, a bound can be obtained to determine how good its best feasible solution can be. Consider first the relaxed linear programming formulation for the overall problem, which yields the following solution.

$(x_1, x_2, x_3, x_4) = (5/6, 1, 0, 1)$ with $P = 16\tfrac{1}{2}$.

Therefore, $P \le 16.5$ for all feasible solutions of the original problem. This bound can be rounded down to 16, because all coefficients in the objective function are integers; hence, all integer solutions must have an integer value for P. The bound for the whole problem is $P \le 16$. In the same way, the bounds for the two subproblems are obtained:

Subproblem 1. $(x_1, x_2, x_3, x_4) = (0, 1, 0, 1)$ with $P = 9$.

Subproblem 2. $(x_1, x_2, x_3, x_4) = (1, 4/5, 0, 4/5)$ with $P = 16.2$.

Therefore, the resulting bounds are: Subproblem 1, $P \le 9$; Subproblem 2, $P \le 16$.

Fathoming

If a subproblem has a feasible solution, it should be stored as the first incumbent (the best feasible solution found so far) for the whole problem along with its value of P. This value is denoted $P^*$, which is the current incumbent for P. A subproblem is said to be fathomed, that is, dismissed from further consideration, if:

Test 1. its bound is less than or equal to $P^*$;
Test 2. its linear programming relaxation has no feasible solutions; or
Test 3. the optimal solution for its linear programming relaxation is integer; if this solution is better than the incumbent, it becomes the new incumbent, and Test 1 is reapplied to all unfathomed subproblems with the new larger $P^*$.

Optimality Test

The iterative procedure is halted when there are no remaining subproblems. At this stage, the current incumbent for P is the optimal solution. Otherwise, we return to perform one more iteration. The solution tree of the current example is provided in Figure 5.3. The markings F(1), F(2), and F(3) on Figure 5.3 indicate that the node has been fathomed by Tests 1, 2, and 3. For the general branch-and-bound approach in mixed-integer programming problems, some deviations are necessary to improve the efficiency of the algorithm. These include:
FIGURE 5.3 Solution tree diagram for the integer programming problem (branching variables $x_1$, $x_2$, $x_3$, $x_4$; optimum $P^* = 14$ with $x = [1, 1, 0, 0]$).
1. The choice of branching variable. The variables that have a noninteger solution in the LP relaxation are selected for branching.
2. Values assigned to the branching variable for creating subproblems. Create just two new subproblems by specifying two ranges of values for the variable.
3. Bounding step. The bound of P is the optimal value of P itself (without rounding) in the linear programming relaxation.
4. Fathoming test. Only the integer decision variables need to be checked for integer solution to decide the fathoming node.

The branch-and-bound procedure is summarized in the following important steps.

Step 1. Initialization
Set $P^* = -\infty$. Apply the bounding step, fathoming step, and optimality test described below to the whole problem. If not fathomed, classify this problem as the one "remaining" subproblem for performing the first full iteration below.

Step 2. Branching
Among the unfathomed subproblems, select the one that was created most recently (breaking ties according to which has the larger bound). Among the integer-restricted variables that have a noninteger value in the optimal solution for the LP relaxation of the subproblem, choose the first one in the natural ordering of the variables to be the branching variable. Let $x_j$ be this variable and $x_j^*$ its value in this solution. Branch from the node for the subproblem to create two new subproblems by adding the respective constraints, $x_j \le [x_j^*]$ and $x_j \ge [x_j^*] + 1$, where $[x_j^*]$ = greatest integer $\le x_j^*$.

Step 3. Bounding
For each new subproblem, obtain its bound by applying the simplex method to its LP relaxation and using the value of P for the resulting optimal solution.

Step 4. Fathoming and Optimality Test
For each new subproblem, apply the three fathoming tests and discard those subproblems that are fathomed by any of the tests.

These are the fundamental steps in the branch-and-bound technique, which is applicable to a wide range of mixed-integer programming problems. For each subproblem that is created, the LP algorithm can be applied to the constrained problem in its pure linear form. We now turn our attention to sensitivity methods in linear programming.
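The branching, bounding, and fathoming steps can be sketched in a few lines for the binary illustrative example. To stay self-contained, the sketch below does not solve an LP relaxation at each node; as a labeled simplification it uses a weaker optimistic bound (the value of the fixed variables plus every positive coefficient of the free variables) and a simple feasibility screen. The first constraint coefficients are those implied by the two subproblems of the example; all function names are hypothetical.

```python
# Data of the illustrative example (binary variables x1..x4).
C = [9, 5, 6, 4]
A = [
    [6, 3, 5, 2],    # 6x1 + 3x2 + 5x3 + 2x4 <= 10
    [0, 0, 1, 1],    #  x3 + x4 <= 1
    [-1, 0, 1, 0],   # -x1 + x3 <= 0
    [0, -1, 0, 1],   # -x2 + x4 <= 0
]
B = [10, 1, 0, 0]


def can_be_feasible(fixed):
    """False if some constraint fails however the free variables are set."""
    k = len(fixed)
    for row, rhs in zip(A, B):
        lhs = sum(a * x for a, x in zip(row, fixed))
        lhs += sum(min(a, 0) for a in row[k:])  # most favourable completion
        if lhs > rhs:
            return False
    return True


def optimistic_bound(fixed):
    """Upper bound on P (weaker than the LP-relaxation bound of the text)."""
    k = len(fixed)
    return sum(c * x for c, x in zip(C, fixed)) + sum(max(c, 0) for c in C[k:])


def branch_and_bound():
    best_value = float("-inf")   # Step 1: P* = -infinity
    best_x = None
    nodes = 0
    stack = [[]]                 # a node is the list of values fixed so far
    while stack:                 # optimality test: stop when nothing remains
        fixed = stack.pop()      # most recently created subproblem first
        nodes += 1
        if not can_be_feasible(fixed):
            continue             # fathomed: no feasible completion (cf. Test 2)
        if optimistic_bound(fixed) <= best_value:
            continue             # fathomed: bound <= P* (Test 1)
        if len(fixed) == len(C):
            best_value = optimistic_bound(fixed)  # all fixed: exact value
            best_x = fixed       # new incumbent (cf. Test 3)
            continue
        stack.append(fixed + [0])
        stack.append(fixed + [1])  # the x = 1 branch is explored first
    return best_value, best_x, nodes


if __name__ == "__main__":
    print(branch_and_bound())
```

The search returns $P^* = 14$ at $x = [1, 1, 0, 0]$ while visiting fewer nodes than the 31 of the full binary tree. Replacing `optimistic_bound` with the optimal value of an LP relaxation would give the tighter bounds used in the text.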
VI. SENSITIVITY METHODS FOR POSTOPTIMIZATION IN LINEAR PROGRAMMING

In many applications, both nonpower and power system types, we often encounter practical problems. In one instance, we seek the optimal solution and in another, we wish to know what happens when one or more variables are changed. In order to save computational effort, it is desirable not to re-solve the problem if small perturbations are made to the variables. Sensitivity analysis is the study used to compute such solutions. Now, recall the linear programming problem given by the following model as shown in equations 5.10 to 5.11:

Maximize
$$P(x) = \sum_{j=1}^{n} c_j x_j \tag{5.25}$$
Subject to the constraints:
$$\sum_{j=1}^{n} a_{ij} x_j \le b_i \tag{5.26}$$
and
$$x_j \ge 0 \tag{5.27}$$
$\forall i \in \{1, \ldots, m\}$ and $\forall j \in \{1, \ldots, n\}$.
Here, we observe that changes in the system can be attributed to modifications such as

1. Perturbation in the parameters, $b_i$
2. Perturbation in the cost coefficients, $c_j$
3. Perturbation in the coefficient $a_{ij}$
4. Injection of new constraints
5. Injection of new variables.
We discuss the effect of each case as it applies to sensitivity analysis and further expound on the first class in more detail in a subsequent section.

Case a. Perturbation in the Parameters $b_i$

Let the optimal basic solution for the problem in its primal form be $x_B$ (equation 5.28). Since the nonbasic variables are zero, we can write:
$$B x_B = b \tag{5.29}$$
$$x_B = B^{-1} b \tag{5.30}$$
where B is an m-square and nonsingular matrix formed by m columns of A. Let b change to $b + \Delta b$, where $\Delta b = [\Delta b_1, \Delta b_2, \ldots, \Delta b_m]^T$, and everything else in the problem remains the same. Then
$$x_B + \Delta x_B = B^{-1}(b + \Delta b), \tag{5.31}$$
giving the new values $x_B + \Delta x_B$ of the variables that were the original optimal basic variables. If $B^{-1}(b + \Delta b) \ge 0$, then the variables continue to be basic feasible. They would also continue to be optimal if the relative cost coefficients of equation 5.32 continued to be nonnegative (these depend only on the cost and constraint coefficients, not on b).

The optimum value of P(x) can then be calculated with the new values of the variables given by $x_B + \Delta x_B$, or by using the following equation,
$$P(x) = C_B B^{-1}(b + \Delta b). \tag{5.33}$$
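As a quick numerical check of equation 5.31, consider a two-constraint case with hypothetical values of $B^{-1}$, $b$, and $\Delta b$ chosen here for illustration.

$$
B^{-1} = \frac{1}{2}\begin{bmatrix} 1 & -1 \\ 1 & 1 \end{bmatrix}, \quad
b = \begin{bmatrix} 3 \\ 1 \end{bmatrix}, \quad
x_B = B^{-1} b = \begin{bmatrix} 1 \\ 2 \end{bmatrix}.
$$

For $\Delta b = [-1, 0]^T$,

$$
x_B + \Delta x_B = B^{-1}(b + \Delta b) = \frac{1}{2}\begin{bmatrix} 1 & -1 \\ 1 & 1 \end{bmatrix}\begin{bmatrix} 2 \\ 1 \end{bmatrix} = \begin{bmatrix} 0.5 \\ 1.5 \end{bmatrix} \ge 0,
$$

so the basis remains feasible. For $\Delta b = [-3, 0]^T$, by contrast, $B^{-1}(b + \Delta b) = [-0.5, 0.5]^T$ has a negative component, and the original basis is no longer feasible.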
Case b. Perturbation in the Cost Coefficients $c_j$

If $C_j$ are changed to $C_j'$, everything else in the problem remaining the same, the relative cost coefficients are given by equation (5.34). These may not all be nonnegative; for some j, some of them may be negative. This would mean that the basic feasible solution that was optimal for $C_j$ is not optimal for $C_j'$. So from this point onwards further iterations may be done with the new values $C_j'$ to obtain a new optimal solution. If, however, $C_j'$ are such that $Z = C_B' B^{-1}\bar{B} \ge 0$, then the original optimal basis still remains optimal, and the values of the optimal basic variables also remain unchanged. The optimum value of P(x) is given by
$$P(x) = C_B' B^{-1} b. \tag{5.35}$$

Case c. Perturbation in the Coefficient $a_{ij}$

If the changes are in $a_{ik}$, where $x_k$ is a nonbasic variable of the optimal solution, then we get equation (5.36), where $\bar{B}_{(ik)}$ means that the value of $a_{ik}$ in $\bar{B}$ is changed. If $C_B B^{-1}\bar{B}_{(ik)} \ge 0$, then the original optimal basis still remains optimal. If not, further iterations with the new values of $C_B B^{-1}\bar{B}_{(ik)}$ and $a_{ik}$ may be done.

Case d. Injection of New Constraints

Generally, if the original optimal solution satisfies the new constraints that are added to the system, then that solution will still be an optimal solution. However, if some of the injected constraints are violated by the original optimal solution, then the problem must be solved taking into account the new constraints. The new initial point may consist of the old basic variables of the original optimal solution along with one additional basic variable associated with each added constraint.

Case e. Injection of New Variables

Since the number of constraints remains the same, the number of basic variables remains the same. Therefore, the original optimal solution along with zero values of the new variables would result in a basic feasible solution for the new problem. That solution would remain optimal if the relative cost coefficients corresponding to the newly introduced variables are nonnegative.
A. Sensitivity Analysis Solution Technique for Changes in Parameters $b_i$

We consider the well-known linear programming problem defined by
$$\min\{P = c^T x : \text{s.t. } A x = b;\; x \ge 0\}, \tag{5.37}$$
where c and x are n-vectors and b is an m-vector. It is assumed that for a given b, the LP problem has been solved with an optimal basis that yields an optimal solution, which may be nondegenerate or degenerate. Then the vector b is subject to change with an increment $\Delta b$. The postoptimization problem is to further optimize P with respect to $\Delta b$ under the condition that the optimal basis remain unchanged. Now let B and $X_B$ be the optimal basis and the associated optimal solution of equation 5.37. Then it is clear from equation 5.37 that both P and $X_B$ change as b does. We define the rate of change of P with respect to b as the sensitivity, denoted by
$$S_b = \frac{\partial P}{\partial b} \quad \text{(a row } m\text{-vector)}. \tag{5.38}$$
Thus, the new objective function becomes
$$P + \Delta P = P + \frac{\partial P}{\partial b}\,\Delta b, \tag{5.39}$$
so that
$$\Delta P = S_b\, \Delta b. \tag{5.40}$$
The new optimal solution is
$$X_B^* = B^{-1}(b + \Delta b) = X_B + H \Delta b \ge 0, \tag{5.41}$$
where $H = B^{-1} = [h_{ij}]$. In practical applications, only some components of b are subject to change and the changes are usually bounded. If J is the index set that contains j for which $b_j$ changes with increment $\Delta b_j$, then
$$f_j \le \Delta b_j \le g_j \quad \text{for } j \in J. \tag{5.42}$$
In order to ensure that $\Delta b_j = 0$ is feasible, we impose the condition:
$$f_j \le 0 \le g_j \quad \text{for } j \in J. \tag{5.43}$$
This trivial condition does not affect the practicality of the postoptimization, yet it guarantees a solution. It can be shown that the sensitivity $S_b$ is the dual solution of equation 5.37. For this reason, we use the conventional notation
$$S_b = y^T, \tag{5.44}$$
where y is the dual solution and is a column vector. The problem is to minimize $\Delta P$ with respect to $\Delta b$ by keeping B unchanged. That is,

Minimize
$$\Delta P = y^T \Delta b = \sum_{j} y_j \Delta b_j \tag{5.45}$$
Subject to:
$$X_{Bi} + \sum_{j} h_{ij} \Delta b_j \ge 0 \tag{5.46}$$
and
$$f_j \le \Delta b_j \le g_j, \tag{5.47}$$
where $j \in J$ and $i = 1, 2, \ldots, m$.
Solution Methodology

We wish to utilize the solution of equation 5.37 to find the sensitivity. Let $C_B$ be associated with the optimal basis B and solution $X_B$. That is,
$$P = C_B^T X_B \tag{5.48}$$
and
$$B X_B = b. \tag{5.49}$$
A new vector y is defined here to satisfy
$$B^T y = C_B. \tag{5.50}$$
Such an m-vector y is unique since B is nonsingular. In view of equations 5.49 and 5.50, we know that both P and $X_B$ change if the vector b changes, in order to maintain optimality. But B may or may not change, owing to the insensitive nature of B to b. We consider here the case for which B remains unchanged when b changes. Thus the vector y of equation 5.50 is constant when b changes. It follows from equations 5.48 through 5.50 that
$$\frac{\partial P}{\partial b} = \frac{\partial (C_B^T X_B)}{\partial b} = y^T \frac{\partial (B X_B)}{\partial b} = y^T \frac{\partial b}{\partial b} = y^T I = y^T. \tag{5.51}$$
This result shows that the sensitivity is indeed the y that satisfies equation 5.50. Importantly, it is a constant and, hence, $\Delta P$ as given by equation 5.40 is exact, without any approximation. Since y is also the dual solution of equation 5.37 for the nondegenerate case, it can be obtained together with B. We do not impose the condition of nondegeneracy but only require an optimal basis B, which may yield a degenerate ($X_B \ge 0$) or nondegenerate ($X_B > 0$) solution. In any case, y can be found from equation 5.50 by means of $B^{-1}$, which is also needed in the algorithm. It is intended to solve the problem by changing one component of $\Delta b$ at a time. By looking at the sign of $y_j$, one may choose a feasible $\Delta b_j$ in such a way that
$$y_j \Delta b_j \le 0, \tag{5.52}$$
which is equal to zero only when $\Delta b_j = 0$.
A process is employed to change $\Delta b_j$ step by step with the resetting of necessary quantities. In the process, $j \in J$ advances from the first one to the last and then back to the first, and so on, until all $\Delta b_j = 0$ (steady state). Since $\Delta P$ is decreasing from one step to another unless $\Delta b_j = 0$, the steady state is reachable if a solution exists. The method can be implemented by using a series of logical decisions. For $\Delta b_j$ with $j \in J$, equation 5.46 requires that $h_{ij}\Delta b_j \ge -X_{Bi}$, where $i = 1, 2, \ldots, m$ and $h_{ij}$ is the element in the ith row and jth column of the matrix H. The above inequality can be fulfilled by
$$L_j \le \Delta b_j \le R_j. \tag{5.53}$$
The bounds are to be determined as follows.
$$L_j = \max[-X_{Bi}/h_{ij};\; h_{ij} > 0], \qquad L_j = -\infty \text{ if all } h_{ij} \le 0 \tag{5.54}$$
$$R_j = \min[-X_{Bi}/h_{ij};\; h_{ij} < 0], \qquad R_j = \infty \text{ if all } h_{ij} \ge 0, \tag{5.55}$$
where the maximization and minimization are taken over $i = 1, 2, \ldots, m$. (Note that $L_j \le 0$ and $R_j \ge 0$ are always true, as evidenced by $X_B \ge 0$.) Chosen according to equation 5.53, $\Delta b_j$ is feasible for the constraint equation 5.41. To satisfy equation 5.42 and make $y_j \Delta b_j \le 0$ as small as possible, one must select $\Delta b_j$ as follows.
$$\Delta b_j = 0 \quad \text{if } y_j = 0 \tag{5.56}$$
$$\Delta b_j = \max[f_j, L_j] \quad \text{if } y_j > 0 \tag{5.57}$$
$$\Delta b_j = \min[g_j, R_j] \quad \text{if } y_j < 0. \tag{5.58}$$
Note that the maximal selection is never positive and the minimal one is never negative. The steady state may be replaced by the simpler expression $\Delta P = 0$. They are equivalent because
$$\Delta P = \sum_{j \in J} y_j \Delta b_j = 0 \quad \text{with every term } y_j \Delta b_j \le 0, \tag{5.59}$$
which implies that
$$y_j \Delta b_j = 0 \quad \text{for all } j \in J. \tag{5.60}$$
This is true only when $\Delta b_j = 0$ for all $j \in J$, due to the selection rule that $\Delta b_j = 0$ when $y_j = 0$.
Implementation Algorithm

The algorithm for sensitivity analysis with changes in the parameters $b_i$ can be implemented by using the following steps.

Step 1. Obtain B, $X_B$, $C_B$, and b from the linear programming problem, equation 5.37.
Step 2. Calculate $P = C_B^T X_B$, $H = B^{-1} = [h_{ij}]$, and then $y = H^T C_B$.
Step 3. Identify J and $f_j$, $g_j$ from equation 5.47 of the problem.
Step 4. Set $\Delta P = 0$.
Step 5. Do the following Steps 6 and 7 for $j \in J$ and $i = 1, 2, \ldots, m$:
   5.1 If $y_j = 0$, then set $\Delta b_j = 0$.
   5.2 If $y_j > 0$, then set $L_j = \max[-X_{Bi}/h_{ij};\; h_{ij} > 0]$ ($= -\infty$ if all $h_{ij} \le 0$) and set $\Delta b_j = \max[f_j, L_j]$.
   5.3 If $y_j < 0$, then set $R_j = \min[-X_{Bi}/h_{ij};\; h_{ij} < 0]$ ($= \infty$ if all $h_{ij} \ge 0$) and set $\Delta b_j = \min[g_j, R_j]$.
Step 6. Update $\Delta P = \Delta P + y_j \Delta b_j$, $b_j = b_j + \Delta b_j$, $f_j = f_j - \Delta b_j$, $g_j = g_j - \Delta b_j$.
Step 7. Update $X_{Bi} = X_{Bi} + h_{ij}\Delta b_j$ for all $i = 1, 2, \ldots, m$.
Step 8. Update $P = P + \Delta P$.
Step 9. Go to Step 10 if $\Delta P = 0$; go back to Step 4 otherwise.
Step 10. Stop with $X_B$, b, and P as a solution for the postoptimization as formulated.

Finally, the updates performed in Steps 6 and 7 are easy to understand except for the intervals. The new intervals must be shifted by the amount of $\Delta b_j$ to the left if $\Delta b_j > 0$ and to the right if $\Delta b_j < 0$. The update in Steps 6 and 7 may be skipped if $\Delta b_j = 0$.

Illustrative Example

Minimize
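$$P = X_1 + X_2 + X_3$$
subject to the constraints of the primal problem, whose right-hand side is $b = [3, 1]^T$.

It is required to further minimize P for $-1 \le \Delta b_1 \le 1$ and $0 \le \Delta b_2 \le 2$ after a minimized solution has been achieved for the problem. This is a problem of postoptimization and can be solved as follows.

Solution

Step 1. From the primal problem, the optimal basis gives $X_B = [1, 2]^T$, $C_B = [1, 0]^T$, and $b = [3, 1]^T$.

Step 2. $P = X_3 = 1$,
$$H = \begin{bmatrix} 1 & 1 \\ -1 & 1 \end{bmatrix}^{-1} = \frac{1}{2}\begin{bmatrix} 1 & -1 \\ 1 & 1 \end{bmatrix}, \qquad
y = H^T C_B = \frac{1}{2}\begin{bmatrix} 1 & 1 \\ -1 & 1 \end{bmatrix}\begin{bmatrix} 1 \\ 0 \end{bmatrix} = \frac{1}{2}\begin{bmatrix} 1 \\ -1 \end{bmatrix}.$$

Step 3. $J = \{1, 2\}$, $f_1 = -1$, $g_1 = 1$, $f_2 = 0$, $g_2 = 2$.

Step 4. $\Delta P = 0$.

Step 5. With $j = 1$ and $y_1 = 0.5 > 0$, Step 5.2 gives $L_1 = \max[-2, -4] = -2$ and
$\Delta b_1 = \max[-1, -2] = -1$. Then
$\Delta P = 0 + (\tfrac{1}{2})(-1) = -\tfrac{1}{2}$,
$b_1 = 3 + (-1) = 2$,
$f_1 = -1 - (-1) = 0$,
$g_1 = 1 - (-1) = 2$,
and the basic variables become
$X_{B1} = 1 + 0.5(-1) = 0.5$,
$X_{B2} = 2 + 0.5(-1) = 1.5$.

Step 6. With $j = 2$: $y_2 = -0.5$, $R_2 = \min[0.5/0.5] = 1$, and $\Delta b_2 = \min[2, 1] = 1$.

Step 7. $\Delta P = -0.5 + (-0.5)(1) = -1$,
$b_2 = 1 + 1 = 2$,
$g_2 = 2 - 1 = 1$,
$f_2 = 0 - 1 = -1$.

Step 8. $X_{B1} = 0.5 + (-0.5)(1) = 0$,
$X_{B2} = 1.5 + (0.5)(1) = 2$.

Step 9. $P = 1 + (-1) = 0$.

Step 10. $\Delta P = -1 \ne 0$, so another pass is made.
   10.1 $\Delta P = 0$.
   10.2 $j = 1$, $y_1 = \tfrac{1}{2}$: $\Delta b_1 = \max[0, 0] = 0$, no update.
   10.3 $j = 2$, $y_2 = -\tfrac{1}{2}$: $\Delta b_2 = \min[1, 0] = 0$, no update.
   Therefore, $\Delta P = 0$.

Step 11. The optimal solution to the problem is:
$$X_B = \begin{bmatrix} 0 \\ 2 \end{bmatrix}, \quad b = \begin{bmatrix} 2 \\ 2 \end{bmatrix}, \quad \text{and } P = 0.$$

Notably, if we change $b_2$ first and then $b_1$, that is, $J = \{2, 1\}$, the solution becomes
$$X_B = \begin{bmatrix} 0 \\ 3 \end{bmatrix}, \quad b = \begin{bmatrix} 3 \\ 3 \end{bmatrix}, \quad \text{and } P = 0.$$

In fact, the problem has an infinite number of solutions:
$$X_B = \begin{bmatrix} 0 \\ \beta \end{bmatrix}, \quad b = \begin{bmatrix} \beta \\ \beta \end{bmatrix}, \quad \text{and } P = 0,$$
for any $\beta$ with $2 \le \beta \le 3$.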
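The postoptimization steps can be sketched in a short routine. The version below is a minimal sketch, not the book's own code: it uses 0-based indices, plain floating-point comparisons (a practical implementation would need tolerances), and a hypothetical function name. The data are the values $H$, $X_B$, $y$, and $b$ that are consistent with the worked steps of the example.

```python
def postoptimize(H, x_b, y, b, bounds, order):
    """Change b one component at a time so that P decreases and x_B stays >= 0.

    H      : inverse of the optimal basis, as a list of rows
    x_b    : optimal basic solution
    y      : dual solution (sensitivity of P with respect to b)
    b      : right-hand side
    bounds : {j: (f_j, g_j)} with f_j <= 0 <= g_j
    order  : sequence in which the indices j in J are visited
    Returns the updated x_b, the updated b, and the total change in P.
    """
    x_b, b = list(x_b), list(b)
    bounds = dict(bounds)
    m = len(x_b)
    total = 0.0
    while True:
        delta_p = 0.0                                   # Step 4
        for j in order:                                 # Step 5
            f, g = bounds[j]
            if y[j] > 0:                                # 5.2: move b_j down
                ratios = [-x_b[i] / H[i][j] for i in range(m) if H[i][j] > 0]
                step = max([f] + ratios)                # max[f_j, L_j]
            elif y[j] < 0:                              # 5.3: move b_j up
                ratios = [-x_b[i] / H[i][j] for i in range(m) if H[i][j] < 0]
                step = min([g] + ratios)                # min[g_j, R_j]
            else:                                       # 5.1
                step = 0.0
            if step == 0:
                continue                                # nothing to update
            delta_p += y[j] * step                      # Step 6
            b[j] += step
            bounds[j] = (f - step, g - step)            # shift the interval
            for i in range(m):                          # Step 7
                x_b[i] += H[i][j] * step
        total += delta_p                                # Step 8
        if delta_p == 0:                                # Step 9
            return x_b, b, total                        # Step 10


# Data consistent with the worked example (indices shifted to start at 0).
H = [[0.5, -0.5], [0.5, 0.5]]
X_B = [1.0, 2.0]
Y = [0.5, -0.5]
B_RHS = [3.0, 1.0]
BOUNDS = {0: (-1.0, 1.0), 1: (0.0, 2.0)}

if __name__ == "__main__":
    print(postoptimize(H, X_B, Y, B_RHS, BOUNDS, order=[0, 1]))
    print(postoptimize(H, X_B, Y, B_RHS, BOUNDS, order=[1, 0]))
```

Visiting $b_1$ first ends at $X_B = [0, 2]^T$ and $b = [2, 2]^T$; visiting $b_2$ first ends at $X_B = [0, 3]^T$ and $b = [3, 3]^T$. Both give $\Delta P = -1$, that is, $P = 0$, which illustrates that the minimizer is not unique.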
Duality in Post-Optimal Analysis

The dual of the primal linear programming problem given by equations 5.12 to 5.14 can be restated in the form:
$$\max\{D = b^T y \;\; \text{s.t. } A^T y \le C;\; y \ge 0\}. \tag{5.61}$$
Let the vectors in A and components in C be arranged as
$$A = [B, N] \tag{5.62}$$
and
$$C = \begin{bmatrix} C_B \\ C_N \end{bmatrix}. \tag{5.63}$$
Then, it follows that (5.64) and (5.65).

Introduce a new m-vector u such that
$$B u = a_k, \tag{5.66}$$
where $a_k$ is a vector in N. The vector y in equation 5.62 is feasible for the dual problem if
$$N^T y \le C_N. \tag{5.67}$$
This is true because (5.68).

To show equation 5.69, we assume the contrary to have

Multiply equation 5.67 by a positive number $\theta$ and then subtract it from equation 5.65:

We obtain from equations 5.66 and 5.67 that

and hence

Multiply equation 5.72 by $\theta$ and then subtract it from equation 5.66:

Choose (5.73), which always exists since $X_B > 0$ (nondegenerate). With $\theta$ so chosen, equation 5.71 demonstrates that a new feasible basis is formed by replacing $a_r$ in B with $a_k$ in N. On the other hand, equation 5.72 indicates that the new P is decreased from the old value by

This contradicts the fact that B is a minimal basis. For any feasible X of equation 5.10 and its dual y, we have
$$P = C^T X \ge (A^T y)^T X = y^T A X = y^T b = D. \tag{5.75}$$
But, for $X = X_B$, it becomes
$$P = C_B^T X_B = (B^T y)^T X_B = y^T B X_B = y^T b = D. \tag{5.76, 5.77}$$
Therefore y is also a maximum of the dual problem.
VII. POWER SYSTEMS APPLICATIONS

Consider a subtransmission system in which certain bus voltages of interest form the vector $|V|$. Assume that the increase in bus voltage magnitude is linearly proportional to injected reactive power at several buses, denoted by the vector $\Delta Q$. The dimensions of the vectors $|V|$ and $\Delta Q$ may not be the same, since the capacitor placement may occur at a different number of buses compared to the buses at which $|V|$ is supported. Then
$$\Delta|V| = B\, \Delta Q, \tag{5.78}$$
where B involves elements from the inverse of the $\partial Q / \partial |V|$ portion of the Jacobian (under the assumptions of superposition and decoupled power flow). To ensure high enough bus voltage,
$$|V| + \Delta|V| \ge |V_{\min}|, \tag{5.79}$$
where $|V_{\min}|$ is a vector of minimum bus voltage magnitudes and $|V|$ is the "base case" (i.e., no capacitive compensation) bus voltage profile. The $\Delta|V|$ term stems from capacitive compensation. As a result
$$B\, \Delta Q \ge |V_{\min}| - |V|. \tag{5.80}$$
Again, the vector inequality in equation 5.80 is said to hold when each scalar row holds. Furthermore, it is desired to minimize $c_q$,
$$c_q = c^t \Delta Q. \tag{5.81}$$
$c^t$ is a row vector of 1s of commensurate dimension with $\Delta Q$. The cost function, $c_q$, is a scalar. The minimization of $c_q$ subject to equation 5.80 is accomplished by linear programming. In most linear programming formulations, the inequality constraints are written with the solution vector appearing on the "smaller than" side (i.e., opposite to the inequality in equation 5.80), and the index that is extremized is maximized rather than minimized. Both problems are avoided by working with $\Delta Q'$ rather than $\Delta Q$, where
$$\Delta Q' = K - \Delta Q. \tag{5.82}$$
In this discussion, $\Delta Q$ entries are assumed to be positive for shunt capacitive compensation.
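The capacitor placement problem can be tried on a very small case. The sketch below uses hypothetical two-bus data (the sensitivity matrix, the base-case voltages, and the limits are all made up for illustration) and solves the minimization form directly by checking every vertex of the feasible region, which is only workable for two variables; it does not use the $\Delta Q'$ substitution.

```python
from fractions import Fraction as F
from itertools import combinations

# Hypothetical two-bus data (not from the text).
# Rows of B: voltage rise at a monitored bus per unit of injected reactive power.
B = [[F(2, 100), F(1, 100)],
     [F(1, 100), F(3, 100)]]
V_BASE = [F(97, 100), F(96, 100)]   # base-case |V|
V_MIN = [F(1), F(1)]                # required |V_min|
COST = [1, 1]                       # row vector of 1s


def solve_capacitor_lp():
    """Minimise COST . dQ subject to B dQ >= V_MIN - V_BASE and dQ >= 0.

    Brute-force sketch for two variables: every vertex is the intersection
    of two constraint boundaries, so all pairs of boundaries are tried.
    """
    deficit = [vmin - v for vmin, v in zip(V_MIN, V_BASE)]
    # Each row means a1*q1 + a2*q2 >= rhs; the last two are q1 >= 0, q2 >= 0.
    rows = [(B[0][0], B[0][1], deficit[0]),
            (B[1][0], B[1][1], deficit[1]),
            (F(1), F(0), F(0)),
            (F(0), F(1), F(0))]
    best = None
    for (a1, a2, r1), (c1, c2, r2) in combinations(rows, 2):
        det = a1 * c2 - a2 * c1
        if det == 0:
            continue                      # parallel boundaries, no vertex
        q = ((r1 * c2 - a2 * r2) / det, (a1 * r2 - r1 * c1) / det)
        if all(p * q[0] + s * q[1] >= r for p, s, r in rows):
            cost = COST[0] * q[0] + COST[1] * q[1]
            if best is None or cost < best[0]:
                best = (cost, q)
    return best


if __name__ == "__main__":
    print(solve_capacitor_lp())
```

For these numbers the cheapest compensation injects one unit of reactive power at each bus, which raises both voltages exactly to the minimum. Unequal entries in `COST` would model capacitor costs that differ between buses.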
It is possible also to introduce capacitor costs, which depend on the size of the unit. This is done by allowing other than unity weighting in $c^t$. Upper limits may also be introduced, but these are usually not needed.

VIII. ILLUSTRATIVE EXAMPLES

Example 1
A subsystem has two generators. There are four key lines with limits given by PFFx = 12MWPFy
= 12MW
The sensitivity relation between the key lines and the generators is given below:

The system benefit function is $F = P_{G1} + 2P_{G2}$. Find the maximum benefit value of the system using linear programming.

Solution

If we change the variables so that $P_{G1} \to x_1$ and $P_{G2} \to x_2$, then the problem becomes

Maximize
$$F = x_1 + 2x_2$$
Subject to:

Change the inequality constraints into equalities by adding a slack variable to each inequality constraint. Now the problem will be

Maximize
$$F = x_1 + 2x_2$$
Subject to:
3 2 1 0 1 0 0 6 1 1 0 0 1 0
First Iteration

$[B \mid \bar{B}]$

$k = 2$ implies that the second vector of $\bar{B}$, $x_2$, is to enter the basis.
e=
(y }
min - , y i
> 0 =min (12 - 12 18 181 3 9 2 ’ 1 ’ 6 =3,
Y = 4, 8 = 3; therefore the fourth vector basis after replacement X 2 and x6 is
B=
x3
x4
x5
1 0 0 0
0 1 0 0
0 0 1 0
x2 3 2 1 6
Components of E ” , r = 4, y, = 6, Yi
--b4j, Yr
is to leave the basis. The new
x1
x6
2
0
6 1
0 1
CB = [l 01.
CB = [O 0 0 21 XB = [ 6 , 6 , 15, 3IT
b, = b,
x6
i
#4 i=4
1 0.0 0.0 -0.5 0 0 1
0
0
6
Second Iteration 1 0 0.0 -0.5
Z-C,=[l
3
0 1
0
- 51
3 0
0 0
1 - 51
6 0
0
0
;
L
0 0 23
z=CBB%=[o
014-3
'1-[l 3
'2 0'
0
4 3.
Then x1 is to enter the basis 1 0 0.0 -0.5
y = B"& =
;t
8=min -,yi
0 1
0
- 31
0 0
1
- g1
0 0
0
6
I
> O =min(!j
2
18) = 1,
Y = 2, 8 = 1. Therefore, the second column (X4) is to leave the basis. x1 replaces x4,yr = y 2 = 6. The new basis is:
The components of $B^{-1}$ are
1 2 0 3
: : I] .-[g
B = [ :0 1 0 1
C,=[O
;]
0 0
1 0 21
Third Iteration 0 0 Z=CJ?B=[O
1 0 21 36
which means that we have reached the optimum point,
$$\max F = C_B x_B = [0 \;\; 1 \;\; 0 \;\; 2]\, x_B = \frac{20}{3},$$
with $x_1 = 1$ and $x_2 = 17/6$. As a result, $P_{G1} = 1$ and $P_{G2} = 17/6$.
IX. CONCLUSION

This chapter covered linear programming, one of the most famous optimization techniques for linear objectives and linear constraints. In Section I the formulation of the natural model associated with the basic assumptions was presented and a graphical solution of the LP problem demonstrated. In Section II the simplex algorithm was presented and supported with illustrative problems, together with a summary of the computational steps involved in the algorithm. The matrix approach to the LP problem was presented in Section III, where the formulation of the problem and the revised simplex algorithm were also given. Duality in linear programming was presented in Section IV. For those cases where the variables take either integer or continuous values, mixed-integer programming was presented in Section V as a way of solving such problems, and the branch-and-bound technique for solving this problem was explained. In Section VI the sensitivity method for postoptimization analysis of LP was presented, supported with a method of solution and detailed algorithms. In Section VII a power system application was presented in which the voltage profile is improved using reactive power resources installed in the system. The construction of the method and the solution technique were explained.

X. PROBLEM SET
Problem 5.1
Given the objective function

subject to the constraints

find the point (x) by obtaining the candidate points, that is, by finding all possible solutions to the boundary equations implied by the constraints and testing them against the domain of feasibility.

Problem 5.2
Given the objective function

subject to the constraints

find the point (x) that maximizes f. Determine (x) to two decimal places.

Problem 5.3
Solve the following linear programming problem.

Maximize
$$z = 4x_1 + 6x_2 + 2x_3$$
Subject to:

$x_1$, $x_2$, $x_3$ are nonnegative integers. Compare the rounded optimal solution and the integer optimal solution.

Problem 5.4
Convert the following problem to standard form and solve.
Maximize
$$x_1 + 4x_2 + x_3$$
Subject to:
$$
\begin{aligned}
2x_1 - 2x_2 + x_3 &= 4\\
x_1 - x_3 &= 1\\
x_2 \ge 0,\; x_3 &\ge 3.
\end{aligned}
$$

Problem 5.5
Consider the problem:

Maximize
$$z = x_1 + x_2$$
Subject to:

where $x_1$, $x_2$ are nonnegative integers. Find the optimal noninteger solution graphically. By using integer programming, show graphically the successive parallel changes in the value that will lead to the optimal integer solution.

Problem 5.6
Consider a power system with two generators with cost functions given by:

Generator 1: $F_1(P_1) = 80 + 7.2P_1 + 0.00107P_1^2$ ($)
Generator 2: $F_2(P_2) = 119 + 7.2P_2 + 0.00072P_2^2$ ($)

where generators 1 and 2 are limited to producing 400 MW and 600 MW of power, respectively. Given that the system load is 500 MW, then:

1. Formulate the problem into linear programming form.
2. Calculate the optimal generation.
3. Determine the optimal generation cost.

Problem 5.7
Consider the problem:
Maximize
$$z = x_1 + x_2$$
Subject to:

where $x_1$, $x_2$ are nonnegative integers.

Problem 5.8
Consider the following problem.
Subject to:
$$
\begin{aligned}
-x_1 + x_2 + 3x_3 &\le 20\\
12x_1 + 4x_2 + 10x_3 &\le 90\\
x_i &\ge 0, \quad i = 1, 2, 3.
\end{aligned}
$$

Conduct sensitivity analysis by investigating each of the following changes in the original model. Test the solution for feasibility and optimality.

a. Change in the right-hand side of constraint 1 to $b_1 = 30$.
b. Change in the right-hand side of constraint 2 to $b_2 = 70$.
c. Change in the right-hand sides to
d. Change in the coefficient of $x_3$ in the objective function to $c_3 = 8$.
e. Change in the coefficient of $x_1$ to
f. Change in the coefficient of $x_2$ to
g. Introduce a new variable $x_6$ with coefficients
h. Introduce a new constraint $2x_1 + 3x_2 + 5x_3 \le 50$ (denote the slack variable by $x_6$).
i. Change constraint 2 to $10x_1 + 5x_2 + 10x_3 \le 100$.
Problem 5.9
Consider the following problem.

Maximize
$$z = 2x_1 + 7x_2 - 3x_3$$
Subject to:

$x_i \ge 0$, $i = 1, 2, 3$.

Reformulate the problem using $x_4$ and $x_5$ as slack variables. Conduct sensitivity analysis by investigating each of the following changes in the original model. Test the solution for feasibility and optimality.

a. Change in the right-hand sides to
b. Change in the coefficient of $x_3$ to
c. Change in the coefficient of $x_1$ to
d. Introduce a new variable $x_6$ with coefficients
e. Change in the objective function to $z = x_1 + 5x_2 - 2x_3$.
f. Introduce a new constraint $3x_1 + 2x_2 + 3x_3 \le 25$.
g. Change constraint 2 to
$x_1 + 2x_2 + 2x_3 \le 135$.

REFERENCES
1. Bialy, H. An Elementary Method for Treating the Case of Degeneracy in Linear Programming, Unternehmensforschung (Germany), Vol. 10, no. 2, 1966, pp. 118-123.
2. Bitran, G. R. and Novaes, A. G. Linear Programming with a Fractional Objective Function, Operations Research, Vol. 21, no. 1 (Jan.-Feb. 1973), pp. 22-29.
3. Boulding, K. E. and Spivey, W. A. Linear Programming and the Theory of the Firm, Macmillan, New York, 1960.
4. Glover, F. A New Foundation for a Simplified Primal Integral Programming Algorithm, Operations Research, Vol. 16, no. 4 (July-August 1968), pp. 727-740.
5. Harris, M. Y. A Mutual Primal-Dual Linear Programming Algorithm, Naval Research Logistics Quarterly, Vol. 17, no. 2 (June 1970), pp. 199-206.
6. Heady, E. O. and Candler, W. Linear Programming Methods, Iowa State College Press, Ames, IA, 1958.
7. Hillier, F. S. and Lieberman, G. J. Introduction to Operations Research, 4th ed., Holden-Day, San Francisco, 1986.
8. Lavallee, R. S. The Application of Linear Programming to the Problem of Scheduling Traffic Signals, Operations Research, Vol. 3, no. 4, 1968, pp. 86-100.
9. Luenberger, D. G. Introduction to Linear and Nonlinear Programming, Addison-Wesley, Reading, MA, 1973.
10. Ravi, N. and Wendell, R. E. The Tolerance Approach to Sensitivity Analysis of Matrix Coefficients in Linear Programming-I, Working Paper 562, Graduate School of Business, University of Pittsburgh, October 1984.
11. Chieh, H. T. Applied Optimization Theory and Optimal Control, Feng Chia University, 1990.
Chapter 6
Interior Point Methods

I. INTRODUCTION
Many engineering problems, including the operation of power systems, are concerned with the efficient use of limited resources to meet a specified objective. If these problems can be modeled, they can be converted to an optimization problem of a known objective function subject to given constraints. Most practical systems are nonlinear in nature; however, some approximations are usually tolerable for certain classes of problems. Two methods commonly used are linear and quadratic programming. The former solves those problems where both the objective and constraints are linear in the decision variables. The quadratic optimization method assumes a quadratic objective and linear constraints. The well-known simplex method has been used to solve linear programming problems. In general, it requires burdensome calculations, which hamper the speed of convergence. In an attempt to improve the convergence properties, Karmarkar's recent work proposed variations of the interior point (IP) method. The variants include projective, affine-scaling, and path-following methods. Each of these methods solves the linear programming problem by determining the optimal solution from within the feasible interior region of the solution space. Projective methods are known to require $O(nL)$ iterations. They rely on the projective algorithm, which requires a good scheme. Several schemes have been proposed in the literature. These methods are different from the simplex method, which seeks the optimum solution from a corner point of the solution space.
Affine-scaling methods have no known polynomial time complexity and can require an exponential number of iterations if they are started close to the boundary of the feasible region. It has also been shown that these methods can make it difficult to recover dual solutions and prove optimality when there is degeneracy. However, these methods work well in practice. Very recently, a polynomial time bound for a primal-dual affine method has been obtained. Path-following methods generally require $O(n^{0.5}L)$ iterations in the worst case and work by using Newton's method to follow the "central path" of optimal solutions obtained by a family of problems defined by a logarithmic barrier function. The most popular scheme so far is the barrier method developed by Megiddo [13], Kojima et al. [23], and Monteiro and Adler [24]. They have shown that the algorithm requires $O(n^3L)$ overall time, and no interior point algorithm has been shown to have a better worst-case complexity bound. McShane et al. [11] have given a detailed implementation of the algorithm. Their results have recently been adapted to power system problems. Howard University research contract EPRI RP2436 extends the results of earlier works by improving the starting and terminating conditions of the interior point method for solving linear programming problems and by extending the algorithm to quadratic-type problems.

The optimal power flow (OPF) problem has recently been reviewed as a process of determining the state of power systems that guarantees affordability, reliability, security, and dependability. These abilities optimize given objectives that satisfy a set of physical and operating constraints. In general, these objectives are designated as transmission losses, fuel cost, reactive source allocation, and voltage feasibility. In general, OPF is a large-scale nonlinear programming problem with thousands of input variables and nonlinear constraints. The problem can be formulated as

Minimize $f(z)$

Subject to: $h(z) = 0$ with $l \le z \le u$,

where $f$ and $h$ are continuously differentiable functions in $R^n$ with values in $R$ and $R^m$, and $l$ and $u$ are vectors in $R^n$ corresponding to lower and upper bounds on the variables, respectively.
II. KARMARKAR'S ALGORITHM

The new projective scaling algorithm for linear programming developed by N. Karmarkar has caused quite a stir in the optimization community, partly because the speed advantage gained by this new method (for large problems) is reported to be as much as 50 : 1 when compared to the simplex method [8]. This method has a polynomial bound on worst-case running time that is better than that of ellipsoid algorithms. Karmarkar's algorithm is significantly different from George Dantzig's simplex method [25], which solves a linear programming problem by starting with one extreme point on the boundary of the feasible region and skipping to a better neighboring extreme point along the boundary, finally stopping at an optimal extreme point. Karmarkar's interior point method rarely visits very many extreme points before an optimal point is found. The IP method stays in the interior of the polytope and tries to position the current solution as the "center of the universe" in finding a better direction for the next move. By properly choosing the step lengths, an optimal solution is achieved after a number of iterations. Although this IP approach requires more computational time in finding a moving direction than the traditional simplex method, a better moving direction is achieved, resulting in fewer iterations. Therefore, the IP approach has become a major rival of the simplex method and is attracting attention in the optimization community. Figure 6.1 illustrates how the two methods approach an optimal solution. In this small problem, the projective scaling algorithm requires approximately the same number of iterations as the simplex method. However, for a large problem, this method requires only a fraction of the number of iterations that the simplex method would require. A major theoretical attraction of the projective scaling method is its superior worst-case running time (or worst-case complexity).
Assume that the size of a problem is defined as the number of bits $N$ required to represent the problem in a computer. If an algorithm's running time on a computer is never greater than some fixed power of $N$, no matter what problem is solved, the algorithm is said to have polynomial worst-case running time. The new projective scaling method is such an algorithm. Due to the results of [10, 15], several variants of interior point methods have been proposed, such as the affine-scaling method, which is discussed in this chapter. Affine-scaling methods have no known polynomial time complexity and can require an exponential number of iterations if they are started close to the boundary of the feasible region. It has also been shown that these methods can make it difficult to recover dual solutions and prove optimality when there is degeneracy. However, these methods do work well in practice. Very recently, a polynomial time bound for a primal-dual affine method has been obtained [11]. Path-following methods generally require $O(n^{0.5}L)$ iterations in the worst case and work by using Newton's method to follow the "central path" of optimal solutions obtained by a family of problems defined by a logarithmic barrier function. Two parameters, the barrier parameter and an underestimated optimal value of the objective function, are the linking parameters between all methods. Barrier methods have been used to construct primal path-following algorithms, and the method of centers has been used as a basis for dual algorithms. After scaling has been used to construct both primal and dual algorithms, other variants of barrier methods have been used to construct primal-dual path-following algorithms and an affine variant of the primal-dual algorithms. In general, the above methods, whether projective, affine, method of centers, or path following, are all simple variants of the barrier methods applied to the primal, dual, or primal and dual problems together [15].

FIGURE 6.1 Illustration of IP and simplex methods.

As mentioned above, since Karmarkar's discovery of the interior point method and its reported speed advantage over other traditionally used methods, many variants of the IP method have evolved in an attempt to solve the above posed problems. Of these, the projective scaling, the dual and primal affine methods, and the barrier function method are the most popular. These variants of the IP method are presented and evaluated based on their algorithms and the problems they solve.
III. THE PROJECTION SCALING METHOD

The projective scaling algorithm has attracted a great deal of interest due to Karmarkar's ingenious proof that its running time is a polynomial function of the problem size even in the worst case. Karmarkar showed that if $n$ is the number of variables in the problem and $L$ is the number of bits used to represent numbers in the computer, the theoretical worst-case running time is $O(n^{3.5}L^2)$. That is, as the problem size increases, the running time tends to be a constant multiple of $n^{3.5}L^2$, which is substantially better than the ellipsoid algorithm's worst-case running time of $O(n^6L^2)$. The problems solved by the projective algorithm are in the following form.

Minimize $c^Ty$

Subject to:
$Ay = 0$
$e^Ty = 1$
$y \ge 0$

where $A$ is an $m$ by $n$ matrix, and $e$ is a vector of $n$ ones. The main algorithm for the projective method is presented below.
A. Algorithm for the Projection Scaling Method

Step 1. Initialization. Set $k = 0$, $x^0 = e/n$, and let $L$ be a large positive integer.
Step 2. Optimality check. IF $c^Tx^k \le 2^{-L}c^Te/n$, THEN stop with an optimal solution $x^* = x^k$. Otherwise go to Step 3.
Step 3. Iterate for a better solution. Let

$X_k = \mathrm{Diag}(x^k)$

$B_k = \begin{bmatrix} AX_k \\ e^T \end{bmatrix}$

$d^k = -[I - B_k^T(B_kB_k^T)^{-1}B_k]X_kc$

$y^{k+1} = \dfrac{e}{n} + \dfrac{\alpha}{n}\,\dfrac{d^k}{\|d^k\|}$

$x^{k+1} = \dfrac{X_ky^{k+1}}{e^TX_ky^{k+1}}$

Set $k = k + 1$. Go to Step 2.

Note that in this procedure, $x^k$ is an interior feasible solution, $X_k$ is an $n$-dimensional diagonal matrix, $B_k$ is the constraint matrix of Karmarkar's standard form, $d^k$ is a feasible direction of the projected negative gradient as defined above, $y^{k+1}$ is a new interior feasible solution, and $L$ is chosen to be the problem size, where $2^{-L} < \varepsilon$, $\varepsilon > 0$. This algorithm terminates in $O(nL)$ iterations. A large value of $\alpha$ tends to speed up the iteration.
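The iteration of Step 3 can be sketched in a few lines of NumPy. This is only an illustrative sketch: the function name `projective_step`, the step fraction `alpha = 0.5`, and the small test problem are assumptions made here, and the update formulas follow the usual textbook statement of Karmarkar's projective step rather than anything specific to this chapter.

```python
import numpy as np

def projective_step(A, c, x, alpha=0.5):
    """One projective scaling step for: min c^T x, A x = 0, e^T x = 1, x >= 0.

    Hypothetical helper; alpha in (0, 1) is an assumed step fraction.
    """
    n = x.size
    e = np.ones(n)
    X = np.diag(x)                       # X_k = Diag(x^k)
    B = np.vstack([A @ X, e])            # B_k = [A X_k; e^T]
    Xc = X @ c
    # Negative projection of X_k c onto the null space of B_k
    d = -(Xc - B.T @ np.linalg.solve(B @ B.T, B @ Xc))
    # Move from the simplex center e/n inside the inscribed ball
    y = e / n + (alpha / n) * d / np.linalg.norm(d)
    Xy = X @ y
    return Xy / (e @ Xy)                 # map back to the original simplex

# Assumed toy problem in Karmarkar's standard form (optimal value 0):
# min x1  s.t.  x1 - 2*x2 + x3 = 0,  x1 + x2 + x3 = 1,  x >= 0
A = np.array([[1.0, -2.0, 1.0]])
c = np.array([1.0, 0.0, 0.0])
x = np.ones(3) / 3                       # x^0 = e/n is feasible because A e = 0
for _ in range(30):
    x = projective_step(A, c, x)
print(x, c @ x)
```

Each pass keeps the iterate strictly positive and on the simplex; on this toy problem the objective shrinks by roughly a constant factor per iteration.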
IV. THE DUAL AFFINE ALGORITHM

Both the primal and dual interior methods of the variety initiated by Karmarkar can be viewed as special cases of the logarithmic barrier method applied to either the primal or dual problem. The problems solved by the dual and primal affine methods and their algorithms are presented below. Consider the linear programming problem given by

Maximize $Z = c^Tx$

Subject to:
$Ax \le b$ and $x$ is unrestricted.

By introducing slack variables into the constraints, the inequality constraints are converted such that we obtain a new formulation given by:

Maximize $Z = c^Tx$

Subject to:
$Ax + s = b$
$s \ge 0$ and $x$ is unrestricted.

A. Algorithm

Step 1. Initialization. Set the counter value $k = 0$ and the tolerance $\varepsilon$ (a small positive number). Obtain a starting solution $(x^0, s^0)$ such that $Ax^0 + s^0 = b$ and $s^0 > 0$. Set the acceleration constant $\alpha$, where $0 < \alpha < 1$.
Step 2. Obtaining the Translation Direction. Compute $d_x^k = (A^TW_k^{-2}A)^{-1}c$, where $W_k = \mathrm{diag}(s^k)$. Compute the direction vector $d_s^k = -Ad_x^k$.
Step 3. Check for Unboundedness. IF $d_s^k = 0$, THEN $(x^k, s^k)$ is the dual optimal. Go to Step 9. IF $d_s^k > 0$, THEN the problem is unbounded. Go to Step 10. Otherwise, $d_s^k < 0$. Go to Step 4.
Step 4. Compute the Primal Estimate $y^k$:

$y^k = -W_k^{-2}d_s^k$.

Step 5. Optimality Test. IF $y^k \ge 0$ AND $(b^Ty^k - c^Tx^k)$ ...

... $> 0$, $x > 0$, $Ax = b$, and $q = c - A^T\pi$. A correction of $\pi$ is calculated at each stage since a good estimate is available from the previous iteration. The main steps of the algorithm are as follows.

Step 1. Define $D = \mathrm{Diag}(x_j)$ and compute $r = Dq - \mu e$. Note that $r$ is a residual from the optimality condition for the barrier subproblem, and hence $\|r\| = 0$ if $x = x^*(\mu)$.
Step 2. Terminate if $\mu$ and $\|r\|$ are sufficiently small.
Step 3. If appropriate, reduce $\mu$ and reset $r$.
Step 4. Solve the least squares problem

Minimize over $\delta\pi$: $\|r - DA^T\delta\pi\|$.

Step 5. Compute the updated vectors $\pi \leftarrow \pi + \delta\pi$ and $q \leftarrow q - A^T\delta\pi$. Set $r = Dq - \mu e$ (the updated scaled residual) and $p = -(1/\mu)Dr$.
Step 6. Find $\alpha_M$, the maximum value of $\alpha$ such that $x + \alpha p \ge 0$.
Step 7. Determine the step length $\alpha \in (0, \alpha_M)$ at which the barrier function $F(x + \alpha p)$ is suitably less.
Step 8. Update $x \leftarrow x + \alpha p$.
All iterates satisfy $Ax = b$ and $x > 0$. The vectors $\pi$ and $q$ approximate the dual variables $\pi^*$ and reduced costs $q^*$ of the original linear program.

VII. EXTENDED INTERIOR POINT METHOD FOR LP PROBLEMS

The linear programming problem is formulated in the standard form of the LP interior point method as

Maximize $P = C^Tx$ (6.1)

Subject to:
$Ax = b$, $b_i > 0$, $i = 1, \ldots, m$ (6.2)
$x_i \ge 0$, $i = 1, \ldots, n$.

The interior point method involves a sequence of feasible interior points ($Ax = b$, $x > 0$) that makes the objective function increase until it reaches its limit. The limit is an optimal solution of the problem. The interior point method utilizes all the vectors in $A$ together with the points in the sequence to generate a maximum increase of the objective function. In addition to programming simplicity, it is superior to the simplex method in computation time and convergence for large systems (large $m$ and $n$). The condition that $b > 0$ as imposed in the simplex method is waived here, and the matrix $A$ is only required to be of full rank $m < n$. Also, by using an appropriate conversion of inequalities, two-sided constraints can be handled by the proposed interior point method. Finally, to guarantee existence of the feasible interior points, a trivial condition has been imposed in that the problem has no less than two feasible points ($Ax = b$, $x > 0$), one of which is a bounded solution for the problem.
VIII. FEASIBLE INTERIOR SEQUENCE

For convenience, the linear programming problem can be written as follows:

Maximize $a^Tx$

Subject to:
$Ax = b$, such that $x \ge 0$. (6.4)

Note that $C^T$ is replaced by $a^T$, where $a$ is a column vector, to facilitate description of the interior point method. The considered feasible interior (FI) sequence contains only feasible and interior points of the problem; that is,

$S = \{x^1, x^2, \ldots, x^k, x^{k+1}, \ldots\}$ (6.5)

with

$Ax^k = b$

and

$x^k > 0$.

For a known point $x^k$, a diagonal matrix is formed by

$D = \mathrm{diag}[x_1^k, x_2^k, \ldots, x_n^k]$, (6.8)

where $x_i^k$ is the $i$th component of $x^k$. The feasible interior sequence is then generated recursively according to

$x^{k+1} = x^k + \beta Dd_p$, (6.10)

where $d_p$ is an $n$-vector and $\beta$ is a positive number. They are to be chosen in such a way that $x^{k+1}$ is a feasible interior point whenever $x^k$ is. As such, $S$ contains all FI points of $x$. The objective functions and constraints between two consecutive points are related by

$a^Tx^{k+1} = a^Tx^k + \beta d^Td_p$ (6.11)

and

$Ax^{k+1} = Ax^k + \beta Bd_p$, (6.12)

where

$d = Da$ (6.13)

and

$B = AD$. (6.14)

$D$ is determined by $x^k$ and it changes from point to point. Let $d$ be orthogonally decomposed into $d = d_p + d_r$, where $d_p$ is the projection on the null space of $B$ and $d_r$ is in the $B^T$-subspace (spanned by the vectors of $B^T$). Then, it follows that

$Bd_p = 0$, $d_r = B^Tv$, and $d_r^Td_p = 0$, (6.15)

where $v$ is an $m$-vector, the coordinates of $d_r$. Solving for $v$ and then $d_r$, we have

$d_r = B^T(BB^T)^{-1}Bd$ (6.16)

and

$d_p = d - d_r = [I - B^T(BB^T)^{-1}B]d = Du$, (6.17)

where

$u = a - A^Tw$ (6.18)

with

$w = (BB^T)^{-1}Bd$. (6.19)
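One step of the recursion in equations 6.10 and 6.13 through 6.19 can be sketched as follows. The step rule $\beta = \alpha/|\gamma|$, with $\gamma$ the smallest component of $d_p$ and $0 < \alpha < 1$, is the rule the worked examples in Section X appear to use; the function name `fi_step` is hypothetical. The data are those of Example 1.

```python
import numpy as np

def fi_step(A, a, x, alpha=0.7):
    """One feasible-interior step for: max a^T x, A x = b, x >= 0 (sketch)."""
    D = np.diag(x)
    B = A @ D                                  # (6.14)
    d = D @ a                                  # (6.13)
    w = np.linalg.solve(B @ B.T, B @ d)        # (6.19)
    u = a - A.T @ w                            # (6.18)
    d_p = D @ u                                # (6.17), projection of d on null(B)
    gamma = d_p.min()
    if gamma >= 0:
        raise ValueError("no negative component: objective is unbounded")
    beta = alpha / -gamma                      # keeps 1 + beta * d_p[i] > 0
    return x + beta * (D @ d_p), d_p           # (6.10)

# Example 1 data: maximize x1 + 2*x2 with A = [1 1 1], start x = (1, 1, 2)
A = np.array([[1.0, 1.0, 1.0]])
a = np.array([1.0, 2.0, 0.0])
x0 = np.array([1.0, 1.0, 2.0])
x1, d_p0 = fi_step(A, a, x0)
x2, d_p1 = fi_step(A, a, x1)
print(x1, a @ x1)   # first iterate of Example 1
print(x2, a @ x2)   # second iterate of Example 1
```

Since $Bd_p = 0$, each step leaves $Ax$ unchanged, and the choice of $\beta$ keeps every component strictly positive.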
To generate the FI sequence $S$, the LP problem is divided into two cases: trivial and ordinary. The former has the vector $a$ confined in the $A^T$ space while the latter does not. In the trivial case, there exists an $m$-vector $v$ such that $a = A^Tv$ and hence $a^Tx = v^TAx = v^Tb$ = constant. Thus, all the feasible solutions yield the same objective function value and, hence, there is no optimization involved in the problem. Consequently, it remains only to generate $S$ for the ordinary case. Starting with a known FI point $x^1$, $S$ is generated recursively according to equation 6.10 until reaching a point $x$ at which $d_p = 0$. $S$ contains all the points except $x$, which is referred to as the limit point of $S$. The problem is assumed to have no less than two feasible points, and between them there exists a bounded solution of the problem. Since feasible points form a convex set, there are an infinite number of FI points existing within the problem. For the ordinary case with a bounded solution, one may draw the following conclusions: (1) $S$ contains an infinite number of points and (2) $S$ always has a limit point. To show this, let us assume $d_p = 0$ at a finite $k$; then $d = d_p + d_r = d_r$ reveals that

$d = Da = B^T(BB^T)^{-1}Bd = DA^T(BB^T)^{-1}Bd = DA^Tv$,

from which we have $a = A^Tv$ since $D$ is nonsingular. This is a trivial case and, hence, $d_p$ cannot be zero for finite $k$. It follows from equation 6.11 and $d_r^Td_p = 0$ that

$a^Tx^{k+1} = a^Tx^k + \beta\|d_p\|^2$. (6.20)

Summing both sides of equation 6.20 from $k = 1$ to $k = N$ gives

$a^Tx^{N+1} = a^Tx^1 + \sum_{k=1}^{N}\beta\|d_p\|^2$. (6.21)

To make equation 6.21 bounded, it is necessary that $d_p \to 0$ as $N \to \infty$. Otherwise, the right side becomes positively unbounded and, hence, $x^{N+1}$ (feasible) yields an unbounded objective function, which contradicts the assumption of a bounded solution. Now, it remains to specify $\beta$ to make $x^{k+1}$ a FI point if $x^k$ is one. We choose for this purpose

$\beta < -\dfrac{1}{\gamma}$, (6.22)

where $\gamma$ is the smallest component of $d_p$. It is asserted that $\gamma < 0$ for all points in $S$. Indeed, if $\gamma \ge 0$ then $d_p \ge 0$ and $x^{k+1} > 0$ for any $\beta > 0$ according to equation 6.10. The objective function indicated by equation 6.20 becomes positively unbounded together with $\beta$ since $d_p \ne 0$ in $S$. This contradicts the assumption of a bounded solution and, hence, $\gamma < 0$ must be true. If $x^k$ is a FI point, then the $i$th component of equation 6.10 and equation 6.12 become

$x_i^{k+1} = x_i^k + \beta x_i^kd_{pi} = x_i^k(1 + \beta d_{pi}) \ge x_i^k(1 + \beta\gamma) > x_i^k(1 - 1) = 0$

for all $i = 1, 2, \ldots, n$, and $Ax^{k+1} = Ax^k + 0 = b$. Since $x^{k+1}$ is also a FI point, all the points of $S$ are FI points if $x^1$ is, by induction. Figure 6.2 shows the interior point method algorithm.

... $> 0$ for any $K$, $x_i = \lim_{k \to \infty} x_i^{k+1} > 0$ (6.27)

as $K$ approaches infinity, which violates the fact that $x_i = 0$; therefore, $u_i \le 0$ or $u \le 0$ must be true. Using $u \le 0$ and $d_p = 0$ at $x$, we obtain $u^Ty \le 0$ and $d_p^Te = 0$, where $y$ is any feasible solution and $e$ is the $n$-vector with all components equal to one. Substituting equations 6.17 and 6.18 yields

$u^Ty - d_p^Te = u^Ty - u^TDe = u^T(y - x) = (a^T - w^TA)(y - x) = a^Ty - a^Tx - w^Tb + w^Tb = a^Ty - a^Tx \le 0$, (6.28)

which indicates that $x$ is a maximum solution.

IX. EXTENDED QUADRATIC PROGRAMMING USING INTERIOR POINT (EQIP) METHOD

Extended quadratic programming using the interior point method (EQIP) considered here is an extension of the linear programming version of the interior point method developed during the project. The objective function, a quadratic form, is defined by
$P = \tfrac{1}{2}x^TQx + a^Tx$ (6.29)

subject to

$Ax = b$ and $x \ge 0$, (6.30)

where $Q$ is any square and symmetric matrix. Linear programming is a special case of the quadratic programming problem when $Q = 0$. The concept for solving the quadratic programming problem is similar to that of linear programming problems. It is again assumed that the problem has a bounded solution. With $A$ being of full rank $m$ with $m < n$, there are at least two feasible solutions to the problem. The same FI sequence generated below guarantees optimality within the feasible region for quadratic optimization problems. In order to maintain the solution of the problem at each iteration within the interior feasible region, the algorithm requires the calculation of the initial starting interior feasible point $\bar{x}^1$; that is, $A\bar{x}^1 = b$ with $\bar{x}^1 \ge 0$. The initial feasible point can be obtained by introducing the artificial variable $x_s$. The EQIP obtains the initial feasible point by using an auxiliary problem:

Maximize $[-x_s]$

Subject to:
$A\bar{x} + (b - Ae)x_s = b$,
$\bar{x}_j \ge 0$, $j = n + 1, \ldots, n + 2m$, (6.32)
$x_s \ge 0$.

Clearly, any feasible solution of the original problem is a maximum solution $x_s = 0$ for the auxiliary problem and vice versa. Since the latter always has a feasible point at

$\bar{x} = e$ and $x_s = 1$,

one may use this point as the initial starting point to solve the auxiliary problem by the EQIP with $Q = 0$ to reach a maximum solution and thus obtain a feasible initial point for the original problem. The key point is that the direction vector $d_p$ at each iteration $k$ can be approximately calculated while maintaining feasibility of $\bar{x}^{k+1}$; for example,

$A\bar{x}^{k+1} = b$ with $\bar{x}_j^{k+1} \ge 0$, $j = n + 1, \ldots, n + 2m$.

A feasible direction, along which the objective function increases, is found, and then an approximate step length is determined to guarantee that the new feasible solution is strictly better than the previous one. The stopping criteria are the relative changes in the objective function between iterations; that is,

$|P_{k+1} - P_k| / \max\{1, |P_k|\} < \varepsilon_1$, (6.33)

or the relative changes in the interior feasible solutions between iterations; that is,

$|\bar{x}^{k+1} - \bar{x}^k| < \varepsilon$. (6.34)

The optimality condition is computed until the maximum is satisfied.

A. Detailed EQIP Algorithm
A detailed step-by-step description of how the EQIP algorithm solves a quadratic objective function subject to linear constraints is presented below.

Step 1. Identify the problem defined by maximizing $P = \tfrac{1}{2}x^TQx + a^Tx$

Subject to:
$Ax = b$ with $x_i \ge 0$, for $i = 1, \ldots, n$.

Step 2. Set $x_i = 1$ for all $i$. Then $x = e$, $x_{n+1} = 1$ is an FI point of the auxiliary problem of maximizing $-x_{n+1}$

Subject to:
$Ax + (b - Ae)x_{n+1} = b$.

It is an FI point because $Ax + (b - Ae)x_{n+1} = Ae + b - Ae = b$. Evidently, $x = e$ is an FI point for the original problem when $c = b - Ae = 0$, since then $Ae = b$.

Step 3. Construct the updated $B = AD$ for the corrected $x$, which is modified in such a way as to maximize $(-x_{n+1})$ in the auxiliary problem. At the maximum, $-x_{n+1} = 0$ and, hence,

$Ax + (b - Ae)x_{n+1} = Ax + 0 = b$.

Thus, $x$ is an FI point of the original problem.

Step 4. The minimum-norm (MN) solution for $By = v$ is designed to obtain

$y = B^T(BB^T)^{-1}v$,

where $BB^T$ is nonsingular. The reason to assume $v = (x_{n+1})c$ is to ensure feasibility, as explained in the next step.

Step 5. Let $\gamma = \min_i[y_i]$ and set

$x_i^{k+1} = x_i^k(1 + \beta y_i)$ for all $i$ and $x_{n+1}^{k+1} = x_{n+1}^k(1 - \beta)$.

For the auxiliary problem, we have $v = (b - Ae)x_{n+1}^k$ and

$Ax^{k+1} + (b - Ae)x_{n+1}^{k+1} = A(x^k + \beta Dy) + (1 - \beta)(b - Ae)x_{n+1}^k$
$= Ax^k + v + \beta[By - v]$
$= Ax^k + (b - Ae)x_{n+1}^k + 0 = b$.

Since $By = v$, $x^{k+1}$ is feasible if $x^k$ is. The objective function increases by

$(-x_{n+1}^{k+1}) - (-x_{n+1}^k) = \beta x_{n+1}^k > 0$

because $\beta > 0$ is to be chosen.
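Steps 2 through 5 of the auxiliary problem can be sketched as follows. The text leaves the choice of $\beta$ open, so the rule used here, $\beta = \min(1, \alpha/|\gamma|)$ with an assumed $\alpha = 0.7$, is only one possibility that keeps $1 + \beta y_i > 0$; the function name `initial_fi_point` is hypothetical, and the constraint data are borrowed from Example 3.

```python
import numpy as np

def initial_fi_point(A, b, alpha=0.7, tol=1e-12, max_iter=100):
    """Drive the artificial variable to zero to find x > 0 with A x = b (sketch)."""
    m, n = A.shape
    x = np.ones(n)                  # Step 2: x = e
    s = 1.0                         # artificial variable x_{n+1}
    c = b - A @ x                   # c = b - A e
    for _ in range(max_iter):
        if s <= tol:
            break
        B = A * x                   # Step 3: B = A D (scales column j by x_j)
        v = s * c                   # v = x_{n+1} (b - A e)
        y = B.T @ np.linalg.solve(B @ B.T, v)    # Step 4: minimum-norm solution
        gamma = y.min()
        # Step 5 with an assumed step rule that keeps 1 + beta * y_i > 0
        beta = 1.0 if gamma > -alpha else alpha / -gamma
        x = x * (1.0 + beta * y)
        s = s * (1.0 - beta)
    return x, s

A = np.array([[1.0, 1.0, 2.0], [2.0, 1.0, 3.0]])
b = np.array([3.0, 5.0])
x_start, s_final = initial_fi_point(A, b)
print(x_start, s_final)
```

On this small system a single full step ($\beta = 1$) already removes the artificial variable, which matches the identity $Ax^{k+1} + (b - Ae)x_{n+1}^{k+1} = b$ derived in Step 5.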
X. ILLUSTRATIVE EXAMPLES

A. Example 1

Solve the constrained problem using the following.

Maximize $z = x_1 + 2x_2$

Subject to:
$x_1 + x_2 + x_3 \le 8$
$x_j \ge 0$.

1. Interior point method.
2. Graphical representation.

Based on the algorithm shown in Section VI and following the flowchart in Figure 6.2, we can say $Z = x_1 + 2x_2 = C^tx$, so that $C^t = [1\ 2\ 0]$ with $x = [x_1, x_2, x_3]^t$.

Subject to: $Ax = b$, so that $A = [1\ 1\ 1]$.
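We are going to take $\alpha = 0.7$, $\varepsilon = 0.1$.

First Iteration

As an initial point, we start with $x = [1, 1, 2]^t$. Substituting in the objective function,

$Z = C^tx = [1\ 2\ 0]\,[1, 1, 2]^t = 3.0$

$D = \mathrm{diag}(x) = \mathrm{diag}(1, 1, 2)$

$\tilde{x} = D^{-1}x = \mathrm{diag}(1, 1, 0.5)\,[1, 1, 2]^t = [1, 1, 1]^t$

$\tilde{A} = AD = [1\ 1\ 1]\,\mathrm{diag}(1, 1, 2) = [1\ 1\ 2]$.

The projection matrix $P$ is

$P = I - \tilde{A}^t(\tilde{A}\tilde{A}^t)^{-1}\tilde{A} = \begin{bmatrix} 0.833 & -0.1667 & -0.333 \\ -0.1667 & 0.833 & -0.333 \\ -0.333 & -0.333 & 0.333 \end{bmatrix}$

$\tilde{C} = DC = [1, 2, 0]^t$

$C_p = P\tilde{C} = [0.5, 1.5, -1]^t$.

Then we get the value of $\gamma = 1$.

$\tilde{x}^{new} = \tilde{x}^{old} + \dfrac{\alpha}{\gamma}C_p = [1.35, 2.05, 0.3]^t$

$x^{new} = D\tilde{x}^{new} = [1.35, 2.05, 0.6]^t$

$Z^{new} = C^tx^{new} = 5.45$

$\Delta_{objective} = Z^{new} - Z^{old} = 5.45 - 3.0 = 2.45 > \varepsilon$.

Then we go to the second iteration.

Second Iteration

$x = [1.35, 2.05, 0.6]^t$

$D = \mathrm{diag}(x) = \mathrm{diag}(1.35, 2.05, 0.6)$

$\tilde{x} = D^{-1}x = \mathrm{diag}(1/1.35, 1/2.05, 1/0.6)\,[1.35, 2.05, 0.6]^t = [1, 1, 1]^t$

$\tilde{A} = AD = [1\ 1\ 1]\,\mathrm{diag}(1.35, 2.05, 0.6) = [1.35\ 2.05\ 0.6]$.

The projection matrix $P$ is

$P = I - \tilde{A}^t(\tilde{A}\tilde{A}^t)^{-1}\tilde{A} = \begin{bmatrix} 0.7146 & -0.4334 & -0.1269 \\ -0.4334 & 0.3418 & -0.1926 \\ -0.1269 & -0.1926 & 0.9436 \end{bmatrix}$

$\tilde{C} = DC = [1.35, 4.1, 0]^t$

$C_p = P\tilde{C} = [-0.8124, 0.8163, -0.9611]^t$.

Then we get the value of $\gamma = 0.9611$.

$\tilde{x}^{new} = \tilde{x}^{old} + \dfrac{\alpha}{\gamma}C_p = [0.4083, 1.5945, 0.3]^t$

$x^{new} = D\tilde{x}^{new} = [0.5512, 3.2688, 0.18]^t$

$Z^{new} = C^tx^{new} = [1\ 2\ 0]\,[0.5512, 3.2688, 0.18]^t = 7.0888$

$\Delta_{objective} = Z^{new} - Z^{old} = 7.0888 - 5.45 = 1.6388 > \varepsilon$.

Then we go to the third iteration.

Third Iteration

$x = [0.5512, 3.2688, 0.18]^t$

$x^{new} = [0.1654, 3.7383, 0.0963]^t$

$\gamma = 0.5328$

$Z = 7.6421$

$\Delta_{objective} = Z^{new} - Z^{old} = 7.6421 - 7.0888 = 0.5533 > \varepsilon$.

Then we go to the fourth iteration.

Fourth Iteration

$x = [0.1654, 3.7383, 0.0963]^t$

$x^{new} = [0.0661, 3.905, 0.0289]^t$

$\gamma = 0.1923$

$Z = 7.9577$

$\Delta_{objective} = Z^{new} - Z^{old} = 7.9577 - 7.6421 = 0.0816 < \varepsilon$.

Then we can stop here with a result

$x_1 = 0.0661$, $x_2 = 3.9689$, $x_3 = 0.0112$, and $Z = 7.9577$.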
B. Example 2

Consider the following problem.

Maximize $z = 3x_1 + x_2$

Subject to:
$x_1 + x_2 \le 4$
$x_j \ge 0$.

Starting from the initial point (1, 2), solve the problem using the interior point algorithm. Based on the algorithm shown in Section VI and following the flowchart in Figure 6.2, we can say $Z = 3x_1 + x_2 = C^tx$, so that $C^t = [3\ 1]$ with $x = [x_1, x_2]^t$.

Subject to: $Ax = b$, so that $A = [1\ 1]$.

We are going to take $\alpha = 0.7$, $\varepsilon = 0.01$.

First Iteration

As an initial point, we start with $x = [1, 2]^t$. Substituting in the objective function,

$Z = C^tx = [3\ 1]\,[1, 2]^t = 5.0$

$D = \mathrm{diag}(x) = \mathrm{diag}(1, 2)$

$\tilde{x} = D^{-1}x = \mathrm{diag}(1, 0.5)\,[1, 2]^t = [1, 1]^t$

$\tilde{A} = AD = [1\ 2]$.

The projection matrix $P$ is

$P = I - \tilde{A}^t(\tilde{A}\tilde{A}^t)^{-1}\tilde{A} = \begin{bmatrix} 0.8 & -0.4 \\ -0.4 & 0.2 \end{bmatrix}$

$\tilde{C} = DC = \mathrm{diag}(1, 2)\,[3, 1]^t = [3, 2]^t$

$C_p = P\tilde{C} = [1.6, -0.8]^t$.

Then we get the value of $\gamma = 0.8$.

$\tilde{x}^{new} = \tilde{x}^{old} + \dfrac{\alpha}{\gamma}C_p = [2.4, 0.3]^t$

$x^{new} = D\tilde{x}^{new} = [2.4, 0.6]^t$

$Z^{new} = C^tx^{new} = [3\ 1]\,[2.4, 0.6]^t = 7.8$

$\Delta_{objective} = Z^{new} - Z^{old} = 7.8 - 5.0 = 2.8 > \varepsilon$.

Then we go to the second iteration.

Second Iteration

$x = [2.4, 0.6]^t$

$D = \mathrm{diag}(x) = \mathrm{diag}(2.4, 0.6)$

$\tilde{x} = D^{-1}x = [1, 1]^t$

$\tilde{A} = AD = [1\ 1]\,\mathrm{diag}(2.4, 0.6) = [2.4\ 0.6]$.

The projection matrix $P$ is

$P = I - \tilde{A}^t(\tilde{A}\tilde{A}^t)^{-1}\tilde{A} = \begin{bmatrix} 0.0588 & -0.2353 \\ -0.2353 & 0.9412 \end{bmatrix}$

$\tilde{C} = DC = [7.2, 0.6]^t$

$C_p = P\tilde{C} = [0.2824, -1.1294]^t$.

Then we get the value of $\gamma = 1.1294$.

$\tilde{x}^{new} = \tilde{x}^{old} + \dfrac{\alpha}{\gamma}C_p$

$\Delta_{objective} > \varepsilon$. Then we go to the third iteration.

Third Iteration

$x = [2.9929, 0.0071]^t$

$D = \mathrm{diag}(x) = \mathrm{diag}(2.9929, 0.0071)$

$\tilde{x} = D^{-1}x = [1, 1]^t$

$\tilde{A} = AD = [2.9929\ 0.0071]$.

The projection matrix $P$ is

$P = I - \tilde{A}^t(\tilde{A}\tilde{A}^t)^{-1}\tilde{A}$.

Then we get the value of $\gamma = 0.0142$.

$\tilde{x}^{new} = \tilde{x}^{old} + \dfrac{\alpha}{\gamma}C_p$

$x^{new} = D\tilde{x}^{new} = [2.9979, 0.0021]^t$

$\Delta_{objective} = Z^{new} - Z^{old} = 8.9958 - 8.9859 = 0.0099 < \varepsilon$.

Then we stop at the third iteration with

$x_1 = 2.9979$, $x_2 = 0.0021$, and $Z = 8.9958$.
C. Example 3

The following optimization problem demonstrates the primal affine scaling algorithm.

Minimize $Z = 2x_1 + x_2 + 4x_3$

Subject to:
$x_1 + x_2 + 2x_3 = 3$
$2x_1 + x_2 + 3x_3 = 5$
$x_i \ge 0$ ($i = 1, 2, 3$).

First Iteration

Let

$e = [1, 1, 1]^T$, $x^0 = [1.5, 0.5, 0.5]^T$, $D_0 = \mathrm{diag}(x^0) = \mathrm{diag}(1.5, 0.5, 0.5)$.

Therefore, the dual estimate vector is

$w^0 = [0.8947, 0.5789]^T$

and the reduced cost coefficient is

$r^0 = [-0.0526, -0.4737, 0.4737]^T$.

For the optimality check, we calculate

$e^TD_0r^0 = -0.0789$

and

$d_y^0 = [0.0789, 0.2368, -0.2368]^T$.

The optimality condition is not satisfied but the problem is not unbounded; therefore

$\beta_0 = 4.1807$.

The update on the primal variable is:

$x^1 = x^0 + \beta_0D_0d_y^0 = [1.9951, 0.9951, 0.0049]^T$.

Second Iteration

$D_1 = \mathrm{diag}(1.9951, 0.9951, 0.0049)$,

$w^1 = [0.0, 1.0]^T$ and $r^1 = [0.0, 0.0, 1.0]^T$.

For the optimality checking, we compute

$e^TD_1r^1 = 0.0049$

$d_y^1 = [0.0, 0.0, -0.0049]^T$,

which implies that the problem is bounded. Therefore, we compute $\beta_1 = 202.0408$ and update the primal variable to get

$x^2 = [2.0, 1.0, 0.0]^T = x^*$,

which yields an optimal objective value $Z^* = 5$.

D. Example 4

The following optimization problem demonstrates the dual affine scaling algorithm.
Maximize $Z = 15x_1 + 15x_2$

Subject to:
$1.5x_1 + s_1 = -3.0$
$-1.5x_1 + 1.5x_2 + s_2 = 1.5$
$1.5x_1 + s_3 = 0$
$1.5x_2 + s_4 = 0$
$s_i \ge 0$ ($i = 1, 2, 3, 4$).

First Iteration

$x^0 = [-3, -3]^T$, $s^0 = [1.5, 1.5, 4.5, 4.5]^T$, $W_0 = \mathrm{diag}(s^0) = \mathrm{diag}(1.5, 1.5, 4.5, 4.5)$.

Therefore, the direction of translation is

$d_s^0 = -Ad_x^0 = [-35.2982, -16.7202, -35.2982, -52.0183]^T$

and the primal estimate is

$y^0 = -W_0^{-2}d_s^0 = [15.6881, 7.4312, 1.7431, 2.5688]^T$.

For the optimality check, we calculate

$b^Ty^0 - c^Tx^0 = 46.2385$.

The optimality condition is not satisfied but the problem is not unbounded; therefore

$\beta_0 = 0.0421$.

The update on the slack variables is:

$s^1 = s^0 + \beta_0d_s^0 = [0.0150, 0.7966, 3.0150, 2.3116]^T$.

Second Iteration

$d_s^1 = -Ad_x^1 = [-0.0043, -5.6714, -0.0043, -5.6757]^T$

$y^1 = [18.9374, 8.9378, 0.0005, 1.0622]^T$.

For the optimality check, we calculate

$b^Ty^1 - c^Tx^1 = 0.3918$.

The optimality condition is not satisfied but the problem is not unbounded; therefore

$\beta_1 = 0.1391$.

The update on the slack variables is:

$s^2 = s^1 + \beta_1d_s^1 = [0.0144, 0.0080, 3.0144, 1.5224]^T$.

The reader may carry out more iterations and verify that the optimal value is assumed at $x^* = (-2, -1)^T$ and $s^* = (0, 0, 3, 1.5)^T$.
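The affine scaling iterations used in Examples 3 and 4 can be sketched as two small NumPy functions. This is a hedged sketch: the function names are hypothetical, the acceleration constant $\alpha = 0.99$ is inferred from the step lengths reported in the examples rather than stated in the text, and the starting points and constraint ordering for Example 4 are the ones consistent with the printed direction vectors.

```python
import numpy as np

def primal_affine_step(A, c, x, alpha=0.99):
    """One primal affine scaling step for: min c^T x, A x = b, x >= 0."""
    D2 = np.diag(x**2)
    w = np.linalg.solve(A @ D2 @ A.T, A @ D2 @ c)   # dual estimate
    r = c - A.T @ w                                  # reduced cost
    d_y = -x * r                                     # direction -D r
    neg = d_y < 0
    if not neg.any():
        raise ValueError("no negative component: optimal or unbounded")
    beta = alpha / np.max(-d_y[neg])                 # step length
    return x + beta * x * d_y, w, r

def dual_affine_step(A, c, x, s, alpha=0.99):
    """One dual affine scaling step for: max c^T x, A x + s = b, s >= 0."""
    w2_inv = 1.0 / s**2                              # diagonal of W^{-2}
    d_x = np.linalg.solve(A.T @ (w2_inv[:, None] * A), c)
    d_s = -A @ d_x                                   # translation direction
    y = -w2_inv * d_s                                # primal estimate
    neg = d_s < 0
    if not neg.any():
        raise ValueError("no negative component: optimal or unbounded")
    beta = alpha * np.min(s[neg] / -d_s[neg])        # keeps s > 0
    return x + beta * d_x, s + beta * d_s, y, d_s

# Example 3 (primal affine scaling)
A3 = np.array([[1.0, 1.0, 2.0], [2.0, 1.0, 3.0]])
c3 = np.array([2.0, 1.0, 4.0])
x3_1, w3, r3 = primal_affine_step(A3, c3, np.array([1.5, 0.5, 0.5]))
x3_2, _, _ = primal_affine_step(A3, c3, x3_1)

# Example 4 (dual affine scaling)
A4 = np.array([[1.5, 0.0], [-1.5, 1.5], [1.5, 0.0], [0.0, 1.5]])
b4 = np.array([-3.0, 1.5, 0.0, 0.0])
c4 = np.array([15.0, 15.0])
x4_1, s4_1, y4, d_s4 = dual_affine_step(
    A4, c4, np.array([-3.0, -3.0]), np.array([1.5, 1.5, 4.5, 4.5]))
print(x3_1, x3_2, c3 @ x3_2)
print(x4_1, s4_1)
```

In both functions the step length is scaled so that the component closest to its bound retains a fraction $1 - \alpha$ of its current value, which is why the iterates stay strictly interior.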
XI. CONCLUSIONS

Variants of interior point algorithms were presented. These variants included work by Karmarkar, projection, affine-scaling, and the primal affine algorithm. These methods were shown in Sections II through V. In Section VI the barrier algorithm was presented, where barrier function methods treat inequality constraints by creating a barrier function, which is a combination of the original objective function and a weighted sum of functions with a positive singularity at the boundary. The formulation and algorithm were presented in this section. In Section VII an extended interior point method for the LP problem was presented, and a discussion of the feasible interior sequence was presented in Section VIII, where the optimality conditions and the start and termination of the recursive process were explained. In Section IX an extended quadratic programming algorithm for solving quadratic optimization problems was presented.
XII. PROBLEM SET
Problem 6.1 Solve the unconstrained problem: Minimize
1 2 Z=gX1
1
2
+ 3 x 2 -X1X2 -2x,,
1. using the interior point method, and 2.
anyother method.
Problem 6.2
Solve the following problem using the quadratic interior point method.

Minimize $z = 2x_1^2 + 3x_2^2 + 5x_3^2 + x_1 + 2x_2 - 3x_3$

Subject to:
$x_1 + x_2 = 5$
$x_1 + x_3 = 10$
$x_j \ge 0$.
Problem 6.3
Consider the following problem.

Maximize $z = 2x_1 + 5x_2 + 7x_3$

Subject to:
$x_1 + 2x_2 + 3x_3 = 6$
$x_j \ge 0$.

1. Graph the feasible region.
2. Find the gradient of the objective function and then find the projected gradient onto the feasible region.
3. Starting from the initial trial solution (1, 1, 1), perform two iterations of the interior point algorithm.
4. Perform eight additional iterations.
Problem 6.4
Consider the following problem.

Maximize $z = -x_1 - x_2$

Subject to:
$x_1 + x_2 \le 8$
$x_2 \ge 3$
$-x_1 + x_2 \le 2$
$x_j \ge 0$.

1. Solve this problem graphically.
2. Use the dual simplex method to solve this problem.
3. Trace graphically the path taken by the dual simplex method.
4. Solve this problem using the interior point algorithm.
REFERENCES
1. Adler, I., Karmarkar, N., Resende, M. G. C., and Veiga, G. An Implementation of Karmarkar's Algorithm for Linear Programming, Working Paper, Operations Research Center, University of California, Berkeley, 1986 (also in Mathematical Programming, 44).
2. Adler, I., Resende, M. G. C., Veiga, G., and Karmarkar, N. An Implementation of Karmarkar's Algorithm, ORSA Journal on Computing, Vol. 1, no. 2 (Spring 1989), pp. 84-106.
3. Anstreicher, K. M. A Monotonic Projective Algorithm for Fractional Linear Programming, Algorithmica, Vol. 1, 1986, pp. 483-498.
4. Barnes, E. R. A Variation of Karmarkar's Algorithm for Solving Linear Programming Problems, Mathematical Programming, 1986 ... for Computing Projections, 13th International Mathematical Programming Symposium, Tokyo (August) 1988.
5. Carpentier, J. Contribution à l'Étude du Dispatching Économique, Bulletin de la Société Française des Électriciens, Vol. 3 (August 1962), pp. 431-447.
6. Dommel, H. W. and Tinney, W. F. Optimal Power Flow Solutions, IEEE Transactions on Power Apparatus and Systems, Vol. 87, 1968, pp. 1866-1878.
7. Galiana, F. D. and Huneault, M. A Survey of the Optimal Power Flow Literature, IEEE Transactions on Power Systems, Vol. 6 (August 1991), pp. 1099-.
8. Karmarkar, N. New Polynomial-Time Algorithm for Linear Programming, Combinatorica, Vol. 4, 1984, pp. 373-397.
9. Karmarkar, N. and Ramakrishnan, K. G. Implementation and Computational Results of the Karmarkar Algorithm for Linear Programming, Using an Iterative Method.
10. Kojima, M. Determining Basic Variables of Optimal Solutions in Karmarkar's New LP Algorithm, Algorithmica, Vol. 1, 1986, pp. 499-517.
11. Kozlov, The Karmarkar Algorithm: Is It for Real? SIAM News, Vol. 18, no. 6, 1987, pp. 1-4.
12. McShane, K. A., Monma, C. L., and Shanno, D. An Implementation of a Primal-Dual Interior Point Method for Linear Programming, Report No. RRR #24-88.
13. Megiddo, N. On Finding Primal- and Dual-Optimal Bases, Research Report, RJ 6328 (61997), IBM, Yorktown Heights, New York, 1988.
14. Momoh, J. A. Application of Quadratic Interior Point Algorithm to Optimal Power Flow, EPRI Final Report RP 2473-36 11, March, 1992.
15. Momoh, J. A., Austin, R., and Adapa, R. Feasibility of Interior Point Method for VAR Planning, accepted for publication in IEEE SMC, 1993.
16. Momoh, J. A., Guo, S. X., Ogbuobiri, E. C., and Adapa, R. The Quadratic Interior Point Method Solving Power System Security-Constrained Optimization Problems, Paper No. 93, SM 4T7-BC, Canada, July 18-22, 1993.
17. Ponnambalam, K. New Starting and Stopping Procedures for the Dual Affine Method, Working Paper, Dept. of Civil Engineering, University of Waterloo, 1988.
18. Reid, G. F. and Hasdorff, L. Economic Dispatch Using Quadratic Programming, IEEE Transactions on Power Apparatus and Systems, Vol. PAS-92, 1973, pp. 2017-2023.
19. Sun, D. I., Ashley, B. B., Hughes, A., and Tinney, W. F. Optimal Power Flow by Newton Method, IEEE Transactions on Apparatus and Systems, Vol. PAS-103, 1984, pp. 2864-2880.
20. Vanderbei, R. J., Meketon, M. S., and Freedman, B. A. A Modification of Karmarkar's Linear Programming Algorithm, Algorithmica, Vol. 1, 1986, pp. 3997-.
21. Vannelli, A. An Adaptation of the Interior Point Method for Solving the Global Routing Problem, IEEE Transactions on Computer-Aided Design, Vol. 10, no. 2 (Feb. 1991), pp. 193-203.
22. Ye, Y. and Kojima, M. Recovering Optimal Dual Solutions in Karmarkar's Polynomial-Time Algorithm for Linear Programming, Mathematical Programming, Vol. 39, 1987, pp. 307-318.
23. Kojima, M. Determining Basic Variables of Optimal Solutions in Karmarkar's New LP Algorithm, Algorithmica, Vol. 00, 1986, pp. 449-517.
24. Monteiro, R. C. and Adler, I. Interior Path Following Primal-Dual Algorithms. Part I: Linear Programming, Mathematical Programming, Vol. 44, 1989, pp. 27-42.
25. Dantzig, G. B. Linear Programming and Extensions, Princeton University Press, Princeton, NJ, 1963.
Chapter 7 Nonlinear Programming
I. INTRODUCTION
While linear programming has found numerous practical applications, its assumptions of proportionality and additivity often do not hold, and various forms of nonlinearity are common in many engineering applications. Most often the sources of nonlinearity are the physical process and the associated engineering principles, which are not amenable to linearization. Even though there are linearization schemes, they are subject to large errors in representing the phenomenon.

Nonlinear programming (NLP) aims to solve optimization problems involving a nonlinear objective and nonlinear constraint functions. The constraints may consist of equality and/or inequality forms. The inequalities may be specified by two bounds: bounded below and bounded above. There is no generalized approach to solving the NLP problem, and a particular algorithm is usually employed to solve a specified type of problem. In this respect NLP differs from linear programming, where the simplex method can be applied to any LP problem. However, two methods, namely the sensitivity and barrier methods, are considered general enough to solve a wide range of NLP problems. These methods are discussed in detail in this chapter.

Theorems on necessary and sufficient conditions are given for extremizing unconstrained functions and optimizing constrained functions. In conjunction with the necessary condition, the well-known Kuhn-Tucker (K-T) conditions are treated. Finally, based on the K-T conditions, the sensitivity and barrier methods are developed as a general approach to solving NLP problems. The methods are designed for solving NLP problems involving large numbers of variables, such as those arising in power systems.
II. CLASSIFICATION OF NLP PROBLEMS

A. NLP Problems with Nonlinear Objective Function and Linear Constraints

This is a relatively simple problem with nonlinearity limited to the objective function. The search space is similar to that of the linear programming problem, and the solution methods are developed as extensions to the simplex method.

B. Quadratic Programming (QP)

This is a special case of the former where the objective function is quadratic (i.e., involving the square or cross-product of one or more variables). Many algorithms have been developed as direct extensions of the simplex method under the additional assumption that the objective function is convex. Apart from being a very common form for many important problems, quadratic programming is also very important because many of the problems in Section A are often solved as a series of QP problems, that is, by sequential quadratic programming (SQP).

C. Convex Programming

Convex programming arises out of the assumptions of convexity of the objective and constraint functions. Under these assumptions, it can encompass both foregoing problems. The major point of emphasis is that, under these assumptions, any local optimal point is necessarily a global optimum.

D. Separable Programming

Separable programming is a special class of convex programming with the additional assumption that all objective and constraint functions are separable functions; that is, each function can be expressed as a sum of functions of the individual variables. For example, if $f(x)$ is a separable function it can be expressed as
\[ f(x) = \sum_{j=1}^{n} f_j(x_j), \]
where each $f_j(x_j)$ includes terms involving $x_j$ only.
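As a small illustration of separability, the sketch below evaluates a function as a sum of single-variable terms. It uses the objective of Problem 7.1, $Z = (x_1 - 2)^2 + 4(x_2 - 6)^2$, as the example; the helper name `separable_sum` is hypothetical and is not part of the text.

```python
from typing import Callable, Sequence


def separable_sum(terms: Sequence[Callable[[float], float]],
                  x: Sequence[float]) -> float:
    """Evaluate f(x) = sum_j f_j(x_j), one single-variable term per variable."""
    if len(terms) != len(x):
        raise ValueError("need exactly one term per variable")
    return sum(f_j(x_j) for f_j, x_j in zip(terms, x))


# Objective of Problem 7.1 split into its two single-variable terms.
terms = [
    lambda x1: (x1 - 2.0) ** 2,        # f_1(x_1)
    lambda x2: 4.0 * (x2 - 6.0) ** 2,  # f_2(x_2)
]

z_at_center = separable_sum(terms, [2.0, 6.0])  # both terms vanish here
z_elsewhere = separable_sum(terms, [3.0, 7.0])  # 1 + 4
```

Because each term depends on one variable only, each can be analyzed (or approximated piecewise) independently of the others.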
III. SENSITIVITY METHOD FOR SOLVING NLP VARIABLES
For simplicity, we hereafter use $i \le p$ to mean $i = 1, 2, \ldots, p$; $i > p$ to mean $i = p+1, p+2, \ldots, m$; and "all $i$" to mean $i = 1, 2, \ldots, m$. Let $y$ be an $n$-vector and let $f(y)$ together with $f_i(y)$ be scalar functions for all $i$. Then the NLP problem is defined as

Minimize $f(y)$

Subject to:
\[ C_i \le f_i(y) \le D_i \quad \text{for all } i, \tag{7.1} \]
where $C_i = D_i$ for $i \le p$ but $C_i < D_i$ for $i > p$.
There are $p$ equality constraints and $m - p$ inequality constraints, the latter bounded by $C_i$ and $D_i$ for $i > p$. The number $p$ is less than or equal to $n$, but $m$ may be greater than $n$. By $C_i = -\infty$ and $D_i = \infty$ we mean that the inequality constraint is bounded only above and only below, respectively. Any constraint with one bound is thus a special case of the inequality constraint. The constraints are denoted by the equality form
\[ f_i(y) = K_i \tag{7.2} \]
for all $i$, where $K_i = D_i$ when $i \le p$ and $C_i \le K_i \le D_i$ when $i > p$. In matrix form, they are denoted collectively by
\[ F(y) = K, \tag{7.3} \]
where $F(y) = [f_1(y), f_2(y), \ldots, f_m(y)]^T$ and $K = [K_1, K_2, \ldots, K_m]^T$. The Lagrange function for this problem is defined as before by the scalar function
\[ L(y, \lambda) = f(y) + \lambda F(y), \tag{7.4} \]
where $\lambda = [\lambda_1, \lambda_2, \ldots, \lambda_m]$. The Lagrange function is assumed to be continuous up to the first partial derivatives at $y = x$, which is a minimum as considered in the theorem. Extended Kuhn-Tucker conditions are considered here to cover constraints bounded below and above. The conditions can be stated in theorem form as follows.
Theorem. If $x$ is a solution for the NLP problem, then it is necessary that $L_x(x, \lambda) = 0$ and that one of the following conditions be satisfied for all $i > p$.
(a) $\lambda_i = 0$ when $C_i < f_i(x) < D_i$.
(b) $\lambda_i \ge 0$ when $f_i(x) = D_i$.
(c) $\lambda_i \le 0$ when $f_i(x) = C_i$.
A. Procedure for Solving the NLP Problem
1. Use $L_x(x, \lambda) = 0$ and the equality constraints to find $x$ for Cases (a), (b), and (c).
2. Find the smallest $f(x)$ among the three possible $x$ obtained in Step 1.
3. Use $T = L_{xx}(x, \lambda) + \beta F_x^T(x) F_x(x) > 0$ for some $\beta \ge 0$ to test sufficiency for the $x$ determined in Step 2.

The conditions as imposed in the theorem are called here the extended Kuhn-Tucker (EKT) conditions. They suggest that one can predict the changes of $f(x)$ due to variations of $K$ if $\lambda$ is known. This fact is utilized in the method to approach the EKT conditions. Note that there may exist multiple sets of EKT conditions.

Differentiate equation 7.4 with respect to $x$, transpose the result, and then use the column vector $z$ to denote $\lambda^T$ and $U$ to denote $L_x^T$:
\[ U(x, z) = f_x^T(x) + F_x^T(x)\,z = 0, \tag{7.5} \]
where $z_i = \lambda_i$ for all $i$. Since equation 7.3 must be satisfied for $y = x$, we have
\[ F(x) = K, \tag{7.6} \]
where $K_i = C_i = D_i$ for $i \le p$ but $K_i$ is uncertain for $i > p$. The method uses a process that adjusts $K_i$ within $C_i \le K_i \le D_i$ for $i > p$ to decrease $f(x)$ while keeping equations 7.5 and 7.6 satisfied. Consider some $x$, $z$, and $K$ that satisfy equations 7.5 and 7.6 at one step and $x + \Delta x$, $z + \Delta z$, and $K + \Delta K$ at the next. Then it follows that
\[ U(x + \Delta x, z + \Delta z) = f_x^T(x + \Delta x) + F_x^T(x + \Delta x)(z + \Delta z) = 0 \]
and
\[ F(x + \Delta x) = K + \Delta K. \]
The first-order approximations of the above equations are
\[ U(x, z) + U_x(x, z)\,\Delta x + U_z(x, z)\,\Delta z = 0 \tag{7.7} \]
and
\[ F(x) + F_x(x)\,\Delta x = K + \Delta K. \tag{7.8} \]
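The partial derivatives of $U(x, z)$ can be found from equation 7.5 as
\[ U_x(x, z) = f_{xx}(x) + \sum_{i=1}^{m} z_i f_{ixx}(x) \tag{7.9} \]
and
\[ U_z(x, z) = F_x^T(x), \]
where $f_{xx}$ and $f_{ixx}$ are, respectively, the second derivatives of $f(x)$ and $f_i(x)$ with respect to $x$; all of them are assumed to exist. For simplicity, the arguments $x$ and $z$ are omitted here and equations 7.7 and 7.8 are combined in the matrix form
\[ A y = b, \tag{7.10} \]
where
\[ A = \begin{bmatrix} U_x & F_x^T \\ F_x & 0 \end{bmatrix}, \qquad y = \begin{bmatrix} \Delta x \\ \Delta z \end{bmatrix}, \qquad b = \begin{bmatrix} -U \\ \Delta K \end{bmatrix}. \]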
Note that the condition $U = 0$ may not hold exactly because of the first-order approximation, but $F = K$ is true since $K$ is calculated from equation 7.6. Including $U$ in the vector $b$ forces $U$ toward zero at the next step if it is not zero at the present one. Equation 7.10 shows that any two of the increments may be determined if the third one is specified. However, the change of $K$ not only relates to the constraints but also correlates with the objective function, as evidenced by equation 7.6. It is for this reason that $\Delta K$ is chosen to be the independent variable. As mentioned earlier, the basic rule for adjusting $\Delta K$ is to decrease $f(x)$ without violating the constraints. The matrix $A$ of equation 7.10 may not be invertible at each step of the process; this is always the case for $m > n$. We therefore seek the least-squares solution with minimum norm (LSMN) for $\Delta x$ and $\Delta z$. The solution always exists and is unique as long as $A$ is not a null matrix. Moreover, it reduces automatically to the exact solution if $A$ is nonsingular.

B. Alternative Expression for EKT Conditions
As it is, the EKT conditions are not suitable for application to a computer-based solution. An alternative expression is sought here for practical applications. Consider an $m$-vector $J$ with components $J_i$ ($i = 1, 2, \ldots, m$) defined by
\[ T = \min[\lambda_i, (D_i - K_i)] \tag{7.11} \]
and
\[ J_i = \max[T, (C_i - K_i)] \tag{7.12} \]
for all $i = 1, 2, \ldots, m$. It can be concluded that a set of EKT conditions is satisfied if and only if $U = 0$ and $J = 0$. This is shown by the feasibility and optimality arguments that follow.
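Equations 7.11 and 7.12 translate directly into code. The sketch below is a minimal illustration with a hypothetical function name, `ekt_residual`; the bounds $C = 0$, $D = 1$ and the multiplier values are made up for the spot checks and do not come from the text.

```python
def ekt_residual(lam: float, K: float, C: float, D: float) -> float:
    """Return J_i from equations 7.11 and 7.12 for one constraint."""
    T = min(lam, D - K)   # equation 7.11
    return max(T, C - K)  # equation 7.12


C, D = 0.0, 1.0  # illustrative bounds only

# Cases in which the EKT sign conditions hold, so J_i = 0.
interior_ok = ekt_residual(lam=0.0, K=0.5, C=C, D=D)   # C < K < D, lambda = 0
upper_ok = ekt_residual(lam=2.0, K=1.0, C=C, D=D)      # K = D, lambda >= 0
lower_ok = ekt_residual(lam=-2.0, K=0.0, C=C, D=D)     # K = C, lambda <= 0

# Cases in which a condition fails, so J_i != 0.
interior_bad = ekt_residual(lam=0.3, K=0.5, C=C, D=D)  # nonzero lambda inside
upper_bad = ekt_residual(lam=-1.0, K=1.0, C=C, D=D)    # wrong sign at K = D
infeasible = ekt_residual(lam=0.0, K=1.5, C=C, D=D)    # K > D
```

A one-sided constraint can be represented by passing `float("-inf")` for `C` or `float("inf")` for `D`.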
1. Feasibility
$K_i > D_i$: $J_i = 0$ from equation 7.12 requires that $T = 0$, which cannot take place in equation 7.11 since $D_i - K_i < 0$.
$K_i < C_i$: $J_i = 0$ from equation 7.12 cannot occur since $C_i - K_i > 0$.
Therefore, $C_i \le K_i \le D_i$ must hold when $J_i = 0$.

2. Optimality
$U = 0$ is imposed in the sensitivity method. This fulfills the first part of the EKT conditions: $L_x = 0$.
$C_i < K_i < D_i$: $J_i = 0$ from equation 7.12 requires that $T = 0$ and hence $\lambda_i = 0$ must hold in equation 7.11. Conversely, if $\lambda_i = 0$, then $T = 0$ in equation 7.11 and hence $J_i = 0$ results from equation 7.12.
$K_i = D_i$: $J_i = 0$ from equation 7.12 requires that $T = 0$ and hence $\lambda_i \ge 0$ must hold in equation 7.11. Conversely, if $\lambda_i \ge 0$, then $T = 0$ in equation 7.11 and hence $J_i = 0$ results from equation 7.12.
$K_i = C_i$: $J_i = 0$ from equation 7.12 requires that $T \le 0$ and hence $\lambda_i \le 0$ must hold in equation 7.11. Conversely, if $\lambda_i \le 0$, then $T \le 0$ in equation 7.11 and hence $J_i = 0$ results from equation 7.12.

It follows therefore that $J = 0$ and $U = 0$ are both necessary and sufficient to reach a set of EKT conditions. These conditions are used as the criteria for termination of the method. The process involved in the method may start with any guessed values of $x$ and $\lambda$. However, different initial values may lead to different sets of EKT conditions and even to divergence.

1. Adjustment of $\Delta K$
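Consider a change $\Delta x$ about a known $x$. The second-order approximation of the objective function can be written as
\[ f(x + \Delta x) = f(x) + f_x \Delta x + \tfrac{1}{2}\Delta x^T f_{xx} \Delta x, \tag{7.13} \]
where the partial derivatives are evaluated at $x$. It is intended to reduce equation 7.13 by a proper adjustment of $\Delta K$. To this end, we assume that $\Delta K = qJ$, where $J$ is determined from equations 7.11 and 7.12. The increments $\Delta x$ and $\Delta z$ caused by $\Delta K$ satisfy equation 7.10:
\[ \begin{bmatrix} S & F_x^T \\ F_x & 0 \end{bmatrix} \begin{bmatrix} \Delta x \\ \Delta z \end{bmatrix} = \begin{bmatrix} -U \\ qJ \end{bmatrix}, \tag{7.14} \]
where $S$ denotes the matrix $U_x(x, z)$. Equation 7.14 consists of
\[ S\Delta x + F_x^T \Delta z = -U \]
and
\[ F_x \Delta x = \Delta K = qJ. \]
We define vectors $u$ and $v$ in such a way that $\Delta x = qu$ and $F_x^T(\Delta z - qv) = (q - 1)U$, where the last equation is satisfied in the sense of LSMN [13]. Then, by eliminating $\Delta x$ and $\Delta z$, we obtain from equation 7.14 that
\[ \begin{bmatrix} S & F_x^T \\ F_x & 0 \end{bmatrix} \begin{bmatrix} u \\ v \end{bmatrix} = \begin{bmatrix} -U \\ J \end{bmatrix}. \tag{7.15} \]
Substitution of $\Delta x = qu$ into equation 7.13 gives
\[ f(x + \Delta x) = f(x) + q f_x u + \tfrac{1}{2} q^2 N, \]
where
\[ N = u^T f_{xx} u. \tag{7.16} \]
The minimum of $f(x + \Delta x)$ for positive $q$ occurs at
\[ q = -\frac{f_x u}{N} \tag{7.17} \]
if $f_x u < 0$ and $N > 0$. We choose $q$ as given by equation 7.17 only when $f_x u < 0$ and $N > 0$, and choose $\Delta K = J$, that is, $q = 1$, otherwise. For $q$ not equal to one, $\Delta K_i$ is revised for all $i$ by
\[ T = \min[qJ_i, (D_i - K_i)] \tag{7.18} \]
\[ \Delta K_i = \max[T, (C_i - K_i)]. \tag{7.19} \]
Using this $\Delta K$, we solve for $\Delta x$ and $\Delta z$ from equation 7.10 and then update $x$ and $z$.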
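The choice of $q$ and the clipping of $\Delta K$ in equations 7.17 through 7.19 can be sketched as follows. The function names and the numerical values are illustrative assumptions only; solving equation 7.15 for $u$ (which supplies $f_x u$ and $N$) is not shown.

```python
from typing import List, Sequence


def step_scale(fx_u: float, N: float) -> float:
    """Equation 7.17 when f_x u < 0 and N > 0; otherwise q = 1."""
    if fx_u < 0.0 and N > 0.0:
        return -fx_u / N
    return 1.0


def clipped_delta_K(q: float, J: Sequence[float], K: Sequence[float],
                    C: Sequence[float], D: Sequence[float]) -> List[float]:
    """Equations 7.18 and 7.19: scale J by q, then keep K + dK inside [C, D]."""
    delta = []
    for J_i, K_i, C_i, D_i in zip(J, K, C, D):
        T = min(q * J_i, D_i - K_i)      # equation 7.18
        delta.append(max(T, C_i - K_i))  # equation 7.19
    return delta


# Made-up values: a descent direction with positive curvature gives q = 2.
q = step_scale(fx_u=-4.0, N=2.0)
dK = clipped_delta_K(q, J=[0.2, -0.3], K=[0.75, 0.5], C=[0.0, 0.0], D=[1.0, 1.0])
# Both scaled steps overshoot a bound, so each is cut back to the bound.
```

The clipping guarantees $C_i \le K_i + \Delta K_i \le D_i$ whenever $K_i$ itself is feasible, whatever value of $q$ is returned.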
IV. ALGORITHM FOR QUADRATIC OPTIMIZATION

Since it is of practical importance, particular attention is given to the NLP that has the objective function and constraints described by quadratic forms. This type of problem is referred to as quadratic optimization. The special case where the constraints are of linear form is known as quadratic programming. Derivation of the sensitivity method is aimed at solving the NLP on the computer. An algorithm is generated for this purpose according to the result obtained in the previous section.

Quadratic optimization is involved in power systems for maintaining a desirable voltage profile, maximizing power flow, and minimizing generation cost. These quantities are controlled by complex power generation, which is usually bounded by two limits. The first two problems can be formulated with an objective function in a quadratic form of voltage, while the last one is in a quadratic form of real power. Formulation of the first problem is given in the last of the illustrative problems. As usual, we consider only minimization since maximization can be achieved by changing the sign of the objective function. Let the objective and constraint functions be expressed, respectively, by
\[ f(x) = \tfrac{1}{2} x^T R x + a^T x \]
and
\[ f_i(x) = \tfrac{1}{2} x^T H_i x + b_i^T x \quad \text{for all } i. \tag{7.20} \]
$R$ together with $H_i$ are $n$-square and symmetrical matrices, and $x$, $a$ together with $b_i$ are $n$-vectors. The quadratic functions are now characterized by the matrices and vectors. As defined before, the constraints are bounded by $C_i \le f_i(x) \le D_i$ for all $i = 1, 2, \ldots, m$. Among these $i$, the first $p$ are equalities ($C_i = D_i$ for $i \le p$). The matrix $A$ and $n$-vector $U$ in equation 7.10 can be found by using
\[ F_x^T = [H_1 x + b_1,\; H_2 x + b_2,\; \ldots,\; H_m x + b_m], \]
\[ S = R + \sum_{i=1}^{m} z_i H_i, \qquad w = a + \sum_{i=1}^{m} z_i b_i, \]
and $U = Sx + w$. Given below is an algorithm to be implemented in a computer program.

1. Input Data
a. $n$, $m$, $p$, and $\varepsilon$ (a small positive number used to replace zero).
b. $R$, $a$, $H_i$, $b_i$, $C_i$, and $D_i$ for all $i = 1, 2, \ldots, m$.

2. Initialization
Set $x_i = 0$ and $z_i = 0$ ($\lambda_i = 0$) for all $i$, or use any other preference.

3. Testing EKT Conditions
a. Calculate $K_i$, $U_i$ (the $i$th component of $U$), and then $J_i$ from equations 7.11 and 7.12 for all $i$.
b. A set of EKT conditions is reached if $|U_i| < \varepsilon$ and $|J_i| < \varepsilon$ for all $i$. Otherwise go to Step 4.

4. Solving for $u$ and $v$
a. Solve for $u$ and $v$ from equation 7.15 by using LSMN [13].
b. Calculate $N$ by equation 7.16 and then go to Step 5 if $N > 0$ and $f_x u < 0$. Go to Part (c) otherwise.
c. Update $x$ by $x + u$ and $z$ by $z + v$, and then go to Step 3.

5. Determining $\Delta K$
a. Calculate $q$ by equation 7.17 and then find $\Delta K_i$ from equations 7.18 and 7.19 for all $i$.
b. Solve for $\Delta x$ and $\Delta z$ from equation 7.10 by using LSMN.
c. Update $x$ by $x + \Delta x$ and $z$ by $z + \Delta z$, and then go to Step 3.
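Part (a) of Step 3 can be sketched for quadratic data as below. This is a minimal illustration, not the author's program: the helper names are hypothetical, the LSMN solve of Steps 4 and 5 is not shown, and the data are an assumed encoding of Example 1 of Section VI, with $f = (x_1 - 2)^2 + 4$ written as $\tfrac{1}{2}x^T R x + a^T x$ (the additive constant 8 is dropped because it does not affect $U$) and the constraint $0 \le x_1^2 + x_2^2 \le 1$ written with $H_1 = 2I$, $b_1 = 0$.

```python
from typing import List, Sequence

Vector = Sequence[float]
Matrix = Sequence[Sequence[float]]


def dot(a: Vector, b: Vector) -> float:
    return sum(a_i * b_i for a_i, b_i in zip(a, b))


def mat_vec(M: Matrix, x: Vector) -> List[float]:
    return [dot(row, x) for row in M]


def constraint_value(H: Matrix, b: Vector, x: Vector) -> float:
    """f_i(x) = 1/2 x^T H_i x + b_i^T x (equation 7.20)."""
    return 0.5 * dot(x, mat_vec(H, x)) + dot(b, x)


def residual_U(R: Matrix, a: Vector, H_list: Sequence[Matrix],
               b_list: Sequence[Vector], x: Vector, z: Vector) -> List[float]:
    """U = S x + w with S = R + sum z_i H_i and w = a + sum z_i b_i."""
    n = len(x)
    S = [[R[i][j] + sum(z_k * H[i][j] for z_k, H in zip(z, H_list))
          for j in range(n)] for i in range(n)]
    w = [a[i] + sum(z_k * b[i] for z_k, b in zip(z, b_list)) for i in range(n)]
    Sx = mat_vec(S, x)
    return [Sx[i] + w[i] for i in range(n)]


# Assumed encoding of Example 1 (Section VI).
R = [[2.0, 0.0], [0.0, 0.0]]
a = [-4.0, 0.0]
H_list = [[[2.0, 0.0], [0.0, 2.0]]]
b_list = [[0.0, 0.0]]

x = [1.0, 0.0]  # candidate found in Example 1
z = [1.0]       # its multiplier, lambda = 1

U = residual_U(R, a, H_list, b_list, x, z)
K = [constraint_value(H, b, x) for H, b in zip(H_list, b_list)]
```

At this point $U = 0$ and $K_1 = D_1 = 1$ with $\lambda_1 \ge 0$, so equations 7.11 and 7.12 give $J_1 = 0$ and the test in Part (b) of Step 3 is passed.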
In using the algorithm, one should discover several sets of EKT conditions. This can be done by varying the initial values. Sometimes, intuitive judgment is helpful in deciding if the smallest one is the solution of the problem.

V. BARRIER METHOD FOR SOLVING NLP
As given by equation 7.1 in the sensitivity method, the NLP is rewritten here as

Minimize $f(x)$

Subject to:
\[ g(x) = 0 \quad \text{and} \quad C \le h(x) \le D, \tag{7.21} \]
where $x$ is an $n$-vector and $f(x)$ is a scalar function. The constraints $g(x)$ and $h(x)$ are, respectively, $p$- and $m$-vector functions. The bound vectors $C$ and $D$ are constant. All the functions are assumed to be twice differentiable. It is important to mention that $m$ may be greater than $n$ but $p$ cannot. Any bound imposed on $x$ may be considered as part of $h(x)$. The problem is solved here by using the Kuhn-Tucker necessary conditions in conjunction with barrier penalty functions. The method involves a recursive process that solves a set of linear equations at each iteration. The number of equations is reduced to a minimum. The barrier parameter is generalized to a vector form in order to accommodate discriminatory penalties.

A. Algorithm for Recursive Process
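Newton's numerical method is used in the sequel to approach a solution (if one exists) of the problem. To acquire the K-T conditions, we first introduce nonnegative slack variables to convert the inequality constraints. That is,
\[ h(x) + s = D, \qquad h(x) - r = C, \tag{7.22} \]
where $s$ and $r$ are nonnegative $m$-vectors. The logarithmic barrier function has been used extensively to avoid dealing with the harsh constraint of nonnegativeness on the slack variables; that is, $f(x)$ is appended as
\[ f_b(x) = f(x) - \sum_{j=1}^{m} u_j \ln s_j - \sum_{j=1}^{m} v_j \ln r_j. \tag{7.23} \]
All the $u_j$ and $v_j$ are specified to be nonnegative. They may change from one iteration to another in the process. It is known that the optimization of $f_b(x)$ and of $f(x)$ subject to the same constraints give the same result as the $u_j$ and $v_j$ approach zero. As such, one may optimize $f_b(x)$ while ignoring the nonnegativity constraint on the slack variables. For simplicity, the argument $x$ is dropped from $f(x)$, $g(x)$, and $h(x)$ to form the Lagrange function
\[ L = f_b + y^T g + w^T(h + s) - z^T(h - r), \tag{7.24} \]
where $y$, $w$, and $z$ are the Lagrange vectors associated with the constraints (the constant bounds $D$ and $C$ are omitted because they do not affect the derivatives). Note that $w$ and $z$ are required to be nonnegative by the K-T conditions for the problem. Differentiating $L$ with respect to $x$, $s$, and $r$, and then setting the derivatives equal to zero, yields the optimality conditions for the appended problem. To facilitate the derivation, the operator $\nabla$ (read as gradient) is used to mean
\[ \nabla f = \left[\frac{\partial f}{\partial x_i}\right], \quad \nabla g = \left[\frac{\partial g_j}{\partial x_i}\right], \quad \nabla h = \left[\frac{\partial h_j}{\partial x_i}\right], \]
\[ \nabla^2 f = \left[\frac{\partial^2 f}{\partial x_i \partial x_j}\right], \quad \nabla^2 g_k = \left[\frac{\partial^2 g_k}{\partial x_i \partial x_j}\right], \quad \nabla^2 h_k = \left[\frac{\partial^2 h_k}{\partial x_i \partial x_j}\right]. \]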
Enclosed by the brackets is the entry at the $i$th row and $j$th column. $T$ denotes the transpose, and the last three $n$-square matrices are Hessian matrices. The K-T conditions can be obtained with respect to the state and slack variables. For the state vector $x$, we have
\[ \nabla L = \nabla f + (\nabla g)\,y + (\nabla h)(w - z) = 0. \tag{7.25} \]
Let $S$, $R$, $W$, and $Z$ be the diagonal matrices that contain the elements of the vectors $s$, $r$, $w$, and $z$, respectively. Then the optimality conditions with respect to $s$ and $r$ are
\[ -S^{-1}u + w = 0 \]
and
\[ -R^{-1}v + z = 0. \]
That is,
\[ Sw = u, \qquad Rz = v. \tag{7.26} \]
The increment equations of equation 7.26 are
\[ (S + \Delta S)(w + \Delta w) = u, \qquad (R + \Delta R)(z + \Delta z) = v. \tag{7.27} \]
The penalty vectors $u$ and $v$ may be altered from one iteration to the next but are held constant within an iteration. By neglecting the terms $\Delta S\,\Delta w$ and $\Delta R\,\Delta z$, we obtain from equation 7.27
\[ \Delta w = S^{-1}u - w - S^{-1}\Delta S\,w, \qquad \Delta z = R^{-1}v - z - R^{-1}\Delta R\,z. \tag{7.28} \]
Using the fact that $\Delta S\,w = W\Delta s$ and $\Delta R\,z = Z\Delta r$, one may express equation 7.28 as
\[ \Delta w = S^{-1}u - w - S^{-1}W\Delta s, \qquad \Delta z = R^{-1}v - z - R^{-1}Z\Delta r. \tag{7.29} \]
Increments of the slack variables and state variables are linearly related if high orders of $\Delta x$ are neglected. It follows from equation 7.22 that
\[ \nabla h^T \Delta x + \Delta s = D - h - s = d_1, \qquad \nabla h^T \Delta x - \Delta r = C - h + r = d_2. \tag{7.30} \]
Thus, we can write the relation between the increments as
\[ \Delta s = d_1 - \nabla h^T \Delta x, \qquad \Delta r = \nabla h^T \Delta x - d_2. \tag{7.31} \]
Substitution of equation 7.31 into 7.29 gives
\[ \Delta w = S^{-1}(u - W d_1) - w + S^{-1}W\nabla h^T \Delta x, \qquad \Delta z = R^{-1}(v + Z d_2) - z - R^{-1}Z\nabla h^T \Delta x. \tag{7.32} \]
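The increment equation of 7.25, after all the variables are augmented, can be similarly determined. By neglecting the high orders of $\Delta x$ and $\Delta y$, we have
\[ (\nabla L)_{\text{new}} = \nabla L + H\Delta x + \nabla g\,\Delta y + \nabla h(\Delta w - \Delta z), \tag{7.33} \]
where
\[ H = \nabla^2 f + \sum_{k=1}^{p} y_k \nabla^2 g_k + \sum_{k=1}^{m} (w_k - z_k)\nabla^2 h_k. \tag{7.34} \]
Substituting equation 7.32 into 7.33 and then setting 7.33 equal to zero gives
\[ A\Delta x + \nabla g\,\Delta y = -b, \tag{7.35} \]
where
\[ A = H + \nabla h\,(S^{-1}W + R^{-1}Z)\,\nabla h^T \tag{7.36} \]
and
\[ b = \nabla L + \nabla h\,\bigl(S^{-1}(u - W d_1) - R^{-1}(v + Z d_2) - (w - z)\bigr). \tag{7.37} \]
The linearized equation of the equality constraint is
\[ g + \nabla g^T \Delta x = 0. \tag{7.38} \]
Combination of equations 7.35 and 7.38 gives
\[ \begin{bmatrix} A & \nabla g \\ \nabla g^T & 0 \end{bmatrix} \begin{bmatrix} \Delta x \\ \Delta y \end{bmatrix} = \begin{bmatrix} -b \\ -g \end{bmatrix}. \tag{7.39} \]
Being symmetrical, the $(n + p)$-square matrix on the left can be inverted by fast means even for large $n + p$. Computation time for $\Delta x$ and $\Delta y$ should not cause any problem in the process. Other increments can be readily found from equations 7.31 and 7.29.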
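1. Analytical Forms

It is necessary to derive first the $n$-vector $\nabla f$, the $n \times p$ matrix $\nabla g$, the $n \times m$ matrix $\nabla h$, and the $n$-square symmetrical matrices $\nabla^2 f$, $\nabla^2 g_k$, and $\nabla^2 h_k$. Then form the $n$-vector $\nabla L$ and the $n$-square matrix $H$ in terms of the Lagrange multipliers according to equations 7.25 and 7.34:
\[ \nabla L = \nabla f + (\nabla g)\,y + \nabla h\,(w - z), \]
\[ H = \nabla^2 f + \sum_{k=1}^{p} y_k \nabla^2 g_k + \sum_{k=1}^{m} (w_k - z_k)\nabla^2 h_k. \]
A wide class of NLP problems is expressible in the form of quadratic optimization. That is,
\[ f(x) = \tfrac{1}{2} x^T Q x + a^T x, \]
\[ g_k(x) = \tfrac{1}{2} x^T G_k x + B_k^T x, \quad k = 1, 2, \ldots, p, \]
\[ h_k(x) = \tfrac{1}{2} x^T H_k x + J_k^T x, \quad k = 1, 2, \ldots, m, \]
where $Q$, $G_k$, and $H_k$ are symmetrical and $a$, $B_k$, and $J_k$ are $n$-vectors. The vectors and matrices required by equation 7.39 are
\[ \nabla f = Qx + a, \]
\[ \nabla g = [G_1 x + B_1,\; G_2 x + B_2,\; \ldots,\; G_p x + B_p], \]
\[ \nabla h = [H_1 x + J_1,\; H_2 x + J_2,\; \ldots,\; H_m x + J_m], \]
and
\[ \nabla^2 f = Q, \qquad \nabla^2 g_k = G_k, \qquad \nabla^2 h_k = H_k. \]

2. Penalty Vectors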
Each component of $u$ or $v$ may be chosen differently to achieve a discriminative penalty. However, one may choose the equipenalty scheme if there is no preference. That is,
\[ u = v = \rho\mu e_m, \]
where $0 < \rho < 1$ and $e_m$ is the $m$-vector with all elements equal to one. The penalty parameter $\mu$ is required to approach zero as the process approaches an optimum. To meet such a requirement, we choose
\[ \mu = \frac{1}{2m}\,(w^T s + z^T r). \tag{7.40} \]
Note that $\mu = 0$ at an optimum according to the K-T condition.

Start of Process. Initial values play an important part in the recursive process. Improper assumptions may cause divergence or convergence to a different solution (if one exists). It is known that the recursive process always converges to a solution if the initial values are close enough to it. If there is no preference, one may consider the following scheme.

State Vector.
1. Assume $x$ to be an estimated solution.
2. Make $x$ satisfy $\nabla f = 0$.
3. Set $x = 0$ if Step 1 or 2 fails.

Slack Vectors. Use $s = r = \tfrac{1}{2}(D - C) > 0$.

Lagrange Vectors. Use $w = z = [1 + \|\nabla f\|]\,e_m$ and $y = 0$, where $\|\nabla f\|$ is the $l_1$-norm.

Penalty Parameter. $0.1 \le \rho \le 0.5$ may be used. A large number can retard the process and a small number can cause divergence.

Replacement of Zero. Two small numbers $\varepsilon_1$ and $\varepsilon_2$ are to be used to replace zero for computer implementation. They may be different and lie between $10^{-6}$ and $10^{-3}$.
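The default starting scheme for the slack and Lagrange vectors can be sketched as follows. The function name `initial_values` is hypothetical, and the sketch assumes every bound pair is finite with $C_k < D_k$ (a one-sided constraint would need a different slack choice, which the text does not specify). The call at the end uses the data of Example 2 in Section VI.

```python
from typing import List, Sequence, Tuple


def initial_values(C: Sequence[float], D: Sequence[float],
                   grad_f: Sequence[float]
                   ) -> Tuple[List[float], List[float], List[float], List[float]]:
    """Return (s, r, w, z) from the default starting scheme.

    s = r = (D - C) / 2 and w = z = (1 + ||grad f||_1) e_m.
    """
    if any(d_k <= c_k for c_k, d_k in zip(C, D)):
        raise ValueError("this sketch assumes finite bounds with C < D")
    s = [0.5 * (d_k - c_k) for c_k, d_k in zip(C, D)]
    r = list(s)
    level = 1.0 + sum(abs(g) for g in grad_f)  # 1 + l1-norm of grad f
    w = [level] * len(s)
    z = [level] * len(s)
    return s, r, w, z


# Example 2: 1 <= x1 - x2 <= 2, started at x = 0 where grad f = (x2, x1) = 0.
s0, r0, w0, z0 = initial_values(C=[1.0], D=[2.0], grad_f=[0.0, 0.0])
```

These are the values $s = r = 0.5$ and $w = z = 1$ used to start Example 2.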
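B. Computer Implementation

1. Initialization
Assume
a. $x$, $s$, $r$, $w$, and $z$.
b. $\rho$, $\varepsilon_1$, and $\varepsilon_2$.

2. Computation
a. $\mu = \frac{1}{2m}(w^T s + z^T r)$ and $u = v = \rho\mu e_m$. (7.40)
b. $d_1 = D - h - s$ and $d_2 = C - h + r$. (7.30)
c. $\nabla L = \nabla f + (\nabla g)\,y + (\nabla h)(w - z)$. (7.25)

3. Computation
a. $A = H + \nabla h\,(S^{-1}W + R^{-1}Z)\,\nabla h^T$. (7.36)
b. $b = \nabla L + \nabla h\,\bigl(S^{-1}(u - W d_1) - R^{-1}(v + Z d_2) - (w - z)\bigr)$. (7.37)

4. Increments
\[ \begin{bmatrix} A & \nabla g \\ \nabla g^T & 0 \end{bmatrix} \begin{bmatrix} \Delta x \\ \Delta y \end{bmatrix} = \begin{bmatrix} -b \\ -g \end{bmatrix} \tag{7.39} \]
\[ \Delta s = d_1 - \nabla h^T \Delta x, \qquad \Delta r = \nabla h^T \Delta x - d_2 \tag{7.31} \]
\[ \Delta w = S^{-1}u - w - S^{-1}\Delta S\,w, \qquad \Delta z = R^{-1}v - z - R^{-1}\Delta R\,z \tag{7.28} \]

5. Size of Increment
Determine two numbers according to
\[ N_1 = \min_i\left\{\frac{\Delta s_i}{s_i}, \frac{\Delta r_i}{r_i}\right\}, \quad i = 1, 2, \ldots, m, \]
\[ N_2 = \min_i\left\{\frac{\Delta w_i}{w_i}, \frac{\Delta z_i}{z_i}\right\}, \quad i = 1, 2, \ldots, m. \]
Then, set
\[ \beta_1 = 1 \text{ if } N_1 \ge -1, \qquad \beta_1 = -1/N_1 \text{ if } N_1 < -1, \]
\[ \beta_2 = 1 \text{ if } N_2 \ge -1, \qquad \beta_2 = -1/N_2 \text{ if } N_2 < -1, \]
and
\[ \Delta x = \beta_1 \Delta x, \quad \Delta s = \beta_1 \Delta s, \quad \Delta r = \beta_1 \Delta r, \quad \Delta w = \beta_2 \Delta w, \quad \Delta z = \beta_2 \Delta z, \quad \Delta y = \beta_2 \Delta y. \]

6. Update
\[ x = x + \Delta x, \quad s = s + \Delta s, \quad r = r + \Delta r, \quad w = w + \Delta w, \quad z = z + \Delta z, \quad y = y + \Delta y. \]
Note that all the slack variables and Lagrange multipliers are nonnegative due to the choice of $\beta_1$ and $\beta_2$.

7. Test for Termination
Compute:
\[ \mu = \frac{1}{2m}(w^T s + z^T r) \tag{7.40} \]
\[ \nabla L = \nabla f + (\nabla g)\,y + (\nabla h)(w - z). \tag{7.25} \]
Go to Step 9 if both $\mu \le \varepsilon_1$ and $\|\nabla L\| \le \varepsilon_2$ ($l_1$-norm) are satisfied. Otherwise, go to Step 8.

8. Adjustment
Go back to Step 2 if both $N_1 \ge -0.995$ and $N_2 \ge -0.995$. Otherwise, go back to Step 2 after having adjusted all the variables as follows:
\[ s = s - 0.005\Delta s, \quad x = x - 0.005\Delta x, \quad w = w - 0.005\Delta w, \]
\[ z = z - 0.005\Delta z, \quad r = r - 0.005\Delta r, \quad y = y - 0.005\Delta y. \]

9. Stop with solution: ($x$, $y$, $s$, $r$, $w$, and $z$).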
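The step-size rule of Step 5 and the test that triggers the adjustment of Step 8 can be sketched as below. The function names are hypothetical; the first set of numbers is taken from the first iteration of Example 2 in Section VI, and the value $N = -2$ is a made-up case in which the step must be shortened.

```python
from typing import Sequence


def min_ratio(deltas: Sequence[float], values: Sequence[float]) -> float:
    """Smallest ratio delta_i / value_i over the given (positive) variables."""
    return min(d / v for d, v in zip(deltas, values))


def step_factor(N: float) -> float:
    """Step 5: beta = 1 if N >= -1, otherwise beta = -1 / N."""
    return 1.0 if N >= -1.0 else -1.0 / N


def needs_adjustment(N1: float, N2: float, limit: float = -0.995) -> bool:
    """Step 8: the 0.005 back-off applies unless both ratios are >= limit."""
    return N1 < limit or N2 < limit


# First iteration of Example 2: delta_s = -0.2143, delta_r = 0.2143, s = r = 0.5.
N1 = min_ratio([-0.2143, 0.2143], [0.5, 0.5])
beta1 = step_factor(N1)          # full step, since N1 > -1

# Made-up case: a proposed change of twice the current value is halved.
beta_short = step_factor(-2.0)
```

With $\beta = -1/N$ the most restrictive variable is driven exactly to zero, which is why Step 8 then backs all variables off by a fraction 0.005 of the step.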
VI. ILLUSTRATIVE EXAMPLES

Example 1

Minimize
\[ f = (x_1 - 2)^2 + 4 \]
Subject to:
\[ x_1^2 + x_2^2 - 1 \le 0, \qquad x_1, x_2 \ge 0. \]
Obtain the solution using the Kuhn-Tucker condition.

Solution

We can change the constraint to be two-sided, as all the values of $x_1$ and $x_2$ should be greater than or equal to zero. The constraint will be in the form $0 \le x_1^2 + x_2^2 \le 1$.

1. Form the Lagrange function
\[ L(x, \lambda) = (x_1 - 2)^2 + 4 + \lambda(x_1^2 + x_2^2). \]

2. Search for the optimum candidates $L_x = 0$:
\[ L_{x_1} = 2(x_1 - 2) + 2\lambda x_1 = 0, \]
\[ L_{x_2} = 2\lambda x_2 = 0. \]
Solving the second equation, $L_{x_2} = 0$, gives two possibilities ($x_2 = 0$ or $\lambda = 0$). Using also the Kuhn-Tucker condition, the cases are as follows.

a. At $\lambda = 0$, solving $L_{x_1} = 0$ results in $x_1 = 2$, which violates the main constraint $0 \le x_1^2 + x_2^2 \le 1$. Then there is no solution at $\lambda = 0$.

b. $\lambda \ge 0$, $x_2 = 0$: based on the Kuhn-Tucker condition, $x_1^2 + x_2^2 = 1$, which means that $x_1 = \pm 1$:
\[ f(-1, 0) = (-1 - 2)^2 + 4 = 13, \quad \lambda = -3 \text{ (out of range)}, \]
\[ f(1, 0) = (1 - 2)^2 + 4 = 5, \quad \lambda = 1 \text{ (in range)}. \]

c. $\lambda \le 0$, $x_2 = 0$: based on the Kuhn-Tucker condition, $x_1^2 + x_2^2 = 0$, which means that $x_1 = 0$:
\[ f(0, 0) = (0 - 2)^2 + 4 = 8. \]

Then $f_{\min} = f(1, 0) = 5$.

Example 2
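Use the barrier algorithm to minimize the function
\[ f(x) = x_1 x_2 \]
Subject to:
\[ 1 \le x_1 - x_2 \le 2. \]

Solution

With the number of variables $n = 2$ and the number of constraints $m = 1$,
\[ D = 2, \qquad h(x) = x_1 - x_2, \qquad C = 1. \]
Use the form
\[ f(x) = \tfrac{1}{2} x^T Q x + a^T x = x_1 x_2. \]
Then
\[ Q = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix} \quad \text{and} \quad a = 0. \]
The constraint $1 \le x_1 - x_2 \le 2$ is written in the form $h_k(x) = \tfrac{1}{2} x^T H_k x + J_k^T x$ with $h(x) = x_1 - x_2$, so that $k = 1$, $H_1 = 0$, and $J_1 = [1, -1]^T$. The gradient of the objective is
\[ \nabla f = Qx + a = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} + \begin{bmatrix} 0 \\ 0 \end{bmatrix} = \begin{bmatrix} x_2 \\ x_1 \end{bmatrix}. \]

First Iteration

1. Initialization
\[ s = r = \tfrac{1}{2}(D - C) = \tfrac{1}{2}(2 - 1) = 0.5. \]
Set $x_1 = x_2 = 0$ to satisfy $\nabla f = 0$, and
\[ w = z = [1 + \|\nabla f\|]\,e_m, \]
where $e_m$ is an $m$-vector with all elements equal to 1, so that
\[ w = z = (1 + 0)\cdot 1 = 1. \]
Take $\rho = 0.7$ and small tolerances $\varepsilon_1 = \varepsilon_2$.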
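2. Computation

a. $\mu = \frac{1}{2m}(w^T s + z^T r) = \tfrac{1}{2}(0.5 + 0.5) = 0.5$ and $u = v = \rho\mu e_m = 0.7 \times 0.5 = 0.35$.

b. $h = x_1 - x_2 = 0 - 0 = 0$; then
\[ d_1 = D - h - s = 2 - 0 - 0.5 = 1.5, \qquad d_2 = C - h + r = 1 - 0 + 0.5 = 1.5. \]

c.
\[ \nabla L = \begin{bmatrix} x_2 + w - z \\ x_1 - w + z \end{bmatrix} = \begin{bmatrix} 0 + 1 - 1 \\ 0 - 1 + 1 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}. \]

3. Computation

a.
\[ A = H + \nabla h\,(s^{-1}w + r^{-1}z)\,\nabla h^T = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix} + (2 + 2)\begin{bmatrix} 1 \\ -1 \end{bmatrix}\begin{bmatrix} 1 & -1 \end{bmatrix} = \begin{bmatrix} 4 & -3 \\ -3 & 4 \end{bmatrix}. \]

b.
\[ b = \nabla L + \nabla h\,\bigl(s^{-1}(u - w d_1) - r^{-1}(v + z d_2) - (w - z)\bigr) = \begin{bmatrix} -6 \\ 6 \end{bmatrix}. \]

4. Increments
\[ \Delta x = -A^{-1} b = \begin{bmatrix} 0.8571 \\ -0.8571 \end{bmatrix} \]
\[ \Delta s = d_1 - \nabla h^T \Delta x = 1.5 - 0.8571\begin{bmatrix} 1 & -1 \end{bmatrix}\begin{bmatrix} 1 \\ -1 \end{bmatrix} = -0.2143 \]
\[ \Delta r = \nabla h^T \Delta x - d_2 = 0.2143 \]
\[ \Delta w = s^{-1}(u - w\Delta s) - w = 0.1286 \]
\[ \Delta z = r^{-1}(v - z\Delta r) - z = -0.7286. \]

5. Size of Increment
\[ N_1 = \min\left(\frac{\Delta s}{s}, \frac{\Delta r}{r}\right) = \min\left(\frac{-0.2143}{0.5}, \frac{0.2143}{0.5}\right) = -0.4286 \]
\[ N_2 = \min\left(\frac{\Delta w}{w}, \frac{\Delta z}{z}\right) = \min\left(\frac{0.1286}{1}, \frac{-0.7286}{1}\right) = -0.7286 \]
\[ N_1 > -1 \Rightarrow \beta_1 = 1, \qquad N_2 > -1 \Rightarrow \beta_2 = 1. \]
Then there is no change in the previously calculated increments.

6. Update
\[ s = s + \Delta s = 0.5 - 0.2143 = 0.2857 \]
\[ x = x + \Delta x = \begin{bmatrix} 0.8571 \\ -0.8571 \end{bmatrix} \]
\[ r = r + \Delta r = 0.5 + 0.2143 = 0.7143 \]
\[ w = w + \Delta w = 1 + 0.1286 = 1.1286 \]
\[ z = z + \Delta z = 1 - 0.7286 = 0.2714. \]

7. Test of Termination
\[ \mu = \frac{1}{2m}(w^T s + z^T r) = \frac{1}{2 \times 1}(1.1286 \times 0.2857 + 0.2714 \times 0.7143) = 0.2582 > \varepsilon_1 \]
\[ \nabla L = \begin{bmatrix} x_2 + w - z \\ x_1 - w + z \end{bmatrix} = \begin{bmatrix} -0.85714 + 1.1286 - 0.2714 \\ 0.85714 - 1.1286 + 0.2714 \end{bmatrix} = -0.331 \times 10^{-4}\begin{bmatrix} -1 \\ 1 \end{bmatrix}. \]

8. Adjustment
$N_1 = -0.4286 > -0.995$ and $N_2 = -0.7286 > -0.995$.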
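Then go to Step 2.

Second Iteration

2. Computation

a. $\mu = \frac{1}{2m}(w^T s + z^T r) = 0.2582$ and $u = v = \rho\mu e_m = 0.1807$.

b. $h = x_1 - x_2 = 0.85714 + 0.85714 = 1.71428$. Then
\[ d_1 = D - h - s = 1.0\mathrm{e}{-04}, \qquad d_2 = C - h + r = 1 - 1.7142 + 0.7143 = 1.0\mathrm{e}{-04}. \]

c.
\[ \nabla L = \begin{bmatrix} x_2 + w - z \\ x_1 - w + z \end{bmatrix} = \begin{bmatrix} -0.85714 + 1.1286 - 0.2714 \\ 0.85714 - 1.1286 + 0.2714 \end{bmatrix} = -0.331 \times 10^{-4}\begin{bmatrix} -1 \\ 1 \end{bmatrix}. \]

3. Computation

a.
\[ A = H + \nabla h\,(s^{-1}w + r^{-1}z)\,\nabla h^T = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix} + \left(\frac{1.1286}{0.2857} + \frac{0.2714}{0.7143}\right)\begin{bmatrix} 1 \\ -1 \end{bmatrix}\begin{bmatrix} 1 & -1 \end{bmatrix} = \begin{bmatrix} 4.3302 & -3.3302 \\ -3.3302 & 4.3302 \end{bmatrix}. \]

b.
\[ b = \nabla L + \nabla h\,\bigl(s^{-1}(u - w d_1) - r^{-1}(v + z d_2) - (w - z)\bigr). \]

4. Increments
\[ \Delta x = -A^{-1} b = \begin{bmatrix} 0.0624 \\ -0.0624 \end{bmatrix} \]
\[ \Delta s = d_1 - \nabla h^T \Delta x = -0.1247 \]
\[ \Delta r = \nabla h^T \Delta x - d_2 = 0.1247 \]
\[ \Delta w = s^{-1}(u - w\Delta s) - w = -0.0035 \]
\[ \Delta z = r^{-1}(v - z\Delta r) - z = -0.0658. \]

5. Size of Increment
\[ N_1 = \min\left(\frac{\Delta s}{s}, \frac{\Delta r}{r}\right) = \min\left(\frac{-0.1247}{0.2857}, \frac{0.1247}{0.7143}\right) = -0.4365 \]
\[ N_2 = \min\left(\frac{\Delta w}{w}, \frac{\Delta z}{z}\right) = \min\left(\frac{-0.0035}{1.1286}, \frac{-0.0658}{0.2714}\right) = -0.2424 \]
\[ N_1 > -1 \Rightarrow \beta_1 = 1, \qquad N_2 > -1 \Rightarrow \beta_2 = 1. \]
Then there is no change in the previously calculated increments.

6. Update
\[ s = s + \Delta s = 0.2857 - 0.1247 = 0.161 \]
\[ x = x + \Delta x = \begin{bmatrix} 0.8571 \\ -0.8571 \end{bmatrix} + \begin{bmatrix} 0.0624 \\ -0.0624 \end{bmatrix} = \begin{bmatrix} 0.9195 \\ -0.9195 \end{bmatrix} \]
\[ r = r + \Delta r = 0.7143 + 0.1247 = 0.8390 \]
\[ w = w + \Delta w = 1.1286 - 0.0035 = 1.1251 \]
\[ z = z + \Delta z = 0.2714 - 0.0658 = 0.2056. \]

7. Test of Termination
\[ \mu = \frac{1}{2m}(w^T s + z^T r) = \frac{1}{2 \times 1}(1.1251 \times 0.161 + 0.2056 \times 0.8390) = 0.1768 > \varepsilon_1 \]
\[ \nabla L = \begin{bmatrix} x_2 + w - z \\ x_1 - w + z \end{bmatrix} = \begin{bmatrix} -0.9195 + 1.1251 - 0.2056 \\ 0.9195 - 1.1251 + 0.2056 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix} < \varepsilon_2. \]

8. Adjustment
$N_1 = -0.4365 > -0.995$ and $N_2 = -0.2424 > -0.995$.

Then go to Step 2. Repeating the previous steps, we get the solution that satisfies the tolerances $\varepsilon_1$ and $\varepsilon_2$ after 10 iterations. Table 7.1 tabulates the results of these iterations. The solution is $x_1 = 0.995$ and $x_2 = -0.995$.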
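The two hand iterations above can be reproduced with a short script. The sketch below is specialized to this example only (two variables, one two-sided linear constraint, no equality constraints, so $\Delta y$ does not appear), uses hypothetical names, and omits the Step 8 back-off, which is not triggered in these two iterations. It is an illustration of Steps 2 through 6 under those assumptions, not a general implementation.

```python
from typing import NamedTuple, Tuple


class State(NamedTuple):
    x: Tuple[float, float]
    s: float
    r: float
    w: float
    z: float


def step_factor(N: float) -> float:
    """Step 5: beta = 1 if N >= -1, otherwise -1 / N."""
    return 1.0 if N >= -1.0 else -1.0 / N


def barrier_iteration(st: State, rho: float = 0.7,
                      C: float = 1.0, D: float = 2.0) -> State:
    """One pass of Steps 2-6 for f = x1*x2 with C <= x1 - x2 <= D."""
    x1, x2 = st.x
    s, r, w, z = st.s, st.r, st.w, st.z
    # Step 2: penalty, residuals d1 and d2, and grad L = grad f + grad h (w - z),
    # with grad f = (x2, x1) and grad h = (1, -1).
    mu = (w * s + z * r) / 2.0  # equation 7.40 with m = 1
    u = v = rho * mu
    h = x1 - x2
    d1 = D - h - s
    d2 = C - h + r
    gL1 = x2 + (w - z)
    gL2 = x1 - (w - z)
    # Step 3: A = Q + c * grad_h grad_h^T with Q = [[0, 1], [1, 0]]; b = grad L + grad_h * t.
    c = w / s + z / r
    a11, a12, a22 = c, 1.0 - c, c
    t = (u - w * d1) / s - (v + z * d2) / r - (w - z)
    b1, b2 = gL1 + t, gL2 - t
    # Step 4: delta_x = -A^{-1} b (2x2 inverse written out), then the other increments.
    det = a11 * a22 - a12 * a12
    dx1 = -(a22 * b1 - a12 * b2) / det
    dx2 = -(a11 * b2 - a12 * b1) / det
    dh = dx1 - dx2              # grad_h^T delta_x
    ds = d1 - dh
    dr = dh - d2
    dw = (u - w * ds) / s - w
    dz = (v - z * dr) / r - z
    # Step 5: step sizes that keep slacks and multipliers nonnegative.
    beta1 = step_factor(min(ds / s, dr / r))
    beta2 = step_factor(min(dw / w, dz / z))
    # Step 6: update.
    return State((x1 + beta1 * dx1, x2 + beta1 * dx2),
                 s + beta1 * ds, r + beta1 * dr,
                 w + beta2 * dw, z + beta2 * dz)


start = State(x=(0.0, 0.0), s=0.5, r=0.5, w=1.0, z=1.0)
first = barrier_iteration(start)
second = barrier_iteration(first)
```

To four decimals, `first` and `second` agree with the values obtained in Step 6 of the first and second iterations. Because $f$ is quadratic and $h$ is linear, the gradient $\nabla L$ after a full step is zero up to rounding.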
VII. CONCLUSION

This chapter dealt with optimization techniques that fit most nonlinear engineering applications. Nonlinear programming aims at solving optimization problems involving nonlinear objective and constraint functions. In Sections I and II a classification of nonlinear programming problems was presented. The classification included quadratic, convex, and separable programming. The sensitivity method for solving nonlinear programming variables was presented in Section III. Also, a practical procedure for solving the problem was demonstrated, with an alternative expression for the extended K-T conditions to provide feasibility and optimality. In Section IV an algorithm for solving the quadratic optimization problem was presented in the form of sequential steps. A technique based on the barrier method for solving nonlinear programming problems was presented in Section V, where the recursive process was developed.

VIII. PROBLEM SET
Problem 7.1 Solve the following as a separable convex programming problem.
Minimize
\[ Z = (x_1 - 2)^2 + 4(x_2 - 6)^2 \]
Subject to:
Problem 7.2 Consider the problem
Maximize
\[ Z = 6x_1 + 3x_2 - 4x_1x_2 - 2x_1^2 - 3x_2^2 \]
Subject to:
\[ x_1 + x_2 \le 1 \]
Show that $Z$ is strictly concave and then solve the problem using the quadratic programming algorithm.

Problem 7.3 Consider the problem
Minimize
\[ Z = x_1^2 + x_2^2 \]
Subject to:
\[ 2x_1 + x_2 \le 2 \]
\[ -x_1 + 1 \le 0. \]
1. Find the optimal solution to this problem.
2. Formulate a suitable function with initial penalty parameter $\mu = 1$.
3. Starting from the point (2, 6), solve the resulting problem by a suitable unconstrained optimization technique.
4. Replace the penalty parameter $\mu$ by 10. Starting from the point obtained in 3, solve the resulting problem.

Problem 7.4 Minimize
\[ f = (x_1 + 1)(x_2 - 2) \]
over the region $0 \le x_1 \le 2$, $0 \le x_2 \le 1$ by writing the Kuhn-Tucker conditions and obtaining the saddle point.
Problem 7.5 Minimize
\[ Z = 2x_1 - x_1^2 + x_2 \]
Subject to:
\[ 2x_1 + 3x_2 \le 6 \]
\[ 2x_1 + x_2 \le 4 \]
\[ x_1, x_2 \ge 0. \]
Problem 7.6 Minimize
\[ Z = (x_1 - 6)^2 + (x_2 - 8)^2 \]
Subject to:
\[ x_1^2 - x_2 \le 0. \]
Using the auxiliary function $(x_1 - 6)^2 + (x_2 - 8)^2 + \mu \max\{x_1^2 - x_2, 0\}$, and adopting the cyclic coordinate method, solve the above problem starting from $\mathbf{x}_1 = (0, -4)^t$ under the following strategies for modifying $\mu$.
1. Starting from $\mathbf{x}_1$, solve the penalty problem for $\mu_1 = 0.1$ resulting in $\mathbf{x}_2$. Then start from $\mathbf{x}_2$, and solve the problem with $\mu_2 = 100$.
2. Starting from the unconstrained optimal point (6, 8), solve the penalty problem for $\mu_2 = 100$. (This is the limiting case of Part 1 for $\mu_1 = 0$.)
3. Starting from $\mathbf{x}_1$, solve the penalty problem for $\mu_1 = 100.0$.
4. Which of the above strategies would you recommend, and why? Also, in each of the above cases, derive an estimate for the Lagrangian multiplier associated with the single constraint.
Problem 7.7 Maximize
\[ f(x) = x_1^2 - 2x_2^2 \]
Subject to:
\[ x_1 - x_2 = 2 \]
\[ x_2 \le x_1^2 \le x_2 + 8. \]
Problem 7.8 Maximize
\[ f(x) = 3x_1x_2 - x_1^2 - x_2^2 \]
Subject to: $x_1 \le 0$ and $0 \le x_2 \le 1$.
Problem 7.9 Consider the following problem.
Maximize
\[ z = -x_1 - x_2 \]
Subject to:
\[ x_1 + x_2 \le 8 \]
\[ x_2 \ge 3 \]
\[ -x_1 + x_2 \le 2 \]
\[ x_j \ge 0. \]
1. Solve this problem graphically.
2. Use the dual simplex method to solve this problem.
3. Trace graphically the path taken by the dual simplex method.
4. Solve this problem using the interior point algorithm.
Chapter 8 Dynamic Programming
I. INTRODUCTION
Dynamic programming (DP) is an optimization approach that transforms a complex problem into a sequence of simpler problems; its essential characteristic is the multistage nature of the optimization procedure. More so than the optimization techniques described previously, dynamic programming provides a general framework for analyzing many problem types. Within this framework a variety of optimization techniques can be employed to solve particular aspects of a more general formulation. Usually creativity is required before we can recognize that a particular problem can be cast effectively as a dynamic program, and often subtle insights are necessary to restructure the formulation so that it can be solved effectively.
The dynamic programming method was developed in the 1950s through the work of Richard Bellman, who is still the doyen of research workers in this field. The essential feature of the method is that a multivariable optimization problem is decomposed into a series of stages, optimization being done at each stage with respect to one variable only. Bellman (1) gave it the rather undescriptive name of dynamic programming. A more significant name would be recursive optimization. Both discrete and continuous problems can be amenable to this method, and deterministic as well as stochastic models can be handled by it. The complexities increase tremendously with the number of constraints. A single-constraint problem is relatively simple, but even more than two constraints can be formidable.
The dynamic programming technique, when applicable, represents or decomposes a multistage decision problem as a sequence of single-stage
decision problems. Thus an N-variable problem is represented as a sequence of N single-variable problems that are solved successively. In most of the cases, these N subproblems are easier to solve than the original problem. The decomposition to N subproblems is done in such a manner that the optimal solution of the original N-variable problem can be obtained from the optimal solutions of the N one-dimensional problems. It is important to note that the particular optimization technique used for the optimization of the N single-variable problems is irrelevant. It may range from a simple enumeration process to a calculus or a nonlinear programming technique.
Multistage decision problems can also be solved by the direct application of classical optimization techniques. However, this requires the number of variables to be small, the functions involved to be continuous and continuously differentiable, and the optimum points not to lie at the boundary points. Furthermore, the problem has to be relatively simple so that the set of resultant equations can be solved either analytically or numerically. The nonlinear programming techniques can be used to solve slightly more complicated multistage decision problems. But their application requires the variables to be continuous and prior knowledge about the region of the global minimum or maximum. In all these cases, the introduction of stochastic variability makes the problem extremely complex, and unsolvable except by using some sort of approximation such as chance-constrained optimization.
Dynamic programming, on the other hand, can deal with discrete variables, and nonconvex, noncontinuous, and nondifferentiable functions. In general, it can also take into account the stochastic variability by a simple modification of the deterministic procedure. The dynamic programming technique suffers from a major drawback, known as the curse of dimensionality. However, in spite of this disadvantage, it is very suitable for the solution of a wide range of complex problems in several areas of decision making.
II. FORMULATION OF A MULTISTAGE DECISION PROCESS
A. Representation of a Multistage Decision Process
Any decision process is characterized by certain input parameters, X (or data), certain decision variables (U), and certain output parameters (T) representing the outcome obtained as a result of making the decision. The input parameters are called input state variables, and the output parameters are called output state variables. Finally there is a return or objective function F, which measures the effectiveness of the decisions for any physical system represented as a single-stage decision process (shown in Figure 8.1). The output of this single stage is shown in equations 8.1 and 8.2.
where $u_i$ denotes the vector of decision variables at stage i. The objective of a multistage decision process is to find $u_1, u_2, \ldots, u_n$ so as to optimize some function of the individual stage returns, say, $F(f_1, f_2, \ldots, f_n)$, and satisfy equations 8.1 and 8.2. The nature of the n-stage return function f determines whether a given multistage problem can be solved by a decomposition technique; it requires the separability and monotonicity of the objective function. In order to have separability of the objective function, we must be able to represent the objective function as the composition of the individual stage returns. This requirement is satisfied for additive objective functions:
\[ F = \sum_{i=1}^{n} f_i(u_i, x_{i+1}), \]
where $u_i$ are real, and for multiplicative objective functions:
\[ F = \prod_{i=1}^{n} f_i(u_i, x_{i+1}). \]
FIGURE 8.1 Single-stage decision problem (input x, decision u, output T = t(u, x), objective F = f(u, x)).
B. Types of Multistage Decision Problems
The serial multistage decision problems can be classified into the following categories.
1. Initial Value Problem
If the value of the initial state variable $x_{n+1}$ is prescribed, the problem is called an initial value problem; it is shown in Figure 8.2.
2. Final Value Problem
If the value of the final state variable $x_1$ is prescribed, the problem can be transformed into an initial value problem by reversing the directions of $u_i$, $i = 1, 2, \ldots, n + 1$, which is shown in Figure 8.3.
3. Boundary Value Problem
If the values of both the input and output variables are specified, the problem is called a boundary value problem.
III. CHARACTERISTICS OF DYNAMIC PROGRAMMING
Dynamic programming has the following characteristics.
1. The problem is divisible into stages, and a policy decision is required at each stage.
2. Each stage has a number of associated states.
3. The policy decision at each stage transforms the current state into a state associated with the next stage.
4. The solution procedure is designed to find the optimum policy for the overall problem.
5. The solution prescribes the optimum policy for the remaining stages, for each possible state.
6. Given the current state, an optimal policy for the remaining stages is independent of the policy adopted in the previous stages (principle of optimality in dynamic programming).
7. By using the recursive relationship, the solution procedure moves backward stage by stage, each time finding the optimum policy for that stage until it finds the optimum policy starting at the initial stage.
FIGURE 8.2 Multistage initial value problem.
FIGURE 8.3 Multistage final value problem.
IV. CONCEPT OF SUBOPTIMIZATION AND THE PRINCIPLE OF OPTIMALITY
A dynamic programming problem can be stated as follows. Find the values of $u_1, u_2, \ldots, u_N$ which optimize $F = \sum_{i=1}^{N} f_i(u_i, x_{i+1})$ and satisfy the design equations $x_i = t_i(x_{i+1}, u_i)$. Dynamic programming makes use of the concept of suboptimization and Bellman's Principle of Optimality in solving the problem. Bellman stated the following as part of his Principle of Optimality.
An optimal policy (or a set of decisions) has the property that whatever the initial state and initial decisions are, the remaining decisions must constitute an optimal policy with regard to the state resulting from the first decision.
In developing a recurrence relationship, suppose that the desired objective is to minimize the N-stage objective function f, which is given by the sum of the individual stage returns:
\[ f = f_N(u_N, x_{N+1}) + f_{N-1}(u_{N-1}, x_N) + \cdots + f_1(u_1, x_2) \]
and satisfying the design equations $x_i = t_i(x_{i+1}, u_i)$. Figure 8.4 shows how to apply this to dynamic programming. Now consider the first subproblem by starting at the final stage, i = 1. If the input to this stage $x_2$ is specified, then according to the principle of optimality, $u_1$ must be selected to optimize $f_1$. Irrespective of what happens to other stages, $u_1$ must be selected such that $f_1(u_1, x_2)$ is an optimum for the input $x_2$. If the optimum is denoted $f_1^*$, we have
\[ f_1^*(x_2) = \underset{u_1}{\operatorname{opt}}\, [f_1(u_1, x_2)]. \tag{8.6} \]
This is called a one-stage policy since once the input state $x_2$ is specified, the optimal values of $f_1$, $u_1$, and $x_1$ are completely defined as shown in equation 8.6.
FIGURE 8.4 Illustration of the principle of optimality (original problem; principle of optimality applied to the last component; principle of optimality applied to the last 2 stages; entire system is optimal).
Next, consider the second subproblem by grouping the last two stages together. If $f_2^*$ denotes the optimum objective value of the second subproblem for a specified value of the input $x_3$, we have
\[ f_2^*(x_3) = \underset{u_1, u_2}{\operatorname{opt}}\, [f_2(u_2, x_3) + f_1(u_1, x_2)]. \tag{8.7} \]
The principle of optimality requires that $u_1$ be selected so as to optimize $f_1$ for a given $x_2$. Since $x_2$ can be obtained once $u_2$ and $x_3$ are specified, equation 8.7 can be written as
\[ f_2^*(x_3) = \underset{u_2}{\operatorname{opt}}\, \Big[ f_2(u_2, x_3) + \underset{u_1}{\operatorname{opt}}\, f_1(u_1, x_2) \Big]. \tag{8.8} \]
Thus $f_2^*$ represents the optimal policy for the two-stage problem. Therefore, rewriting equation 8.8 yields:
\[ f_2^*(x_3) = \underset{u_2}{\operatorname{opt}}\, [f_2(u_2, x_3) + f_1^*(x_2)]. \tag{8.9} \]
In this form, it can be seen that for a specified input $x_3$, the optimum is determined solely by a suitable choice of the decision variable $u_2$. Thus the optimization problem stated in equation 8.7, in which both $u_2$ and $u_1$ are to be simultaneously varied to produce the optimum $f_2^*$, is reduced to subproblems each involving only a single decision variable, and optimization is, in general, much simpler. This idea can be generalized and the ith subproblem defined by
\[ f_i^*(x_{i+1}) = \underset{u_i, u_{i-1}, \ldots, u_1}{\operatorname{opt}}\, [f_i(u_i, x_{i+1}) + \cdots + f_1(u_1, x_2)], \tag{8.10} \]
which can be written as
\[ f_i^*(x_{i+1}) = \underset{u_i}{\operatorname{opt}}\, [f_i(u_i, x_{i+1}) + f_{i-1}^*(x_i)], \tag{8.11} \]
where $f_{i-1}^*$ denotes the optimal value of the objective function corresponding to the last i - 1 stages, and $x_i$ is the input to the stage i - 1. By using the principle of optimality, this problem has been decomposed into i separate problems, each involving only one decision variable.
V. FORMULATION OF DYNAMIC PROGRAMMING
Dynamic programming is a transformation from a multiple vector decision process to a series of single vector decision processes. It converts a problem that contains many decision vectors into a sequence of problems, each of which contains only one vector to decide at a time. The single vector is usually reduced to contain the minimal number of components for ease of optimization. Dynamic programming not only reduces the complexity of problems, but also provides a form that can be solved effectively and iteratively on digital computers.
Consider an optimization problem that consists of n decision (control) vectors $v_k$ for $k = 1, 2, \ldots, n$ and a scalar objective function $P_n(y_n, v_n, v_{n-1}, \ldots, v_1)$ defined at stage n. The vectors $v_k$ have $r_k$ components, which may change with the running index k. The decision vectors $v_k$ are constrained in a set S, which is usually described by equations and/or inequalities, and the optimization is over the decision vectors in S for a given state vector $y_n$.
Now, let $P_k(y_k, v_k, \ldots, v_1)$ be the objective function at stage k where $k = 1, 2, \ldots, n$, and let
\[ p_k(y_k) = \operatorname{Optimize}\,[P_k(y_k, v_k, v_{k-1}, \ldots, v_1)] = P_k(y_k, u_k, u_{k-1}, \ldots, u_1). \tag{8.12} \]
Equation 8.12 is the optimized objective function at stage k. The optimization is carried out with respect to $v_1, v_2, \ldots, v_k$ in S for a given $y_k$. The optimal decision or control vectors $u_i$, where $i = 1, 2, \ldots, k$, are known vectors and hence the optimized objective function is a function of $y_k$ only as
indicated by the left side of equation 8.12. As such, the problem is equivalent to determining equation 8.12 for k = n. The state vectors $y_k$ have $n_k$ components, which may change with k. By properly choosing the integer $r_k$, we can always formulate the objective function as follows.
\[ P_k(y_k, v_k, v_{k-1}, \ldots, v_1) = f_k(y_k, v_k) \circ P_{k-1}(y_{k-1}, v_{k-1}, \ldots, v_1) \tag{8.13} \]
for all $k = 2, 3, 4, \ldots, n$ with $P_1(y_1, v_1) = f_1(y_1, v_1)$ for k = 1. The symbol "$\circ$" is called an operator, which may be addition, multiplication, or comparison, and it may alter from stage to stage. The scalar function $f_k(y_k, v_k)$ is to be chosen in a manner such that it constitutes a part of the objective function. The state vector is specified by the relation
\[ y_{k-1} = T_k(y_k, v_k). \tag{8.14} \]
Both $f_k$ and $T_k$ denote a transformation or mapping that may be described in different forms. There are some known types of constraints that can be replaced by equation 8.14. For example:
\[ g_1(v_1) + g_2(v_2) + \cdots + g_n(v_n) \ge C \quad (\text{or} \le C) \tag{8.15} \]
\[ h_1(v_1)\, h_2(v_2) \cdots h_n(v_n) \ge C \quad (\text{or} \le C) \tag{8.16} \]
where the g's and h's are scalar functions of the decision vectors. From equation 8.15, we choose the state equation 8.14 to be
\[ y_{k-1} = y_k - g_k(v_k) \quad \text{for all } k = 1, 2, \ldots, n. \]
Summation of these equations gives
\[ y_n - y_0 = \sum_{k=1}^{n} g_k(v_k) \ge C. \]
Hence, $y_n \ge C$ can be concluded if $y_0 = 0$ is assumed. Similarly, from equation 8.16, we select the state equation to be
\[ y_{k-1} = y_k / h_k(v_k), \]
where $h_k(v_k) \ne 0$ and $k = 1, 2, \ldots, n$. Multiplication of these equations gives $h_1(v_1)\, h_2(v_2) \cdots h_n(v_n) = y_n / y_0 \ge C$ and hence $y_n \ge C$ follows if $y_0 = 1$ is chosen. Thus, we may use dynamic programming to get $p_n(y_n)$ and then make a further optimization over $y_n \ge C$ (or $\le C$).
We call an optimization problem decomposable if it can be solved by recursive optimization through N stages, at each stage optimization being done over one decision variable. We first define monotonicity of a function, which is used subsequently.
Definition. The function f(x, y) is said to be a monotonic nondecreasing function of y for all feasible values of x if $y_1 < y_2 \Rightarrow f(x, y_1) \le f(x, y_2)$ for every feasible value of x. Conversely, the function is said to be monotonic nonincreasing if $y_1 < y_2 \Rightarrow f(x, y_1) \ge f(x, y_2)$ for every feasible x.
Theorem 1. In a serial two-stage minimization or maximization problem, if
(a) the objective function $\phi_2$ is a separable function of stage returns $f_1(X_1, U_1)$ and $f_2(X_2, U_2)$, and
(b) $\phi_2$ is a monotonic nondecreasing function of $f_1$ for every feasible value of $f_2$,
then the problem is said to be decomposable.
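Proof of Theorem 1
Putting N = 2 in the problem, the objective function $\phi_2(f_2, f_1)$ is separable if $\phi_2 = f_2 \circ f_1$. Assuming that this condition holds true, then suppose that $\phi_2$ is a monotonic nondecreasing function of $f_1$ for feasible values of $f_2$. We prove the theorem for the minimum case. For the maximum case the proof is on identical lines. Following the notation introduced before, we note that the following expressions are equivalent.
\[ F_2(X_2) = \min_{U_2, U_1} \phi_2(f_2, f_1) \tag{8.17a} \]
\[ \phantom{F_2(X_2)} = \min_{U_2, U_1} [f_2(X_2, U_2) \circ f_1(X_1, U_1)] \tag{8.17b} \]
\[ \phantom{F_2(X_2)} = \min_{U_2, U_1} [f_2(X_2, U_2) \circ f_1(t_2(X_2, U_2), U_1)]. \tag{8.17c} \]
The last form is possible because of the transformation relation
\[ X_1 = t_2(X_2, U_2). \tag{8.18} \]
Also
\[ F_1(X_1) = \min_{U_1} f_1(X_1, U_1) = f_1(X_1, U_1^*). \tag{8.19} \]
Let
\[ F_2'(X_2) = \min_{U_2} \Big[ f_2(X_2, U_2) \circ \min_{U_1} f_1(X_1, U_1) \Big]. \tag{8.20} \]
Comparing equations 8.17 and 8.20, we get
\[ F_2'(X_2) \ge F_2(X_2). \tag{8.21} \]
Let
\[ f_1(X_1, U_1^*) = \min_{U_1} f_1(X_1, U_1). \tag{8.22} \]
Then
\[ f_1(X_1, U_1) \ge f_1(X_1, U_1^*). \tag{8.23} \]
Since $\phi_2$ is a monotonic nondecreasing function of $f_1$, the above inequality implies
\[ f_2(X_2, U_2) \circ f_1(X_1, U_1) \ge f_2(X_2, U_2) \circ f_1(X_1, U_1^*). \tag{8.24} \]
Now
\[ F_2(X_2) = \min_{U_2, U_1} [f_2 \circ f_1(X_1, U_1)] \ge \min_{U_2} [f_2 \circ f_1(X_1, U_1^*)] = F_2'(X_2). \tag{8.25} \]
And from equations 8.21 and 8.25, we get $F_2(X_2) = F_2'(X_2)$ or
\[ F_2(X_2) = \min_{U_2} [f_2(X_2, U_2) \circ F_1(X_1)]. \]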
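The following theorem is an extension of the above to an N-stage optimization problem. Treating stages N - 1 to 1 as a single stage, Theorem 2 is a direct consequence of Theorem 1 and needs no further proof.
Theorem 2. If the real-valued return function $\phi_N(f_N, f_{N-1}, \ldots, f_1)$ satisfies
(a) the condition of separability, that is,
\[ \phi_N(f_N, f_{N-1}, \ldots, f_1) = f_N \circ \phi_{N-1}(f_{N-1}, \ldots, f_1), \]
where $\phi_{N-1}(f_{N-1}, \ldots, f_1)$ is real-valued, and
(b) $\phi_N$ is a monotonic nondecreasing function of $\phi_{N-1}$ for every $f_N$;
then $\phi_N$ is decomposable: that is,
\[ \underset{U_N, \ldots, U_1}{\operatorname{opt}}\, \phi_N(f_N, \ldots, f_1) = \underset{U_N}{\operatorname{opt}}\, \Big[ f_N \circ \underset{U_{N-1}, \ldots, U_1}{\operatorname{opt}}\, \phi_{N-1}(f_{N-1}, \ldots, f_1) \Big]. \]
The above theorems indicate that monotonicity is a sufficient condition for decomposability. The converse has not been proved. In fact the monotonicity condition is not necessary.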
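The role of monotonicity can be checked numerically on a small two-stage problem. The following Python sketch is illustrative only: the stage returns `f1` and `f2`, the transformation `t2`, and the decision grid are hypothetical choices that do not come from the text. It compares the joint minimization over both decisions with the decomposed (nested) form. With addition the two agree; with multiplication and a stage return that can be negative, the product is no longer nondecreasing in $f_1$, and the decomposed form gives a different value.

```python
from itertools import product

DECISIONS = [-2, -1, 0, 1, 2]   # hypothetical decision grid shared by U1 and U2

def t2(x2, u2):
    """Hypothetical stage transformation X1 = t2(X2, U2)."""
    return x2 - u2

def f1(x1, u1):
    """Hypothetical stage-1 return (always nonnegative)."""
    return (x1 - u1) ** 2

def f2(x2, u2):
    """Hypothetical stage-2 return (can be negative)."""
    return u2

def joint_min(x2, combine):
    """Minimize f2 o f1 over both decisions at once."""
    return min(combine(f2(x2, u2), f1(t2(x2, u2), u1))
               for u2, u1 in product(DECISIONS, repeat=2))

def nested_min(x2, combine):
    """Decomposed form: minimize stage 1 first, then stage 2."""
    def F1(x1):
        return min(f1(x1, u1) for u1 in DECISIONS)
    return min(combine(f2(x2, u2), F1(t2(x2, u2))) for u2 in DECISIONS)

def add(a, b):
    return a + b

def mul(a, b):
    return a * b

add_joint, add_nested = joint_min(3, add), nested_min(3, add)
mul_joint, mul_nested = joint_min(3, mul), nested_min(3, mul)
print(add_joint, add_nested)   # the additive case decomposes
print(mul_joint, mul_nested)   # the multiplicative case with negative f2 does not
```

Restricting `f2` to nonnegative values (Case b in the discussion that follows) would restore agreement in the multiplicative case.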
Often, it is tedious in applications to judge whether an objective function is monotonic. Operators that belong to one of the following cases are monotonic and hence the decomposition theorem is applicable.
Case a. All the operators are additions.
Inequality (equation 8.23) holds because
\[ p_{k-1}(y_{k-1}) \ge P_{k-1}(y_{k-1}, v_{k-1}, v_{k-2}, \ldots, v_1) \]
for maximization, and the inequality reverses for minimization by definition. Note that subtractions can be converted to additions by absorbing the negative signs.
Case b. All the operators are multiplications and $f_k(y_k, v_k) \ge 0$ for all $k = 1, 2, \ldots, n$.
The reason is the same as in Case a.
Case c. Combination of (a) and (b).
The reason is the same as Case a.
Case d. All the operators are comparison.
In such a case, maximization is defined through the max operator, and minimization is defined by replacing max by min. For maximization, we have
\[ \max[f_k(y_k, v_k),\, p_{k-1}(y_{k-1})] = \begin{cases} f_k(y_k, v_k) & \text{when the first is greater than or equal to the second,} \\ p_{k-1}(y_{k-1}) & \text{otherwise,} \end{cases} \]
\[ \max[f_k(y_k, v_k),\, P_{k-1}(y_{k-1}, v_{k-1}, \ldots, v_1)] = \begin{cases} f_k(y_k, v_k) & \text{when the first is greater than or equal to the second,} \\ P_{k-1}(y_{k-1}, v_{k-1}, \ldots, v_1) & \text{otherwise.} \end{cases} \]
It follows from $p_{k-1}(y_{k-1}) \ge P_{k-1}(y_{k-1}, v_{k-1}, \ldots, v_1)$ that equation 8.23 holds true. Similarly, equation 8.23 can be shown valid for minimization.
The task of dynamic programming is to carry out the iterative process generated by the decomposition theorem for a monotonic objective function. The advantage of using dynamic programming is that only one decision vector is involved in each iteration. As a rule of thumb, it is desirable to make the integer $r_k$ as small as possible. This requires more stages but fewer variables in each iteration of the optimization.
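As an illustration of Case d, the sketch below uses the comparison operator max between stages and minimizes the result, so the quantity carried from stage to stage is the largest stage return met so far. The staged network and its arc values are hypothetical, and the function name `minimax_dp` is introduced here only for illustration.

```python
def minimax_dp(stages, start=1):
    """Minimize the largest stage return along a staged sequence of moves.

    stages[k] maps (y_prev, y_next) -> stage return f for the k-th move.
    The recursion is best_next = min over arcs of max(f, best_prev), which is
    the comparison operator of Case d combined with minimization.
    """
    best = {start: float("-inf")}          # nothing has been traversed yet
    for arcs in stages:
        nxt = {}
        for (y_prev, y_next), f in arcs.items():
            if y_prev not in best:
                continue                   # state not reachable
            cand = max(f, best[y_prev])    # "comparison" operator
            if y_next not in nxt or cand < nxt[y_next]:
                nxt[y_next] = cand
        best = nxt
    return best

# Hypothetical four-stage network: one start state, three states in each
# intermediate stage, one end state.
stages = [
    {(1, 1): 6, (1, 2): 5, (1, 3): 4},
    {(1, 1): 4, (1, 2): 6, (2, 2): 4, (2, 3): 5, (3, 2): 8, (3, 3): 9},
    {(1, 1): 6, (2, 1): 6, (2, 2): 5, (3, 2): 4, (3, 3): 2},
    {(1, 1): 2, (2, 1): 8, (3, 1): 7},
]
result = minimax_dp(stages)
print(result)
```

Because max(f, P) never decreases when P increases, the monotonicity condition holds and only one decision is examined per stage.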
VI. BACKWARD AND FORWARD RECURSION
Throughout we have used a recursion procedure in which $x_j$ is regarded as the input and $x_{j-1}$ as the output for the jth stage, the stage returns are expressed as functions of the stage inputs, and recursive analysis proceeds from stage 1 to stage n. This procedure is called backward recursion because the stage transformation function is of the type $x_{j-1} = t_j(x_j, u_j)$. Backward recursion is convenient when the problem involves optimization with respect to a given input $x_n$, because then the output $x_0$ is very naturally left out of consideration. If, however, the problem is to optimize the system with respect to a given output $x_0$, it would be convenient to reverse the direction, regard $x_j$ as a function of $x_{j-1}$ and $u_j$, put $x_j = \bar{t}_j(x_{j-1}, u_j)$, $1 \le j \le n$, express stage returns as functions of stage outputs, and then proceed from stage n to stage 1. This procedure is called forward recursion.
In problems where both the input $x_n$ and the output $x_0$ are given parameters, it is immaterial whether one proceeds in one direction or the other. Both parameters are retained during analysis, and the optimal solution is a function of both. In fact, for most multistage problems there is no essential difference between these two procedures. In mathematical problems inputs and outputs are fictitious concepts and are interchangeable. One can visualize and solve the problem in any direction. It only involves a slight modification in notation.
However, dynamic programming is also applicable to nonserial multistage systems, which are important in automatic control systems and in certain process technologies. In such systems stages are not all connected in series but with branches and loops. It is in the application of such systems that the difference between forward and backward recursion procedures becomes not only significant but also crucial. For this reason we proceed to write the recursion formulae for the forward procedure explicitly. Assume that the return function $\phi_n(x_n, x_0, u_n, \ldots, u_1)$ is a function of the stage returns $f_j(x_j, x_{j-1}, u_j)$ in the form
\[ \phi_n = f_n \circ f_{n-1} \circ \cdots \circ f_2 \circ f_1 \]
and further assume that the stage transformation function is given by
\[ x_j = \bar{t}_j(x_{j-1}, u_j). \]
Then, by defining
\[ F_j(x_{j-1}) = \underset{u_n, \ldots, u_j}{\operatorname{opt}}\, (f_n \circ f_{n-1} \circ \cdots \circ f_j), \]
we postulate the forward recursion formulae as
\[ F_n(x_{n-1}) = \underset{u_n}{\operatorname{opt}}\, f_n(x_n, x_{n-1}, u_n), \qquad F_j(x_{j-1}) = \underset{u_j}{\operatorname{opt}}\, [F_{j+1}(x_j) \circ f_j(x_j, x_{j-1}, u_j)], \quad 1 \le j \le n - 1. \]
With this notation the optimum value of $\phi_n$ which we seek is denoted $F_1(x_0)$, which is obtained recursively through stages $j = n - 1, \ldots, 2, 1$.
A. The Minimum Path Problem
Consider the following dynamic programming problem where we must find the shortest path from vertex A to vertex B along arcs joining the various vertices lying between A and B in Figure 8.5. The lengths of the arcs are as shown. The vertices are divided into five stages that we denote by subscript j. For j = 0, there is only one vertex A; also for j = 4 the only vertex is B. For j = 1, 2, 3, there are three vertices in each stage. Each move consists of moving from stage j to stage j + 1, that is, from any one vertex in stage j to any one vertex in stage j + 1. We say that each move changes the state of the system that we denote $x_j$, and $x_0$ is the state in which node A lies. Notably, $x_0$ has only one value, say $x_0 = 1$. The state $x_2$ has three possible values, say 1, 2, 3, corresponding to three vertices in stage 2, and so on. We call the possible alternative paths from one stage to the next decision variables, and denote by $u_j$ the decision variable that takes us from the state $x_{j-1}$ to state $x_j$. The return or the gain from the decision $u_j$ we denote by $f_j(u_j)$, return obviously being the function of the decision. In this problem we can identify $u_j$ with the length of the corresponding arc, and so simplify matters by taking $f_j(u_j) = u_j$.
FIGURE 8.5 The minimum path problem.
We denote by $F_j(x_j)$ the minimum path from A to any vertex in state $x_j$. Thus, $F_2(1)$ denotes the minimum path from A to vertex 1 of stage 2. The problem is to determine the minimum path $F_4(x_4)$, and the values of the decision variables $u_1$, $u_2$, $u_3$, and $u_4$ which yield that path. Let us look at the problem in the following way. The value of $u_4$ can either be 2, 8, or 7. If $u_4 = 2$, then $x_3 = 1$; similarly $u_4 = 8$ gives $x_3 = 2$, and $u_4 = 7$ gives $x_3 = 3$. The minimum path from A to B is either through $x_3 = 1$, 2, or 3. With $x_3$ as 1, 2, or 3, the respective values of $u_4$ are 2, 8, or 7. Thus
\[ F_4(x_4) = \min \{ 2 + F_3(1),\; 8 + F_3(2),\; 7 + F_3(3) \} = \min_{u_4} (u_4 + F_3(x_3)). \]
Similarly we can argue that
\[ F_3(x_3) = \min_{u_3} (u_3 + F_2(x_2)). \]
Hence
\[ F_2(x_2) = \min_{u_2} (u_2 + F_1(x_1)). \]
Finally
\[ F_1(x_1) = u_1. \]
We have thus a general recursion formula
\[ F_j(x_j) = \min_{u_j} (u_j + F_{j-1}(x_{j-1})) \]
that enables us to determine $F_4(x_4)$ recursively. As we later show, the number of enumerations gone through in this way is substantially smaller than the total number of enumerations that would have to be gone through if all possible paths were examined. We now tabulate the information given in the problem and the enumerative steps necessary to find $F_4(x_4)$ with the help of the above recursion formula. We also simultaneously introduce the standard terminology of dynamic programming.
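The recursion $F_j(x_j) = \min_{u_j}(u_j + F_{j-1}(x_{j-1}))$ can be sketched in a few lines of Python. The arc lengths below are hypothetical: they are not the data of Figure 8.5, although the last-stage arcs 2, 8, and 7 follow the text. The function name and data layout are likewise assumptions made for illustration.

```python
def shortest_staged_path(stages, start=1, end=1):
    """Shortest path through a staged network by stagewise recursion.

    stages[j-1] maps (x_prev, x_next) -> arc length u_j for the move into
    stage j. Returns (length, states), where states lists x_0, ..., x_n.
    """
    F = {start: 0}      # F_0(x_0): vertex A, distance 0
    back = []           # back[j-1][x_j] = best predecessor x_{j-1}
    for arcs in stages:
        F_next, pred = {}, {}
        for (x_prev, x_next), u in arcs.items():
            if x_prev not in F:
                continue                      # x_prev not reachable
            cand = u + F[x_prev]              # u_j + F_{j-1}(x_{j-1})
            if x_next not in F_next or cand < F_next[x_next]:
                F_next[x_next], pred[x_next] = cand, x_prev
        F = F_next
        back.append(pred)
    # Trace the optimal states backwards from vertex B.
    x, states = end, [end]
    for pred in reversed(back):
        x = pred[x]
        states.append(x)
    return F[end], states[::-1]

# Hypothetical arc lengths; keys are (vertex in stage j-1, vertex in stage j).
stages = [
    {(1, 1): 6, (1, 2): 5, (1, 3): 4},
    {(1, 1): 4, (1, 2): 6, (2, 2): 4, (2, 3): 5, (3, 2): 8, (3, 3): 9},
    {(1, 1): 6, (2, 1): 6, (2, 2): 5, (3, 2): 4, (3, 3): 2},
    {(1, 1): 2, (2, 1): 8, (3, 1): 7},
]
length, states = shortest_staged_path(stages)
print(length, states)
```

Each stage only looks at the arcs entering it and the table of values from the previous stage, which is the source of the saving over enumerating every path from A to B.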
Tables 8.1 through 8.4 imply that a function of the type $x_{j-1} = t_j(x_j, u_j)$ exists. It follows from the data of the problem and we call it the stage transformation function. $x_{j-1}$ may not be defined for all combinations of $x_j$, $u_j$. A dash in the tables indicates that the transformation for that pair of values $x_j$, $u_j$ is not defined and therefore that transformation is not feasible.
TABLE 8.1 Stage 1 for minimum path problem.
TABLE 8.2 Stage 2 for minimum path problem.
TABLE 8.3 Stage 3 for minimum path problem.
TABLE 8.4 Stage 4 for minimum path problem.
In Tables 8.5 through 8.8 the recursive operations using the recursive formulae are indicated. The minimum path from A to B is thus found to be 17. Tracing the minimum path and decisions backwards, the successive decisions are 6, 3, 6, and 2 and the states are $x_0 = 1$, $x_1 = 1$, $x_2 = 2$, $x_3 = 3$, and $x_4 = 1$.
TABLE 8.5 Step 1 for the recursive operations.
$x_1$    $u_1$    $F_1(x_1)$
1        6        6
2        5        5
3        4        4
TABLE 8.6 Step 2 for the recursive operations.
TABLE 8.7 Step 3 for the recursive operations.
TABLE 8.8 Step 4 for the recursive operations.
$u_4$    $F_3(x_3)$    $u_4 + F_3(x_3)$
2        15            17
8        14            22
7        13            20
$F_4(x_4) = 17$ for $x_4 = 1$.
B. Single Additive Constraint and Additively Separable Return Problem
Consider the following problem as another illustration of the method of dynamic programming.
Find the value of $u_j$ that minimizes
\[ z = \sum_{j=1}^{n} f_j(u_j), \]
where $1 \le j \le n$, subject to the constraints
\[ \sum_{j=1}^{n} a_j u_j \ge b, \quad a_j, b \in R, \quad a_j \ge 0, \quad b \ge 0, \quad u_j \ge 0. \]
The objective or return function z is a separable additive function of the n variables $u_j$. We look upon the problem as an n-stage problem, the suffix j indicating the stage. We have to decide about the values of $u_j$, and so the $u_j$ are called decision variables. With each decision $u_j$ is associated a return function $f_j(u_j)$ which is the return at the jth stage. Now we introduce the variables $x_0, x_1, x_2, \ldots, x_n$ defined as follows.
\[ \begin{aligned} x_n &= a_1u_1 + a_2u_2 + \cdots + a_nu_n \ge b \\ x_{n-1} &= a_1u_1 + a_2u_2 + \cdots + a_{n-1}u_{n-1} = x_n - a_nu_n \\ x_{n-2} &= a_1u_1 + a_2u_2 + \cdots + a_{n-2}u_{n-2} = x_{n-1} - a_{n-1}u_{n-1} \\ &\;\;\vdots \\ x_1 &= a_1u_1 = x_2 - a_2u_2 \end{aligned} \tag{8.25} \]
These variables are called the state variables. We further notice that $x_{j-1} = t_j(x_j, u_j)$ for $1 \le j \le n$. That is, each state variable is a function of the next state and decision variable. This is the stage transformation function. Since $x_n$ is a function of all the decision variables we may denote by $F_n(x_n)$ the minimum value of z for any feasible value of $x_n$,
\[ F_n(x_n) = \min_{u_1, u_2, \ldots, u_n} \sum_{j=1}^{n} f_j(u_j), \tag{8.26} \]
the minimization being over nonnegative values of $u_j$ subject to $x_n \ge b$. We select a particular value of $u_n$ and, holding $u_n$ fixed, we minimize z over the remaining variables. This minimum is given by
\[ f_n(u_n) + \min_{u_1, u_2, \ldots, u_{n-1}} [f_1(u_1) + f_2(u_2) + \cdots + f_{n-1}(u_{n-1})] = f_n(u_n) + F_{n-1}(x_{n-1}). \tag{8.27} \]
And the values of $u_1, u_2, \ldots, u_{n-1}$ that would make $\sum_{j=1}^{n-1} f_j(u_j)$ minimum for a fixed $u_n$ thus depend upon $x_{n-1}$ which in turn is a function of $x_n$ and $u_n$. Also, the minimum z over all $u_n$ for feasible $x_n$ would be
\[ F_n(x_n) = \min_{u_n} [f_n(u_n) + F_{n-1}(x_{n-1})]. \tag{8.28} \]
If somehow $F_{n-1}(x_{n-1})$ were known for all $u_n$, the above minimization would involve a single variable $u_n$. Repeating the argument, we get the recursion formula
\[ F_j(x_j) = \min_{u_j} [f_j(u_j) + F_{j-1}(x_{j-1})], \quad 2 \le j \le n, \qquad F_1(x_1) = f_1(u_1), \tag{8.29} \]
which along with the relation $x_{j-1} = t_j(x_j, u_j)$ defines a typical DP problem. If we could make a start with $F_1(x_1)$ and recursively go on to optimize to get $F_2(x_2), F_3(x_3), \ldots, F_n(x_n)$, each optimization being done over a single variable, we would get $F_n(x_n)$ for each possible $x_n$. Minimizing it over $x_n$, we get the solution. The following numerical examples illustrate how this can be done.
C. Single Multiplicative Constraint, Additively Separable Return Problem
Consider the problem
Minimize
\[ z = \sum_{j=1}^{n} f_j(u_j) \]
subject to the constraints
\[ \prod_{j=1}^{n} u_j \ge k > 0, \qquad u_j \ge 0. \]
We introduce the state variables $x_j$ defined as follows.
\[ \begin{aligned} x_n &= u_n u_{n-1} \cdots u_2 u_1 \ge k \\ x_{n-1} &= u_{n-1} \cdots u_2 u_1 = x_n / u_n \\ &\;\;\vdots \\ x_2 &= x_3 / u_3 = u_2 u_1 \\ x_1 &= x_2 / u_2 = u_1 \end{aligned} \tag{8.30} \]
These are the stage transformations of the type $x_{j-1} = t_j(x_j, u_j)$. Denoting by $F_n(x_n)$ the minimum value of the objective function for any feasible $x_n$, we can get the recursion formula
\[ F_j(x_j) = \min_{u_j} [f_j(u_j) + F_{j-1}(x_{j-1})], \quad 2 \le j \le n, \qquad F_1(x_1) = f_1(u_1), \tag{8.31} \]
which will lead to the solution.
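A minimal Python sketch of this recursion for integer decisions follows. It assumes the setting that the tables of Example 1 suggest, namely returns $f_j(u_j) = u_j^2$, positive integer decisions, a product $x_3$ limited to at most 6, and maximization in place of minimization; the function name and the `opt` argument are introduced here for illustration only.

```python
def product_constraint_dp(returns, x_max, opt=max):
    """Tabulate F_j(x_j) = opt_{u_j} [f_j(u_j) + F_{j-1}(x_j / u_j)].

    returns: list of stage return functions f_1, ..., f_n
    x_max:   largest admissible value of the product x_n = u_1 u_2 ... u_n
    opt:     max or min, applied to (value, decisions) pairs
    Decisions are positive integers, so u_j must divide x_j.
    """
    # Stage 1: x_1 = u_1, hence F_1(x_1) = f_1(x_1).
    F = {x: (returns[0](x), (x,)) for x in range(1, x_max + 1)}
    for f_j in returns[1:]:
        F = {
            x: opt((f_j(u) + F[x // u][0], F[x // u][1] + (u,))
                   for u in range(1, x + 1) if x % u == 0)
            for x in range(1, x_max + 1)
        }
    # Finally optimize over the admissible values of x_n.
    return opt(F.values())

def square(u):
    return u * u

value, decisions = product_constraint_dp([square] * 3, x_max=6)
print(value, decisions)   # decisions are ordered (u_1, u_2, u_3)
```

Ties between equally good decision sets are broken by tuple comparison, so other orderings of the same values are equally optimal.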
Example 1 (see Tables 8.9 and 8.10)
Maximize
\[ z = u_1^2 + u_2^2 + u_3^2 \]
subject to
\[ u_1 u_2 u_3 \le 6 \]
and $u_1$, $u_2$, $u_3$ are positive integers.
TABLE 8.9 Stage 1 for Example 1.
$u_j$         1    2    3    4    5    6
$f_j(u_j)$    1    4    9    16   25   36
TABLE 8.10 Stage 2 for Example 1.
Solution
The state variables are
\[ x_3 = u_1 u_2 u_3 \le 6, \qquad x_2 = x_3 / u_3 = u_1 u_2, \qquad x_1 = x_2 / u_2 = u_1. \]
The solution is worked out as follows.
Stage return
\[ f_j(u_j) = u_j^2, \quad j = 1, 2, 3. \]
Stage transformation
\[ x_{j-1} = x_j / u_j, \quad j = 2, 3. \]
Recursion Operation (see Tables 8.11 through 8.13)
TABLE 8.11 Step 1 for the recursive operation.
$x_1$         1    2    3    4    5    6
$F_1(x_1)$    1    4    9    16   25   36
TABLE 8.12 Step 2 for the recursive operation.
$x_2$         1    2    3    4    5    6
$F_2(x_2)$    2    5    10   17   26   37
TABLE 8.13 Step 3 for the recursive operations.
$x_3$         1    2    3    4    5    6
$F_3(x_3)$    3    6    11   18   27   38
The answer is 38 with $u_3 = 1$, $u_2 = 1$, $u_1 = 6$.
D. Single Additive Constraint, Multiplicatively Separable Return Problem
Consider the problem:
Maximize
\[ \prod_{j=1}^{n} f_j(u_j) \tag{8.32} \]
Subject to:
\[ \sum_{j=1}^{n} a_j u_j = k \tag{8.33} \]
and $u_j \ge 0$, $a_j \ge 0$.
Here, our state variables are:
\[ x_n = \sum_{j=1}^{n} a_j u_j = k, \qquad x_{j-1} = x_j - a_j u_j, \quad 2 \le j \le n. \]
Putting
\[ F_n(x_n) = \max_{u_1, \ldots, u_n} \prod_{j=1}^{n} f_j(u_j), \]
the general recursion formula is
\[ F_j(x_j) = \max_{u_j} [f_j(u_j) \cdot F_{j-1}(x_{j-1})], \quad 2 \le j \le n, \qquad F_1(x_1) = f_1(u_1). \tag{8.34} \]
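The product recursion can be sketched in Python as follows. The data are hypothetical: three stages with $f_j(u_j) = u_j$, unit coefficients $a_j = 1$, and $k = 10$, with the decisions restricted to nonnegative integers and the coefficients assumed to be positive integers. The stage returns are nonnegative, which is what Case b of Section V requires for the product to be decomposable.

```python
from functools import lru_cache

def max_product_dp(returns, weights, k):
    """Maximize prod_j f_j(u_j) subject to sum_j a_j u_j = k.

    returns: stage return functions f_1, ..., f_n (assumed nonnegative)
    weights: positive integer coefficients a_1, ..., a_n (an assumption)
    Returns (best product, (u_1, ..., u_n)), or None if infeasible.
    """
    n = len(returns)

    @lru_cache(maxsize=None)
    def F(j, x):
        # F_j(x_j): best product of the first j returns when
        # a_1 u_1 + ... + a_j u_j equals x_j.
        a, f = weights[j - 1], returns[j - 1]
        if j == 1:
            # u_1 must absorb the whole remaining amount x_1.
            return (f(x // a), (x // a,)) if x % a == 0 else None
        best = None
        for u in range(x // a + 1):
            prev = F(j - 1, x - a * u)        # x_{j-1} = x_j - a_j u_j
            if prev is None:
                continue
            cand = (f(u) * prev[0], prev[1] + (u,))
            if best is None or cand[0] > best[0]:
                best = cand
        return best

    return F(n, k)

def identity(u):
    return u

best_value, best_u = max_product_dp([identity] * 3, [1, 1, 1], 10)
print(best_value, best_u)
```

Memoizing `F` plays the role of the stage tables: each pair (j, x) is evaluated once and reused by the next stage.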
VII. COMPUTATIONAL PROCEDURE IN DYNAMIC PROGRAMMING
We now discuss the use of the recurrence relationships. Dynamic programming begins by suboptimizing the last component, which means the determination of
\[ f_1^*(x_2) = \underset{u_1}{\operatorname{opt}}\, [F_1(u_1, x_2)]. \]
The best value of the decision variable $u_1$, denoted $u_1^*$, is that which makes the return function $F_1$ attain its optimum value, denoted $f_1^*$. Both depend on the condition of the input or feed that component 1 receives from upstream. Figure 8.6 shows a summary of the suboptimization problem of Stage 1. Next, we move up the serial system to include the last two components. In this two-stage suboptimization, we have to determine
\[ f_2^*(x_3) = \underset{u_2, u_1}{\operatorname{opt}}\, [F_1(u_1, x_2) + F_2(u_2, x_3)]. \]
Since all the information about the first stage has already been calculated, the result can be substituted to get the following equation for the second stage,
\[ f_2^*(x_3) = \underset{u_2}{\operatorname{opt}}\, [f_1^*(x_2) + F_2(u_2, x_3)]. \]
Assume the sequence has been carried on to include $(i - 1)$ of the end components. Then the solution will be the solution of
\[ f_i^*(x_{i+1}) = \underset{u_i, u_{i-1}, \ldots, u_1}{\operatorname{opt}}\, [F_i + F_{i-1} + \cdots + F_1]. \]
However, all the information regarding the suboptimization of the $(i - 1)$ end components is known and has been entered in the table corresponding to $f_{i-1}^*$. Then we can obtain
\[ f_i^*(x_{i+1}) = \underset{u_i}{\operatorname{opt}}\, [F_i(u_i, x_{i+1}) + f_{i-1}^*(x_i)]; \]
thus the dimension of the ith-stage problem has been reduced to one.
FIGURE 8.6 Suboptimization problem of component 1 for various settings of input state variable $x_2$.
VIII. COMPUTATIONAL ECONOMY IN DP

Dynamic programming for discrete problems is an enumerative procedure. A number of alternatives are examined at each stage and some are selected for further examination. For large-scale problems, the number of alternatives is very large. Should we examine all the possible alternatives? The question is whether DP offers any saving and, if so, of what order.

Consider a problem with n stages and an additive return function, in which each decision variable $u_j$ has p possible values and each state variable $x_j$ has p feasible values. In the direct exhaustive search, a feasible solution is specified by an input value of the state variable and the values of the decision variables $u_j$ (j = 1, 2, ..., n), and each of these components has p values. The total number of feasible solutions is therefore $p^{n+1}$. To evaluate the objective for each feasible solution, we add the n stage returns two at a time, which takes (n − 1) additions per solution, or $(n-1)p^{n+1}$ additions in all. Finally, choosing the optimum among these takes $p^{n+1} - 1$ comparisons. The total number of steps is

$$N(DS) = (n-1)p^{n+1} + p^{n+1} - 1 = n p^{n+1} - 1. \qquad (8.35)$$

In dynamic programming, each stage from the second up to the nth involves $p^2$ combinations of state and decision, and each combination needs only one addition, so the total number of additions is $(n-1)p^2$. Also, at each stage starting from the first, for each value of $x_j$, p numbers have to be compared in (p − 1) comparisons, giving a total of $np(p-1)$. To get the optimum value, p numbers have to be compared, which takes a further (p − 1) comparisons. The total number of additions and comparisons in DP is then

$$N(DP) = (n-1)p^2 + np(p-1) + p - 1 = (2n-1)p^2 - (n-1)p - 1, \qquad (8.36)$$

which is much less than in the exhaustive search case.
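To get a feel for the size of the saving, the two counts in equations 8.35 and 8.36 can be evaluated directly. The helper names below are hypothetical; the formulas are exactly those given above.

```python
def ops_exhaustive(n, p):
    """Additions plus comparisons for direct search, equation 8.35."""
    return (n - 1) * p ** (n + 1) + p ** (n + 1) - 1


def ops_dp(n, p):
    """Additions plus comparisons for dynamic programming, equation 8.36."""
    return (n - 1) * p ** 2 + n * p * (p - 1) + (p - 1)


for n, p in [(3, 5), (5, 10), (10, 10)]:
    print(n, p, ops_exhaustive(n, p), ops_dp(n, p))
```

For n = 5 stages and p = 10 values, the direct search needs 4,999,999 steps against 859 for dynamic programming.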
IX. SYSTEMS WITH MORE THAN ONE CONSTRAINT

Dynamic programming can be applied to problems involving many constraints. In multiconstraint problems, there has to be one state variable per constraint per stage. Sometimes it is possible to take advantage of the problem structure to reduce the number of state variables.
Example 2
Maximize

$$u_1^2 + u_2^2 + u_3^2$$

and $u_1$, $u_2$, $u_3$ are positive integers.

Solution
The two sets of state variables are

$$x_1 = x_2/u_2 = u_1, \qquad y_1 = y_2 - u_2 = u_1.$$

The feasible values of $u_j$ are 1, 2, 3, 4. For stage j = 1, the stage transformation gives the following possible values of $x_1$, $y_1$:

u_1    1    2    3    4
x_1    1    2    3    4
y_1    1    2    3    4

Because of the constraints we do not need to consider $x_j, y_j > 6$ (see Tables 8.15 through 8.17). Hence max $F_3(x_3, y_3) = 18$ for $(x_3, y_3) = (4, 6)$. Tracing back, the optimal decision variables are (1, 1, 4) or (1, 4, 1) or (4, 1, 1).
TABLE 8.15 Stage 1 of optimization, F_1(x_1, y_1) = u_1^2.

u_1   x_1   y_1   F_1(x_1, y_1)
1     1     1      1
2     2     2      4
3     3     3      9
4     4     4     16

TABLE 8.16 Stage 2, F_2(x_2, y_2) = max over u_2 of [u_2^2 + F_1(x_1, y_1)].

u_2   x_1   y_1   F_1(x_1, y_1)   x_2   y_2   u_2^2 + F_1(x_1, y_1)
1     1     1      1              1     2      2
1     2     2      4              2     3      5
1     3     3      9              3     4     10
1     4     4     16              4     5     17
2     1     1      1              2     3      5
2     2     2      4              4     4      8
2     3     3      9              6     5     13
3     1     1      1              3     4     10
3     2     2      4              6     5     13
4     1     1      1              4     5     17

(x_2, y_2)        (1,2)  (2,3)  (3,4)  (4,4)  (4,5)  (6,5)
F_2(x_2, y_2)       2      5     10      8     17     13

TABLE 8.17 Stage 3, F_3(x_3, y_3) = max over u_3 of [u_3^2 + F_2(x_2, y_2)].

u_3   x_2   y_2   F_2(x_2, y_2)   x_3   y_3   u_3^2 + F_2(x_2, y_2)
1     1     2      2              1     3      3
1     2     3      5              2     4      6
1     3     4     10              3     5     11
1     4     4      8              4     5      9
1     4     5     17              4     6     18
1     6     5     13              6     6     14
2     1     2      2              2     4      6
2     2     3      5              4     5      9
2     3     4     10              6     6     14
3     1     2      2              3     5     11
3     2     3      5              6     6     14
4     1     2      2              4     6     18

(x_3, y_3)        (1,3)  (2,4)  (3,5)  (4,5)  (4,6)  (6,6)
F_3(x_3, y_3)       3      6     11      9     18     14
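The two-state recursion behind Tables 8.15 through 8.17 can be sketched as below. The constraint set is an assumption: the sketch takes $u_1 u_2 u_3 \le 6$ and $u_1 + u_2 + u_3 \le 6$, inferred from the remark that states above 6 need not be considered and from the tabulated states. The function name `solve_two_constraints` is hypothetical.

```python
def solve_two_constraints(limit_prod=6, limit_sum=6, n_stages=3):
    """DP with state (x, y) = (product so far, sum so far); maximizes sum of u_j^2."""
    # every other decision is at least 1, so no single u_j can exceed this bound
    max_u = limit_sum - (n_stages - 1)
    F = {(u, u): (u * u, (u,)) for u in range(1, max_u + 1) if u <= limit_prod}
    tables = [F]
    for _ in range(2, n_stages + 1):
        G = {}
        for (x, y), (value, decisions) in F.items():
            for u in range(1, max_u + 1):
                nx, ny = x * u, y + u              # stage transformation
                if nx > limit_prod or ny > limit_sum:
                    continue
                candidate = value + u * u
                if (nx, ny) not in G or candidate > G[(nx, ny)][0]:
                    G[(nx, ny)] = (candidate, decisions + (u,))
        F = G
        tables.append(F)
    return tables


tables = solve_two_constraints()
final = tables[-1]
best_state = max(final, key=lambda s: final[s][0])
best_value, best_decisions = final[best_state]
print(best_state, best_value, best_decisions)
```

Only one of the three equivalent optimal decision vectors is kept for each state; the others are permutations of it.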
X. CONVERSION OF A FINAL VALUE PROBLEM INTO AN INITIAL VALUE PROBLEM

So far we have discussed DP with reference to an initial value problem. If the problem is a final value problem, as shown in Figure 8.7, it can be converted into an equivalent initial value problem. Suppose the state transformation is given by equation 8.1,

$$x_i = t_i(x_{i+1}, u_i), \qquad i = 1, 2, \ldots, n.$$

Assuming the inverse relations exist, we can write

$$x_{i+1} = \bar{t}_i(x_i, u_i), \qquad i = 1, 2, \ldots, n,$$

where the input state to stage i is expressed as a function of its output state and its decision variable. Also, if the objective function was originally expressed as

$$F_i = f_i(x_{i+1}, u_i), \qquad i = 1, 2, \ldots, n,$$

it can be expressed in terms of the output state and the decision variable as

$$F_i = f_i\left[\bar{t}_i(x_i, u_i), u_i\right] = \bar{f}_i(x_i, u_i), \qquad i = 1, 2, \ldots, n.$$

Then we can use the same approach as before on this new problem.

XI. ILLUSTRATIVE EXAMPLES
Example 1
Maximize

$$x_1^2 + x_2^2 - x_3^2$$

Subject to:

$$x_1 + x_2 + x_3 \ge 14$$
FIGURE 8.7 Conversion of (a) final value problem to (b) initial value problem.
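Put

$$u_3 = x_1 + x_2 + x_3, \qquad u_2 = x_1 + x_2 = u_3 - x_3, \qquad u_1 = x_1 = u_2 - x_2.$$

Then

$$F_1(u_1) = [x_1^2] = (u_2 - x_2)^2.$$

Substituting in

$$F_2(u_2) = \max_{x_2}\left[x_2^2 + (u_2 - x_2)^2\right],$$

by calculus, a function can attain a maximum only where its partial derivative equals zero. $F_2(u_2) = [x_2^2 + (u_2 - x_2)^2]$ is maximum if

$$\frac{\partial F_2}{\partial x_2} = 0.$$

Again, by calculus, a function can attain a maximum only where its partial derivative equals zero.

$$F_3(u_3) = \left[-x_3^2 + 5(u_3 - x_3)^2\right]$$

is maximum if

$$\frac{\partial F_3}{\partial x_3} = -2x_3 - 10(u_3 - x_3) = 0, \qquad \text{that is,} \quad x_3 = 1.25\,u_3.$$

Hence

$$F_3(u_3) = -1.25\,u_3^2.$$

Obviously $F_3(u_3)$ is the maximum for $u_3 = 14$, which means that the maximum value of $(x_1^2 + x_2^2 - x_3^2)$ is therefore (−245). Back substitution to calculate $x_1$, $x_2$, and $x_3$ gives $x_3 = 1.25\,u_3 = 17.5$ and $u_2 = u_3 - x_3 = -3.5$, from which $x_2$ and $x_1 = u_2 - x_2$ follow.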
Then the final result is $f_{\max} = -245$, $x_1 = 3.5$, $x_2 = -7$, and $x_3 = 17.5$.
Example 2
The network shown in Figure 8.8 illustrates a transmission model with node (1) representing the generation unit and nodes (2) to (7) representing load centers. The values associated with the branches are power losses.

1. Derive an optimal policy of supplying the load at node (8) from the generation node (1) with minimum losses.
2. Derive an optimal policy of supplying the load at node (8) from the generation node (1) with minimum losses at node (1), but also supplying a load at node (4).

Solution
1. This is a minimization problem,

$$F_j(x_j) = \min_{u_j}\left[u_j + F_{j-1}(x_{j-1})\right],$$

where $u_j$ is the loss on the connection from stage (j − 1) to stage j, based on the fact that $F_1(u_1) = u_1$. Then $F_1(2) = 3$ and $F_1(3) = 5$.

$$F_2(x_2) = \min_{u_2}\left[u_2 + F_1(x_1)\right] = \min[F_2(4), F_2(5)]$$

$$F_2(4) = \min[10 + F_1(2),\ 11 + F_1(3)] = \min[10 + 3,\ 11 + 5] = 13$$

$$F_2(5) = \min[4 + F_1(3),\ 12 + F_1(2)] = \min[4 + 5,\ 12 + 3] = 9.$$

FIGURE 8.8 Transmission model for Example 2.

Then $F_2(x_2) = \min[F_2(4), F_2(5)] = \min[13, 9] = 9$. In a similar way,
$$F_3(x_3) = \min_{u_3}\left[u_3 + F_2(x_2)\right] = \min[F_3(6), F_3(7)]$$

$$F_3(6) = \min[15 + F_2(4),\ 10 + F_2(5)] = \min[15 + 13,\ 10 + 9] = 19$$

$$F_3(7) = \min[9 + F_2(4),\ 16 + F_2(5)] = \min[9 + 13,\ 16 + 9] = 22.$$

Then $F_3(x_3) = \min[F_3(6), F_3(7)] = \min[19, 22] = 19$ and

$$F_4(x_4) = \min_{u_4}\left[u_4 + F_3(x_3)\right] = \min[9 + F_3(6),\ 8 + F_3(7)] = \min[(9 + 19),\ (8 + 22)] = 28.$$

Then the minimum route is {node (1) - node (3) - node (5) - node (6) - node (8)}, and the value of the losses is (28).

2. We must pass through node (4). The system is then divided into two separate problems, omitting node (5), as shown in Figure 8.9. For the first part of the system:

$$F_1(u_1) = u_1.$$

Then $F_1(2) = 3$ and $F_1(3) = 5$, and

$$F_2(4) = \min_{u_2}\left[u_2 + F_1(x_1)\right] = \min[10 + F_1(2),\ 11 + F_1(3)] = \min[13, 16] = 13.$$

FIGURE 8.9 Reduced transmission model.

For the second part of the system:
$$F_3(x_3) = \min_{u_3}\left[u_3 + F_2(x_2)\right] = \min[F_3(6), F_3(7)] = \min[(13 + 15),\ (13 + 9)] = 22$$

$$F_4(x_4) = \min_{u_4}\left[u_4 + F_3(x_3)\right] = \min[9 + F_3(6),\ 8 + F_3(7)] = \min[(9 + 28),\ (8 + 22)] = 30.$$

Then the minimum route is {node (1) - node (2) - node (4) - node (7) - node (8)}, and the value of the losses is (30).
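Both parts of Example 2 can be checked with a short stage-by-stage recursion. Since Figure 8.8 is not reproduced here, the branch losses in `LOSSES` are read off the worked equations above and should be treated as an assumption about the figure; the function `min_loss_route` is a hypothetical helper.

```python
# Branch losses (from_node, to_node): loss, taken from the recursion equations above
LOSSES = {
    (1, 2): 3, (1, 3): 5,
    (2, 4): 10, (3, 4): 11, (3, 5): 4, (2, 5): 12,
    (4, 6): 15, (5, 6): 10, (4, 7): 9, (5, 7): 16,
    (6, 8): 9, (7, 8): 8,
}
STAGES = [[1], [2, 3], [4, 5], [6, 7], [8]]


def min_loss_route(losses, stages, skip=()):
    """Forward recursion F_j(x_j) = min[u_j + F_{j-1}(x_{j-1})] over a staged network."""
    source = stages[0][0]
    best = {source: (0, [source])}          # node -> (loss so far, path so far)
    for layer in stages[1:]:
        nxt = {}
        for node in layer:
            if node in skip:                # nodes removed from the reduced network
                continue
            candidates = [(cost + losses[(prev, node)], path + [node])
                          for prev, (cost, path) in best.items()
                          if (prev, node) in losses]
            if candidates:
                nxt[node] = min(candidates)
        best = nxt
    return best[stages[-1][0]]


part1 = min_loss_route(LOSSES, STAGES)             # unrestricted route
part2 = min_loss_route(LOSSES, STAGES, skip={5})   # route forced through node 4
print(part1, part2)
```

Forcing the route through node (4) is modeled by dropping node (5) from its stage, which mirrors the reduced network of Figure 8.9.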
Example 3
A computer company has accepted a contract to supply 100 computers at the end of the first month and 150 at the end of the second month. The cost of manufacturing x computers in a month is $70x + 0.2x^2$ dollars. If the company manufactures more computers than needed in the first month, there is an inventory carrying charge of 10 dollars for each unit carried over to the next month. Find the number of computers manufactured in each month to maintain the cost at a minimum level, assuming that the company has enough facilities to manufacture up to 250 computers per month.

Solution
The total cost is composed of the production cost and the inventory carrying cost. The constrained optimization problem can be stated as follows.

Minimize:

$$f(x_1, x_2) = (70x_1 + 0.2x_1^2) + (70x_2 + 0.2x_2^2) + 10(x_1 - 100)$$

Subject to:

$$x_1 \ge 100, \qquad x_1 + x_2 = 250$$

and $x_1$, $x_2$ are positive integers, where $x_1$, $x_2$ denote the number of computers manufactured in the first and second months, respectively. This problem can be considered as a two-stage decision problem, as shown in Figure 8.10.

FIGURE 8.10 A two-stage decision problem.

Assuming the optimal production of the first month equals $x_1^* = 100 + u_1$, where $u_1$ is the inventory carried over, the cost of the first month will be

$$f_1^* = 70(100 + u_1) + 0.2(100 + u_1)^2 = 9000 + 110u_1 + 0.2u_1^2.$$

The cost incurred in the second month is given by

$$f_2(x_2, u_1) = 70x_2 + 0.2x_2^2 + 10u_1.$$

The total cost will be:

$$F = f_1^* + f_2 = 9000 + 120u_1 + 0.2u_1^2 + 70x_2 + 0.2x_2^2.$$

But the amount of inventory at the beginning of the second month plus the production in the second month must equal the supply in the second month. We have

$$x_2 + u_1 = 150 \quad\Rightarrow\quad u_1 = 150 - x_2.$$

Substituting for $u_1$ in the total cost function, we get

$$F = 9000 + 120(150 - x_2) + 0.2(150 - x_2)^2 + 70x_2 + 0.2x_2^2 = 31500 - 110x_2 + 0.4x_2^2.$$

Since this last equation is a function of $x_2$ only, we can get the optimum value by setting

$$\frac{\partial F}{\partial x_2} = 0 \quad\Rightarrow\quad 110 = 0.8x_2 \quad\Rightarrow\quad x_2 = 137.5,$$

which is rounded to $x_2^* = 138$ to satisfy the integer requirement. Checking the second derivative,

$$\frac{\partial^2 F}{\partial x_2^2} = 0.8 > 0.$$

Then the second month's production of 138 corresponds to the minimum cost. The first month's production and the inventory will be $x_1^* = 112$, $u_1 = 12$, and the minimum total cost equals 23937.6 dollars.
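Because the decision variables are integers, the result can be checked by enumerating every admissible first-month production. The sketch below assumes the data used in the derivation (production cost $70x + 0.2x^2$, a carrying charge of 10 per unit, demands of 100 and 150); the function names are hypothetical.

```python
def production_cost(x):
    """Cost of manufacturing x computers in one month."""
    return 70 * x + 0.2 * x * x


def total_cost(x1, first_demand=100, total_demand=250, carry=10):
    """Two-month cost when x1 units are built in month 1 and the rest in month 2."""
    x2 = total_demand - x1
    return production_cost(x1) + production_cost(x2) + carry * (x1 - first_demand)


# x1 must cover the first delivery and cannot exceed the monthly capacity of 250
costs = {x1: total_cost(x1) for x1 in range(100, 251)}
best_cost = min(costs.values())
best_plans = sorted(x1 for x1, c in costs.items() if abs(c - best_cost) < 1e-6)
print(best_cost, best_plans)
```

The continuous optimum $x_2 = 137.5$ lies midway between two integers, so the plans $x_1 = 112$ and $x_1 = 113$ tie at the same minimum cost.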
XII. CONCLUSIONS

This chapter presented dynamic programming as an optimization approach able to transform a complex problem into a sequential set of simpler problems. Both discrete and continuous problems were considered. The formulation of a multistage decision process was explained step by step by discussing its representation as well as the types of multistage decision problems. Characteristics of the dynamic programming method were discussed in detail. The concept of suboptimization and the principle of optimality were used as an initial stage in presenting the formulation of dynamic programming as a consequence of the simple optimization problem. The chapter also discussed the different forms of dynamic programming approaches used in continuous and discrete optimization problems, in particular forward and backward dynamic programming. The derivation of the recursive formulae used for both approaches was presented. Computational procedures were shown through illustrative examples that also explained the computational economy of dynamic programming. Most of the discussion was devoted to optimization problems with only one constraint; systems with more than one constraint and the conversion of a final value problem into an initial value problem were presented in separate sections of the chapter, with illustrative examples provided to support the argument. Finally, the chapter gives a set of unsolved problems for training and understanding the dynamic programming approach.

XIII. PROBLEM SET

Problem 8.1
Consider a transportation problem with m sources and n destinations. Let $a_i$ be the amount available at source i, i = 1, 2, ..., m, and let $b_j$ be the amount demanded at destination j, j = 1, 2, ..., n. Assuming that the cost of transporting $x_{ij}$ units from source i to destination j is $h_{ij}(x_{ij})$, formulate the problem as a dynamic programming model.

Problem 8.2
Solve the following linear programming problem as a dynamic programming model.

Maximize

$$Z = 4x_1 + 4x_2$$

Subject to:

and nonnegative for all i's.
Problem 8.3
Solve the following nonlinear programming problem as a dynamic programming model.

Maximize

$$Z = 7x_1^2 + 6x_1 + 5x_2^2$$

Subject to:

$$x_1 + 2x_2 \le 10$$

$$x_1 - 3x_2 \le 9$$

$x_i \ge 0$, that is, nonnegative for all i's.

Problem 8.4
Find the allocation for maximal profit when four resource units are available as shown in Table 8.18, where the numbers denote dollars per resource unit.

Problem 8.5
Maximize the hydroelectric power $P(X)$, $X = (x_1, x_2, x_3)$, produced by building dams on three different river basins, where

$$P(X) = f_1(x_1) + f_2(x_2) + f_3(x_3),$$

and $f_i(x_i)$ is the power generated from the ith basin by investing resource $x_i$. The total budgetary provision is $x_1 + x_2 + x_3 \le 3$. The functions $f_1$, $f_2$, and $f_3$ are given in Table 8.19. An integer solution is required for this problem.

Problem 8.6
Use Table 8.20 regarding buying and selling only in a deregulated power system with limited energy storage. A utility has a limited electrical energy
TABLE 8.18 Data for Problem 8.4.

Resource units   Plant 1   Plant 2   Plant 3
1                 5         7         8
2                 9        10        11
3                12        12        14
4                14        13        16
TABLE 8.19 Data for Problem8.5. Xi fl
6 r,
I
2
3
0
6 2 1 6 3
4
5 0
f3
0
4
5
TABLE 8.20 Data for Problem 8.6.

Month        Price ($/MWh)
January      18    20
February     18    19
March        18    16
April        15    17
May          14    16
June         15    16
July         17    16
August       17    17
September    17    18
October      17    19
November     17    19
December     18    19
storage capacity of 1000 MWh and must use this to the greatest advantage. The price of the energy changes from month to month in the year. Assume that the energy storage system must be inspected (cleaned) once a year: the inspection takes one month to complete and is scheduled for the first of July each year.

Problem 8.7
The network shown in Figure 8.11 represents a DC transmission system. In the network, the nodes represent substations. The values associated with the branches are voltage drops (pu). Derive an optimal policy of supplying a load at node (6) from the generation at node (1) with minimum voltage drop.
FIGURE 8.11 A DC transmission system for Problem 8.7.
Chapter 9 Lagrangian Relaxation
I. INTRODUCTION

In the last decade, Lagrangian relaxation has grown from a successful but largely theoretical concept to a tool that is the backbone of a number of large-scale applications. While there have been several surveys of Lagrangian relaxation (e.g., Fisher [1] and Geoffrion [2]) and an excellent textbook treatment by Muckstadt [6], more extensive use of Lagrangian relaxation in practice has been inhibited by the lack of a “how to do it” exposition similar to the treatment usually extended to linear, dynamic, and integer programming in operations research texts. This chapter is designed to at least partially fill that void and should be of interest to both developers and users of Lagrangian relaxation algorithms.

Lagrangian relaxation is based upon the observation that many difficult programming problems can be modeled as relatively easy problems complicated by a set of side constraints. To exploit this observation, we create a Lagrangian problem in which the complicating constraints are replaced with a penalty term in the objective function involving the amount of violation of the constraints and their dual variables. The Lagrangian problem is easy to solve and provides an upper bound (for a maximization problem) on the optimal value of the original problem. It can thus be used in place of a linear programming relaxation to provide bounds in a branch-and-bound algorithm. The Lagrangian approach offers a number of important advantages over linear programming relaxations.

The Lagrangian relaxation concept is first formulated in general terms and its use is then demonstrated extensively on a numerical example.
II. CONCEPTS

Consider an integer programming problem of the following form:

$$
\begin{aligned}
Z = \max\ & cx \\
\text{subject to}\ & Ax \le b \\
& Dx \le e \\
& x \ge 0 \text{ and integral,}
\end{aligned}
\qquad \text{(P)}
$$

where x is an n × 1 vector whose elements are integers, b is an m × 1 vector, e is a k × 1 vector, and all other matrices have conformable dimensions. We assume that the constraints of (P) have been partitioned into the two sets $Ax \le b$ and $Dx \le e$ so that (P) is relatively easy to solve if the constraint set $Ax \le b$ is removed. To create the Lagrangian problem, we first define an m-vector of nonnegative multipliers u and add the nonnegative term $u(b - Ax)$ to the objective function of (P) to obtain equation 9.1:

$$
\begin{aligned}
\text{Maximize}\ & cx + u(b - Ax) \\
\text{subject to}\ & Ax \le b \\
& Dx \le e \\
& x \ge 0 \text{ and integral.}
\end{aligned}
\qquad (9.1)
$$

It is clear that the optimal value of this problem for u fixed at a nonnegative value is an upper bound on Z, because we have merely added a nonnegative term to the objective function. At this point, we create the Lagrangian problem by removing the constraints $Ax \le b$ to obtain

$$
\begin{aligned}
Z_D(u) = \max\ & cx + u(b - Ax) \\
\text{subject to}\ & Dx \le e \\
& x \ge 0 \text{ and integral.}
\end{aligned}
\qquad (\mathrm{LR}_u)
$$

Since removing the constraints $Ax \le b$ cannot decrease the optimal value, $Z_D(u)$ is also an upper bound on Z. Moreover, by assumption the Lagrangian problem is relatively easy to solve.

There are three major questions in designing a Lagrangian relaxation-based system, and some answers can (roughly speaking) be given. Table 9.1 summarizes these issues. A numerical example is used to illustrate considerations (a), (b), and (c) as well as to compare Lagrangian relaxation to the use of linear programming to obtain bounds for use in a branch-and-bound algorithm.
TABLE 9.1 Several issues related to a Lagrangian relaxation-based system.

Question (a): Which constraints should be relaxed?
Answer: The relaxation should make the problem significantly easier, but not too easy.

Question (b): How to compute good multipliers u?
Answer: There is a choice between a general-purpose procedure called the subgradient method and "smarter" methods that may be better but which are, however, highly problem specific.

Question (c): How to deduce a good feasible solution to the original problem, given a solution to the relaxed problem?
Answer: The answer tends to be problem specific.
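III. THE SUBGRADIENT METHOD FOR SETTING THE DUAL VARIABLES

The subgradient method is demonstrated by an illustrative example. Consider the following problem.

$$
\begin{aligned}
\text{Maximize}\quad & Z = 16x_1 + 10x_2 + 4x_4 && \text{(a)} \\
\text{subject to}\quad & 8x_1 + 2x_2 + x_3 + 4x_4 \le 10 && \text{(b)} \\
& x_1 + x_2 \le 1 && \text{(c)} \\
& x_3 + x_4 \le 1 && \text{(d)} \\
& 0 \le x_j \le 1 \text{ and integral}, \quad j = 1, \ldots, 4.
\end{aligned}
$$

The numerical example is used to develop and demonstrate a method for obtaining dual variable values that produce a tight bound. Ideally, u should solve the following dual problem,

$$Z_D = \min_{u \ge 0} Z_D(u).$$

Before presenting an algorithm for this problem, it is useful to develop some insight by trying different values for the single dual variable u in the example. Dualizing constraint (b), we use

$$Z_D(u) = \max\left[(16 - 8u)x_1 + (10 - 2u)x_2 + (0 - u)x_3 + (4 - 4u)x_4\right] + 10u, \qquad \text{(e)}$$

where the maximization is over x satisfying (c), (d), and the integrality requirement.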
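For a problem this small, the bound $Z_D(u)$ can be evaluated by enumerating the 0-1 vectors that satisfy the two constraints kept in the Lagrangian problem. The sketch below is an illustration only; `lagrangian_bound` is a hypothetical helper, and the data are those of the example above.

```python
from itertools import product

C = (16, 10, 0, 4)      # objective coefficients
A = (8, 2, 1, 4)        # coefficients of the dualized resource constraint
B = 10                  # its right-hand side


def lagrangian_bound(u):
    """Return Z_D(u) and every 0-1 solution that attains it."""
    best_value, best_xs = None, []
    for x in product((0, 1), repeat=4):
        if x[0] + x[1] > 1 or x[2] + x[3] > 1:     # constraints kept in the problem
            continue
        value = (sum(c * xi for c, xi in zip(C, x))
                 + u * (B - sum(a * xi for a, xi in zip(A, x))))
        if best_value is None or value > best_value + 1e-9:
            best_value, best_xs = value, [x]
        elif abs(value - best_value) <= 1e-9:      # alternative optimum
            best_xs.append(x)
    return best_value, best_xs


for u in (0, 6, 3, 2, 1, 0.5, 0.75):
    print(u, *lagrangian_bound(u))
```

Returning all maximizers rather than a single one makes the ties at u = 1 visible, which matters later because the bound is not differentiable there.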
Take u = 0

Here we have:

$$Z_D(0) = \max\left[(16 - 8 \cdot 0)x_1 + (10 - 2 \cdot 0)x_2 + (0 - 0)x_3 + (4 - 4 \cdot 0)x_4\right] + 10 \cdot 0.$$

Thus:

$$Z_D(0) = \max\{16x_1 + 10x_2 + 0x_3 + 4x_4\}.$$

Note that $x_1$ has a coefficient larger than that of $x_2$, and $x_4$ has a coefficient larger than that of $x_3$.

$x_1 = 1$ and $x_2 = 0$ satisfy $x_1 + x_2 \le 1$
$x_3 = 0$ and $x_4 = 1$ satisfy $x_3 + x_4 \le 1$.

By substitution we get: $Z_D(0) = 20$. Also from (a):

$$Z = 16x_1 + 10x_2 + 4x_4 = 20.$$

We now test constraint (b):

$$8x_1 + 2x_2 + x_3 + 4x_4 = 12 > 10,$$

which is a violation. Therefore, this solution is not feasible.

It is useful to think of the single constraint (b) that we have dualized as a resource constraint, with the right side representing the available supply of some resource and the left side the amount of the resource demanded in a particular solution. We can then interpret the dual variable u as a price charged for the resource. It turns out that if we can discover a price for which the supply and demand for the resource are equal, then this value will also give a tight upper bound. However, such a price might not exist. With u = 0, we discover that the Lagrangian relaxation solution demands two units more of the resource than the available supply, suggesting that we should use a larger value for u.

Take u = 6

Here we have:
$$Z_D(6) = \max\left[(16 - 8 \cdot 6)x_1 + (10 - 2 \cdot 6)x_2 + (0 - 6)x_3 + (4 - 4 \cdot 6)x_4\right] + 10 \cdot 6.$$

Thus:

$$Z_D(6) = \max\{-32x_1 - 2x_2 - 6x_3 - 20x_4\} + 10 \cdot 6.$$

All coefficients are negative, so the maximum is attained with all variables set to zero.

$x_1 = 0$ and $x_2 = 0$ satisfy $x_1 + x_2 \le 1$
$x_3 = 0$ and $x_4 = 0$ satisfy $x_3 + x_4 \le 1$.

By substitution we get: $Z_D(6) = 60$. Also from (a):

$$Z = 16x_1 + 10x_2 + 4x_4 = 0.$$

We now test constraint (b):

$$8x_1 + 2x_2 + x_3 + 4x_4 = 0 \le 10.$$

This solution is feasible. For u = 6 we discover that we have overcorrected in the sense that all variables are zero in the Lagrangian solution and none of the resource is used. We next try a sequence of dual values in the interval between 0 and 6.

Take u = 3

Here we have:
$$Z_D(3) = \max\left[(16 - 8 \cdot 3)x_1 + (10 - 2 \cdot 3)x_2 + (0 - 3)x_3 + (4 - 4 \cdot 3)x_4\right] + 10 \cdot 3.$$

Thus:

$$Z_D(3) = \max\{-8x_1 + 4x_2 - 3x_3 - 8x_4\} + 10 \cdot 3.$$

The variables with negative coefficients are set to zero.

$x_1 = 0$ and $x_2 = 1$ satisfy $x_1 + x_2 \le 1$
$x_3 = 0$ and $x_4 = 0$ satisfy $x_3 + x_4 \le 1$.

By substitution we get: $Z_D(3) = 34$. Also from (a):

$$Z = 16x_1 + 10x_2 + 4x_4 = 10.$$

We now test constraint (b):

$$8x_1 + 2x_2 + x_3 + 4x_4 = 2 \le 10.$$

This solution is feasible.
Take u = 2

Here we have:

$$Z_D(2) = \max\left[(16 - 8 \cdot 2)x_1 + (10 - 2 \cdot 2)x_2 + (0 - 2)x_3 + (4 - 4 \cdot 2)x_4\right] + 10 \cdot 2.$$

Thus:

$$Z_D(2) = \max\{0x_1 + 6x_2 - 2x_3 - 4x_4\} + 10 \cdot 2.$$

$x_1 = 0$ and $x_2 = 1$ satisfy $x_1 + x_2 \le 1$
$x_3 = 0$ and $x_4 = 0$ satisfy $x_3 + x_4 \le 1$.

By substitution we get: $Z_D(2) = 26$. Also from (a):

$$Z = 16x_1 + 10x_2 + 4x_4 = 10.$$

We now test constraint (b):

$$8x_1 + 2x_2 + x_3 + 4x_4 = 2 \le 10.$$

This solution is feasible.

Take u = 1

Here we have:

$$Z_D(1) = \max\left[(16 - 8 \cdot 1)x_1 + (10 - 2 \cdot 1)x_2 + (0 - 1)x_3 + (4 - 4 \cdot 1)x_4\right] + 10 \cdot 1.$$
Thus:

$$Z_D(1) = \max\{8x_1 + 8x_2 - x_3 + 0x_4\} + 10 \cdot 1.$$

We definitely have $x_3 = 0$, since its coefficient is negative. The coefficients of $x_1$ and $x_2$ are equal and the coefficient of $x_4$ is zero, so there are alternative optimal solutions.

Take $x_1 = 1$

$x_1 = 1$ and $x_2 = 0$ satisfy $x_1 + x_2 \le 1$
$x_3 = 0$ and $x_4 = 0$ satisfy $x_3 + x_4 \le 1$.

By substitution we get: $Z_D(1) = 18$. Also from (a):

$$Z = 16x_1 + 10x_2 + 4x_4 = 16.$$

We now test constraint (b):

$$8x_1 + 2x_2 + x_3 + 4x_4 = 8 \le 10.$$

This solution is feasible. We still have another option: $x_3 = 0$ and $x_4 = 1$ satisfy $x_3 + x_4 \le 1$. By substitution we get: $Z_D(1) = 18$. Also from (a):

$$Z = 16x_1 + 10x_2 + 4x_4 = 20.$$

We now test constraint (b):

$$8x_1 + 2x_2 + x_3 + 4x_4 = 12 > 10.$$

This solution is not feasible.

Take $x_1 = 0$

$x_1 = 0$ and $x_2 = 1$ satisfy $x_1 + x_2 \le 1$
$x_3 = 0$ and $x_4 = 0$ satisfy $x_3 + x_4 \le 1$.

By substitution we get: $Z_D(1) = 18$. Also from (a):

$$Z = 16x_1 + 10x_2 + 4x_4 = 10.$$

We now test constraint (b):

$$8x_1 + 2x_2 + x_3 + 4x_4 = 2 \le 10.$$

This solution is feasible. There still exists another option: $x_3 = 0$ and $x_4 = 1$ satisfy $x_3 + x_4 \le 1$. By substitution we get: $Z_D(1) = 18$. Also from (a):

$$Z = 16x_1 + 10x_2 + 4x_4 = 14.$$

We now test constraint (b):

$$8x_1 + 2x_2 + x_3 + 4x_4 = 6 \le 10.$$

This solution is feasible.
In the case of u = 1, we see that there are four alternative optimal Lagrangian solutions. Table 9.2 gives a list of seven values for u, together with the associated Lagrangian relaxation solution, the bound $Z_D(u)$, and Z for those Lagrangian solutions that are feasible in (P). For the values tested, the tightest bound of 18 was obtained with u = 1, but at the moment we lack any means for confirming that it is optimal.

TABLE 9.2 A list of u values with the associated Lagrangian relaxation solution, the bound Z_D(u), and Z for those Lagrangian solutions that are feasible.

u      x_1   x_2   x_3   x_4   Z_D(u)   Z (if feasible)
0      1     0     0     1     20       -
6      0     0     0     0     60       0
3      0     1     0     0     34       10
2      0     1     0     0     26       10
1      1     0     0     0     18       16
1      1     0     0     1     18       -
1      0     1     0     0     18       10
1      0     1     0     1     18       14
1/2    1     0     0     1     19       -
3/4    1     0     0     1     18.5     -

It is possible to demonstrate that 18 is the optimal value for $Z_D(u)$ by observing that if we substitute any x into the objective function for the Lagrangian problem, we obtain a linear function in u. We use:

$$Z_D(u) = \max\left[(16 - 8u)x_1 + (10 - 2u)x_2 + (0 - u)x_3 + (4 - 4u)x_4\right] + 10u.$$

Take $x_1 = x_2 = x_3 = x_4 = 0$:

$$(16 - 8u)0 + (10 - 2u)0 + (0 - u)0 + (4 - 4u)0 + 10u = 10u.$$

Take $x_1 = 1$, $x_2 = x_3 = x_4 = 0$:

$$(16 - 8u)1 + (10 - 2u)0 + (0 - u)0 + (4 - 4u)0 + 10u = 16 + 2u.$$

Take $x_2 = 1$, $x_1 = x_3 = x_4 = 0$:

$$(16 - 8u)0 + (10 - 2u)1 + (0 - u)0 + (4 - 4u)0 + 10u = 10 + 8u.$$

Take $x_2 = x_4 = 1$, $x_1 = x_3 = 0$:

$$(16 - 8u)0 + (10 - 2u)1 + (0 - u)0 + (4 - 4u)1 + 10u = 14 + 4u.$$

Take $x_1 = x_4 = 1$, $x_2 = x_3 = 0$:

$$(16 - 8u)1 + (10 - 2u)0 + (0 - u)0 + (4 - 4u)1 + 10u = 20 - 2u.$$
Figure 9.1 exhibits this family of linear functions for all Lagrangian relaxation solutions that are optimal for at least one value of u. The fact that we must maximize the Lagrangian objective means that for any particular value of u, $Z_D(u)$ is equal to the largest of these linear functions. Thus, the $Z_D(u)$ function is given by the upper envelope of this family of linear equations, shown as a darkened piecewise linear function in Figure 9.1. From this figure it is easy to see that u = 1 minimizes $Z_D(u)$.

Figure 9.2 also provides motivation for a general algorithm for finding u. As shown, the $Z_D(u)$ function is convex, and differentiable except at points where the Lagrangian problem has multiple optimal solutions. At differentiable points, the derivative of $Z_D(u)$ with respect to u is given by $10 - 8x_1 - 2x_2 - x_3 - 4x_4$, where x is an optimal solution to $(\mathrm{LR}_u)$. These facts also hold in general, with the gradient of the $Z_D(u)$ function at differentiable points given by $(b - Ax)$. These observations suggest that it might be fruitful to apply a gradient method to the minimization of $Z_D(u)$, with some adaptation at the points where $Z_D(u)$ is nondifferentiable. This has been achieved in a procedure called the subgradient method. At points where $Z_D(u)$ is nondifferentiable, the subgradient method selects arbitrarily from the set of alternative optimal Lagrangian solutions and uses the vector $(b - Ax)$ for this solution as though it were the gradient of $Z_D(u)$. The result is a procedure that determines a sequence of values for u by beginning at an initial point $u^0$ and applying the formula

$$u^{k+1} = \max\{0,\ u^k - t_k(b - Ax^k)\}.$$
FIGURE 9.1 Family of linear equations for Lagrangian relaxation solution (vertical axis: Z_D(u); horizontal axis: u).
In this formula, $t_k$ is a scalar stepsize and $x^k$ is an optimal solution to $(\mathrm{LR}_{u^k})$, the Lagrangian problem with dual variables set to $u^k$. The nondifferentiability also requires some variation in the way the stepsize is normally set in a gradient method.

IV. SETTING $t_k$

To gain insight into a sensible procedure for setting $t_k$, we discuss the results of the subgradient method applied to the example with three different rules for $t_k$. For the example we have:

$$u^{k+1} = \max\{0,\ u^k - t_k(10 - 8x_1^k - 2x_2^k - x_3^k - 4x_4^k)\}. \qquad (9.2)$$

Case 1: Subgradient Method with $t_k = 1$ for all k

In this first case $t_k$ is fixed at one on all iterations. As a result the formula is:
FIGURE 9.2 Composite behavior of Z_D(u), featuring its convex nature and its nondifferentiability property (vertical axis: Z_D(u); horizontal axis: u).
$$u^{k+1} = \max\{0,\ u^k - (10 - 8x_1^k - 2x_2^k - x_3^k - 4x_4^k)\}.$$

We start with $u^0 = 0$. Recall the table entry for u = 0:

u    x_1   x_2   x_3   x_4
0    1     0     0     1

$$u^1 = \max\{0,\ 0 - (10 - 8 \cdot 1 - 2 \cdot 0 - 0 - 4 \cdot 1)\} = \max\{0,\ -(-2)\} = 2.$$

Recall the table entry for u = 2:

u    x_1   x_2   x_3   x_4
2    0     1     0     0

$$u^2 = \max\{0,\ 2 - (10 - 8 \cdot 0 - 2 \cdot 1 - 0 - 4 \cdot 0)\} = \max\{0,\ -6\} = 0.$$

We now use the values for u = 0 as before,

$$u^3 = 0 - (-2) = 2,$$

and the values for u = 2 as before,

$$u^4 = \max\{0,\ 2 - 8\} = 0.$$

We see that the subgradient method oscillates between the values u = 0 and u = 2.

Case 2: Subgradient Method with $t_k = 1, 0.5, 0.25, \ldots$

In this second case $t_k$ converges to 0, with each successive value equal to half the value of the previous iteration.
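Therefore,

$$u^{k+1} = \max\left\{0,\ u^k - \left(\tfrac{1}{2}\right)^k (10 - 8x_1^k - 2x_2^k - x_3^k - 4x_4^k)\right\}.$$

We start once again with $u^0 = 0$. Here $t_0 = 1$, and we get the same result as in the preceding case:

$$u^1 = 0 - (-2) = 2.$$

With u = 2, we get

u    x_1   x_2   x_3   x_4
2    0     1     0     0

$$u^2 = \max\left\{0,\ 2 - \tfrac{1}{2}(10 - 8 \cdot 0 - 2 \cdot 1 - 0 - 4 \cdot 0)\right\} = \max\{0,\ 2 - 0.5(10 - 2)\} = \max\{0,\ 2 - 4\} = 0.$$

Proceeding similarly, we get

$$u^3 = 0 - \tfrac{1}{4}(-2) = \tfrac{1}{2}$$

$$u^4 = \tfrac{1}{2} - \tfrac{1}{8}(-2) = \tfrac{3}{4}$$

$$u^5 = \tfrac{3}{4} - \tfrac{1}{16}(-2) = \tfrac{7}{8}$$

$$u^6 = \tfrac{7}{8} - \tfrac{1}{32}(-2) = \tfrac{15}{16}.$$

In this case, the subgradient method behaves nicely and converges to the optimal value of u, namely u = 1.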
Case 3: Subgradient Method with $t_k = 1, 1/3, 1/9, \ldots$

In this final case, $t_k$ also converges to zero, but more quickly. Each successive value is equal to one-third the value of the previous iteration. Therefore,

$$u^{k+1} = \max\left\{0,\ u^k - \left(\tfrac{1}{3}\right)^k (10 - 8x_1^k - 2x_2^k - x_3^k - 4x_4^k)\right\}.$$

We start once again with

$$u^0 = 0$$

$$u^1 = 0 - (-2) = 2$$

$$u^2 = \max\left\{0,\ 2 - \tfrac{1}{3}(8)\right\} = 0$$

$$u^3 = 0 - \tfrac{1}{9}(-2) = \tfrac{2}{9} = 0.222$$

$$u^4 = 0.222 - \tfrac{1}{27}(-2) = 0.296$$

$$u^5 = 0.296 - \tfrac{1}{81}(-2) = 0.321$$

$$u^6 = 0.321 - \tfrac{1}{243}(-2) = 0.329$$

$$u^7 = 0.329 - \tfrac{1}{729}(-2) = 0.332.$$
In this case the subgradient method converges to u = 1/3, showing that if the stepsize converges to 0 too quickly, then the subgradient method will converge to a point other than the optimal solution. From these examples we suspect that the stepsize in the subgradient method should converge to 0, but not too quickly. These observations have been confirmed in a result (see Held et al. [5]) that states that if, as $k \to \infty$,

$$t_k \to 0 \quad \text{and} \quad \sum_{i=1}^{k} t_i \to \infty,$$

then $Z_D(u^k)$ converges to its optimal value $Z_D$. Note that Case 2 actually violates the second condition, since $\sum_i t_i \to 2$, thus showing that these conditions are sufficient but not necessary. A formula for $t_k$ that has proven effective in practice is

$$t_k = \frac{\lambda_k\left(Z_D(u^k) - Z^*\right)}{\sum_{i=1}^{m}\left(b_i - \sum_{j=1}^{n} a_{ij} x_j^k\right)^2}.$$

In this formula, $Z^*$ is the objective value of the best known feasible solution to (P) and $\lambda_k$ is a scalar chosen between 0 and 2. Frequently the sequence $\lambda_k$ is determined by starting with $\lambda_k = 2$ and reducing $\lambda_k$ by a factor of two whenever $Z_D(u^k)$ has failed to decrease in a specified number of iterations.
Justification for this formula, as well as many other interesting results on the subgradient method, is given in Held et al. [5]. The feasible value $Z^*$ can initially be set to 0 and then updated using the solutions obtained on those iterations in which the Lagrangian problem solution turns out to be feasible in the original problem. Unless we obtain a $u^k$ for which $Z_D(u^k) = Z^*$, there is no way of proving optimality in the subgradient method. To resolve this difficulty, the method is usually terminated upon reaching a specified iteration limit.

Other procedures that have been used for setting multipliers are called multiplier-adjustment methods. Multiplier-adjustment methods are heuristics for the dual problem that are developed for a specific application and exploit some special structure of the dual problem in that application. The first highly successful example of a multiplier-adjustment method was Erlenkotter's [3] algorithm for the uncapacitated location problem. By developing a multiplier-adjustment method specifically tailored for some problem class, one is usually able to improve on the subgradient method. However, because the subgradient method is easy to program and has performed robustly in a wide variety of applications, it is usually at least the initial choice for setting the multipliers in Lagrangian relaxation.

Returning to our example, we have obtained through the application of Lagrangian relaxation and the subgradient method a feasible solution with a value of 16 and an upper bound on the optimal value of 18. At this point, we could stop and be content with a feasible solution proven to be within about 12% of optimality, or we could complete the solution of the example to optimality using branch-and-bound, with bounds provided by our Lagrangian relaxation. In the next section we show how such an approach compares with more traditional linear programming-based branch-and-bound algorithms.
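The three stepsize rules of Cases 1 through 3 can be replayed with a short script. This is a sketch for the four-variable example only: the Lagrangian problem is solved by enumeration, exact fractions are used so that the iterates can be compared with the hand calculation, and the names `solve_lagrangian` and `subgradient` are hypothetical.

```python
from fractions import Fraction
from itertools import product

C = (16, 10, 0, 4)      # objective coefficients
A = (8, 2, 1, 4)        # dualized resource constraint
B = 10

FEASIBLE = [x for x in product((0, 1), repeat=4)
            if x[0] + x[1] <= 1 and x[2] + x[3] <= 1]


def solve_lagrangian(u):
    """One optimal solution of the Lagrangian problem for multiplier u."""
    return max(FEASIBLE,
               key=lambda x: sum((c - u * a) * xi for c, a, xi in zip(C, A, x)))


def subgradient(step, iterations, u0=Fraction(0)):
    """Iterate u <- max(0, u - t_k * (b - A x^k)) with t_k = step(k)."""
    u, history = u0, [u0]
    for k in range(iterations):
        x = solve_lagrangian(u)
        slack = B - sum(a * xi for a, xi in zip(A, x))   # b - A x^k
        u = max(Fraction(0), u - step(k) * slack)
        history.append(u)
    return history


case1 = subgradient(lambda k: Fraction(1), 4)            # t_k = 1
case2 = subgradient(lambda k: Fraction(1, 2 ** k), 6)    # t_k = 1, 1/2, 1/4, ...
case3 = subgradient(lambda k: Fraction(1, 3 ** k), 7)    # t_k = 1, 1/3, 1/9, ...
print([str(u) for u in case1])
print([str(u) for u in case2])
print([float(u) for u in case3])
```

When the Lagrangian problem has several optimal solutions, `max` simply keeps the first one it meets, which corresponds to the arbitrary selection the subgradient method allows.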
V. COMPARISON WITH LINEAR PROGRAMMING-BASED BOUNDS
In this section we compare Lagrangian relaxation with the upper bound obtained by relaxing the integrality requirement on $x$ and solving the resulting linear program. Let $Z_{LP}$ denote the optimal value of (P) with integrality on $x$ relaxed. We start by comparing $Z_{LP}$ for the example with the best upper bound of 18 obtained previously with Lagrangian relaxation. To facilitate this comparison, we first write out the standard LP dual of the example. Let $u$, $v_1$, and $v_2$ denote dual variables on constraints 9.5 through 9.8 and $w_j$ a dual variable on the constraint $x_j \le 1$. Then the LP dual of example 9.4 through 9.9 is
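$$
\begin{aligned}
\text{Min}\quad & 10u + v_1 + v_2 + w_1 + w_2 + w_3 + w_4 \\
& 8u + v_1 + w_1 \ge 16 \\
& 2u + v_1 + w_2 \ge 10 \\
& u + v_2 + w_3 \ge 0 \\
& 4u + v_2 + w_4 \ge 4 \\
& u,\ v_1,\ v_2,\ w_1, \dots, w_4 \ge 0.
\end{aligned}
$$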
The optimal solution to the primal LP is $x_1 = 1$, $x_2 = 0$, $x_3 = 0$, $x_4 = 1/2$, and the optimal solution to the dual LP is $u = 1$, $v_1 = 8$, $v_2 = w_1 = \dots = w_4 = 0$. In order to verify that each of these solutions is optimal, we simply substitute them in the primal and dual and observe that each is feasible and gives the same objective value 18. This exercise has demonstrated two interesting facts: $Z_{LP} = 18$, the same upper bound we obtained with Lagrangian relaxation; and the LP dual variable value of $u = 1$ on constraint 9.6 is exactly the value that gave the minimum upper bound of 18 on the Lagrangian problem. These observations are part of a pattern that holds generally and is nicely summarized in a result from Geoffrion [2] which states that $Z_D \le Z_{LP}$ for any Lagrangian relaxation. This fact is established by the following sequence of relations between optimization problems.
$$
\begin{aligned}
Z_D &= \min_{u \ge 0}\ \max_x \{\, cx + u(b - Ax) : Dx \le e,\ x \ge 0 \text{ and integral} \,\} \\
&\le \min_{u \ge 0}\ \max_x \{\, cx + u(b - Ax) : Dx \le e,\ x \ge 0 \,\} \\
&= \min_{u \ge 0,\ v \ge 0} \{\, ub + ve : uA + vD \ge c \,\} && \text{(by LP duality)} \\
&= \max_x \{\, cx : Ax \le b,\ Dx \le e,\ x \ge 0 \,\} && \text{(by LP duality)} \\
&= Z_{LP}.
\end{aligned}
$$
Besides showing that $Z_D \le Z_{LP}$, the preceding logic indicates when $Z_D = Z_{LP}$ and when $Z_D < Z_{LP}$. The inequality in the sequence of relations connecting $Z_D$ and $Z_{LP}$ is between the Lagrangian problem and the Lagrangian problem with integrality relaxed. Hence, we can have $Z_D < Z_{LP}$ only if this inequality holds strictly or, conversely, $Z_D = Z_{LP}$ if the Lagrangian problem is unaffected by removing the integrality requirement on $x$. In the Lagrangian problem for the original example, the optimal values of the variables will be integer whether we require it or not. This implies that we must have $Z_D = Z_{LP}$, something that we have already observed numerically.
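The substitution check described in this section (both solutions feasible, both objective values equal to 18) can be reproduced with exact arithmetic. The sketch below assumes the primal data implied by the LP dual written out above, namely the objective $16x_1 + 10x_2 + 4x_4$ and the rows $8x_1 + 2x_2 + x_3 + 4x_4 \le 10$, $x_1 + x_2 \le 1$, $x_3 + x_4 \le 1$; the variable names are illustrative only.

```python
from fractions import Fraction as F

# Primal data read off from the dual written out in the text (an assumption).
c = [16, 10, 0, 4]
rows = [[8, 2, 1, 4], [1, 1, 0, 0], [0, 0, 1, 1]]
rhs = [10, 1, 1]

x = [F(1), F(0), F(0), F(1, 2)]     # claimed primal LP optimum
y = [F(1), F(8), F(0)]              # claimed dual values u, v1, v2
w = [F(0)] * 4                      # duals on the bounds x_j <= 1


def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))


primal_feasible = (all(dot(r, x) <= b for r, b in zip(rows, rhs))
                   and all(0 <= xj <= 1 for xj in x))
primal_value = dot(c, x)

# Dual feasibility: one inequality per primal variable, plus nonnegativity.
dual_feasible = (all(sum(y[i] * rows[i][j] for i in range(3)) + w[j] >= c[j]
                     for j in range(4))
                 and all(val >= 0 for val in y + w))
dual_value = dot(rhs, y) + sum(w)

print(primal_feasible, dual_feasible, primal_value, dual_value)
```

Equal primal and dual objective values at feasible points certify that both solutions are optimal, which is the argument used in the text.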
This result also shows that we can improve the upper bound by using a Lagrangian relaxation in which the variables are not naturally integral.
VI. AN IMPROVED RELAXATION
An alternative relaxation for the example is given below.
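$$
Z_D(v_1, v_2) = \max\ (16 - v_1)x_1 + (10 - v_1)x_2 + (0 - v_2)x_3 + (4 - v_2)x_4 + v_1 + v_2 \tag{9.10}
$$
subject to
$$
8x_1 + 2x_2 + x_3 + 4x_4 \le 10 \tag{9.11}
$$
$$
0 \le x_j \le 1, \quad j = 1, \dots, 4 \tag{9.12}
$$
$$
x_j \text{ integer}, \quad j = 1, \dots, 4. \tag{9.13}
$$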
In this relaxation, we have dualized the two constraints $x_1 + x_2 \le 1$ and $x_3 + x_4 \le 1$ (those carrying the multipliers $v_1$ and $v_2$) and obtained a relaxation that is a knapsack problem. Although this problem is known to be difficult in the worst case, it can be solved practically using a variety of efficient knapsack algorithms such as dynamic programming. Because the continuous and integer solutions to the knapsack problem can differ, the analytic result obtained in the previous section tells us that this relaxation may provide bounds that are better than linear programming. This is confirmed empirically in Table 9.3, which shows the application of the subgradient method to this relaxation. We begin with both dual variables equal to zero and in four iterations we converge to a dual solution in which the upper bound of 16 is equal to the objective value of the feasible solution obtained when we solve the Lagrangian problem. Hence, Lagrangian relaxation has completely solved the original problem. In this example we have set the stepsize using the formula given previously with $\lambda_k = 1$.

TABLE 9.3 The subgradient method applied to improved relaxation

v1    v2    x1   x2   x3   x4    Z_D(v1, v2)    Remarks
0     0     1    1    0    0     26
13    0     0    0    0    1     17             Feasible with Z = 4
0     0     1    1    0    0     26
11    0     1    0    0    0     16             Feasible with Z = 16

This example illustrates that with careful choice of which constraints to dualize, Lagrangian relaxation can provide results that are significantly superior to LP-based branch-and-bound. The choice of constraints is to some extent an art, much like the formulation itself. Typically, one will construct several alternative relaxations and evaluate them, both empirically and analytically, using the result on the quality of bounds presented in the previous section. The alternative relaxations can be constructed in one of two ways: begin with an integer programming formulation and select different constraints to dualize, or alternatively, begin with some easy to solve model such as the knapsack or shortest-route problem which is close to, but not exactly the same as, the problem one wishes to solve. Then try to add a set of side constraints to represent those aspects of the problem of interest that are missing in the simpler model. A Lagrangian relaxation can be obtained by dualizing the side constraints that have been added.
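The iterations summarized in Table 9.3 can be retraced with a short script. The sketch below is a minimal illustration, not the authors' implementation: it solves the knapsack Lagrangian problem by brute-force enumeration (instead of dynamic programming), breaks ties by keeping the first maximizer found, and assumes the stepsize $t_k = \lambda_k (Z_D - Z^*)/\|b - Ax^k\|^2$ with $\lambda_k = 1$ and $Z^*$ initialized to 0. All function names are hypothetical.

```python
from itertools import product

C = [16, 10, 0, 4]                      # objective coefficients
KNAPSACK, CAPACITY = [8, 2, 1, 4], 10   # constraint kept in the Lagrangian problem
A = [[1, 1, 0, 0], [0, 0, 1, 1]]        # dualized rows: x1 + x2 <= 1, x3 + x4 <= 1
B = [1, 1]


def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))


def slack(x):
    """b - Ax for the dualized constraints."""
    return [bi - dot(row, x) for row, bi in zip(A, B)]


def solve_lagrangian(v):
    """Maximize cx + v(b - Ax) over 0-1 vectors satisfying the knapsack row."""
    best_x, best_val = None, float("-inf")
    for x in product((0, 1), repeat=4):
        if dot(KNAPSACK, x) > CAPACITY:
            continue
        val = dot(C, x) + dot(v, slack(x))
        if val > best_val:              # strict test: ties keep the first vector
            best_x, best_val = x, val
    return best_x, best_val


def subgradient(lam=1.0, max_iter=20):
    v, z_star, history = [0.0, 0.0], 0.0, []
    for _ in range(max_iter):
        x, zd = solve_lagrangian(v)
        g = slack(x)
        if all(gi >= 0 for gi in g):    # x is feasible in the original problem
            z_star = max(z_star, dot(C, x))
        history.append((tuple(v), x, zd, z_star))
        if zd <= z_star or not any(g):  # bound meets best feasible value
            break
        t = lam * (zd - z_star) / dot(g, g)
        v = [max(0.0, vi - t * gi) for vi, gi in zip(v, g)]
    return history


for v, x, zd, z_star in subgradient():
    print(v, x, zd, z_star)
```

Each printed row gives the multipliers, the Lagrangian solution, the bound $Z_D(v_1, v_2)$, and the best feasible value found so far.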
VII. SUMMARY OF CONCEPTS
Up to this point, the concept of Lagrangian relaxation has been developed piecemeal on an example. We can now formulate and present a generic Lagrangian relaxation algorithm. Figure 9.3 shows a generic Lagrangian relaxation algorithm consisting of several major steps. The first step is the standard branch-and-bound process in which a tree of solution alternatives is constructed with certain variables fixed to specified values at each node of the tree. These specified values are passed from block A to block B together with $Z^*$, the objective value of the currently best known feasible solution, and starting multipliers $u^0$. In blocks B and C, we iterate between adjusting the multipliers with the subgradient update formula to obtain a new multiplier value $u^k$, and solving the Lagrangian problem to obtain a new Lagrangian solution $x^k$. The process continues until we either reach an iteration limit or discover an upper bound for this node that is less than or equal to the current best known feasible value $Z^*$. At this point we pass back to block A the best upper bound we have discovered together with any feasible solution that may have been obtained as a result of solving the Lagrangian problem.
In Fisher's experience, it is rare in practice that the Lagrangian solution will be feasible in the original problem (P). However, it is not uncommon that the Lagrangian solution will be nearly feasible and can be made feasible with some minor modifications. A systematic procedure for doing this can be applied in block C and constitutes what might be called a "Lagrangian heuristic." Lagrangian heuristics have been vital to the computational success of many applications, such as those described in Fisher [1], and may well prove to be as important as the use of Lagrangians to obtain upper bounds. It is not uncommon in large-scale applications to terminate the process depicted in Figure 9.3 before the branch-and-bound tree has been explored sufficiently to prove optimality. In this case the Lagrangian algorithm is really a heuristic with some nice properties, such as an upper bound on the amount by which the heuristic solution deviates from optimality.

FIGURE 9.3 A generic Lagrangian relaxation algorithm. (Flowchart. Block A, construction of the branch-and-bound tree, passes to block B a node of the tree, the best known feasible value $Z^*$, and the initial multiplier value $u^0$, with $k = 0$. Block B, adjustment of multipliers: if $k = 0$ go to block C; if $Z_D(u^k) \le Z^*$ or the iteration limit is reached, return to block A with the upper bound and possibly a feasible solution; otherwise set $u^{k+1} = \max\{0, u^k - t_k(b - Ax^k)\}$ and $k = k + 1$. Block C solves the Lagrangian problem and updates $Z^*$ if the Lagrangian solution $x^k$ is feasible in the primal problem.)
VIII. PAST APPLICATIONS
A brief description of several instances in which Lagrangian relaxation has been used in practice should give the flavor of the kinds of problems for which Lagrangian relaxation has been successful.
Bell et al. describe the successful application of the algorithm to Air Products and Chemicals, which has resulted in a reduction of distribution cost of about $2 million per year.
Fisher et al. [1] discuss the application in the Clinical Systems Division of DuPont of an algorithm for vehicle routing that is based on a Lagrangian relaxation algorithm for the generalized assignment problem.
Graves and Lamar [4] treat the problem of designing an assembly system by choosing from available technology a group of resources to perform certain operations. The choices cover people, single-purpose machines, narrow-purpose pick-place robots, and general-purpose robots. Their work has been applied in a number of industries, including the design of robot assembly systems for the production of automobile alternators. Graves [3] has also discussed the use of Lagrangian relaxation to address production planning problems from a hierarchical perspective.
Mulvey [9] is concerned with condensing a large database by selecting a subset of "representative elements." He has developed a Lagrangian relaxation-based clustering algorithm that determines a representative subset for which the loss in information is minimized in a well-defined sense. He has used this algorithm to reduce the 1977 U.S. Statistics of Income File for Individuals maintained by the Office of Tax Analysis from 155212 records to 74762 records.
The application described in Shepardson and Marsten [7] involves the scheduling of personnel who must work two duty periods, a morning shift and an afternoon shift. Their algorithm determines optimal schedules for each worker so as to minimize cost and satisfy staffing requirements. Helsinki City Transport has applied this algorithm to bus crew scheduling.
Van Roy and Gelders discuss the use of Lagrangian relaxation for a particular problem arising in distribution.
In each of the applications described above, development of the Lagrangian relaxation algorithm required a level of involvement on the part of skilled analysts that is similar to that required in the use of dynamic programming. Just as some insight into a problem is required before dynamic programming can be applied fruitfully, it is generally nontrivial to discover a Lagrangian relaxation that is computationally effective. Moreover, once this has been done, the various steps in the algorithm must be programmed more or less from scratch. Often this process can be made easier by the availability of an "off the shelf" algorithm for the Lagrangian problem if it is a well-known model, such as a network flow, shortest route, minimum spanning tree, or knapsack problem.
Despite the level of effort required in implementing Lagrangian relaxation, the concept is growing in popularity because the ability it affords to exploit special problem structure often is the only hope for coping with large real problems. For the future it remains to be seen whether Lagrangian relaxation will continue to exist as a technique that requires a significant ad hoc development effort or whether the essential building blocks of Lagrangian relaxation will find their way into user-friendly mathematical programming codes such as LINDO or IFPS OPTIMUM. Such a development could provide software for carrying out Steps A and B in the generic flowchart as well as a selection of algorithms for performing Step C for the most popular easy to solve models. It would then be left to the analyst to decide which constraints to dualize and to specify which of the possible Lagrangian problem algorithms to use.
IX. SUMMARY
A. Overview
The branch-and-bound technique for solving integer programming problems is a powerful solution technique despite the computational requirements involved. Most computer codes based on the branch-and-bound technique differ from standard known procedures in the details of selecting the branching variable at a node and the sequence in which the subproblems are examined. These rules are based on heuristics. The basic disadvantage of the branch-and-bound algorithm is the necessity of solving a complete linear program at each node. In large problems, this could be very time consuming, particularly when the only information needed at the node may be its optimum objective value. This point is clarified by realizing that once a good bound is obtained, many nodes can be discarded as soon as their optimum objective values are known. The preceding point led to the development of a procedure whereby it may be unnecessary to solve all the subproblems of the branch-and-bound tree. The idea is to estimate an upper bound (assuming a maximization problem) on the optimum objective value at each node. Should this upper bound become smaller than the objective value associated with the best available integer solution, the node is discarded.
B. Algorithm of Solution Using Lagrangian Relaxation Approach
Consider an integer programming problem of the following form.
$Z = \max\ cx$
Subject to: $Ax \le b$, $Dx \le e$, $x \ge 0$ and integer,
where $x$ is $n \times 1$, the elements of $x$ are integers, $b$ is $m \times 1$, $e$ is $k \times 1$, and all other matrices have conformable dimensions. The algorithm is summarized as follows.
Step 1
We assume that the constraints of the problem have been partitioned into two sets, $Ax \le b$ and $Dx \le e$, so that the problem is relatively easy to solve if the constraint set $Ax \le b$ is removed.
Step 2
To create the Lagrangian problem we first define an $m$-vector of nonnegative multipliers $U$ and add the nonnegative term $U(b - Ax)$ to the objective function to form $Z_D(U)$,
$$Z_D(U) = \max\ cx + U(b - Ax)$$
Subject to: $Dx \le e$, $x \ge 0$ and integral.
It is clear that the optimal value of this problem for $U$ fixed at a nonnegative value is an upper bound on $Z$ because we have merely added a nonnegative term to the objective function.
Step 3
The new objective $Z_D(U)$ is nondifferentiable at points where the Lagrangian problem has multiple optimal solutions. This led to the use of the subgradient method for minimization of $Z_D(U)$, with some adaptation at the points where $Z_D(U)$ is nondifferentiable. Assuming we begin at a multiplier value $U^0$, then
$$U^{k+1} = \max\{0,\ U^k - t_k(b - Ax^k)\},$$
where $t_k$ is a stepsize and $x^k$ is an optimal solution to $(LR_{U^k})$, the Lagrangian problem with $U = U^k$. The stepsize $t_k$ should converge to zero but not too quickly; that is, if
$$t_k \to 0 \quad \text{and} \quad \sum_{i=1}^{k} t_i \to \infty,$$
then $Z_D(U^k)$ converges to its optimal value. A formula for $t_k$ is given by
$$t_k = \frac{\lambda_k \left( Z_D(U^k) - Z^* \right)}{\| b - Ax^k \|^2}. \tag{9.14}$$
$Z^*$ is the objective value of the best known feasible solution and $\lambda_k$ is a scalar chosen between 0 and 2, typically starting at $\lambda_k = 2$ and reduced by a factor of two whenever $Z_D(U^k)$ has failed to decrease in a specified number of iterations. The feasible value $Z^*$ initially can be set to 0 and then updated using the solutions obtained on those iterations in which the Lagrangian problem solution turns out to be feasible in the original problem.
Step 4
Continue the procedure in Step 3 until we either reach an iteration limit or discover an upper bound for this node that is less than or equal to the best known feasible value $Z^*$.
C. Power System Application: Scheduling in Power Generation Systems
1. The Model
A simplified version of the power scheduling problem is formulated as a mixed integer programming problem having a special structure that facilitates rapid computation. First, reserve and demand constraints are included in the basic model. Then other constraints are included in the formulation without affecting the basic structure of the problem. The integer variables in the model indicate whether a specified generating unit is operating during a period:
$x_{it} = 1$ if generating unit $i$ is operating in period $t$
$x_{it} = 0$ if generating unit $i$ is off
$i = 1, \dots, I$; $t = 1, 2, \dots, T$.
The continuous variable $y_{ikt}$ represents the proportion of the available capacity $M_{ik}$ that is actually used at period $t$, for $k = 1, \dots, K_i$, where $K_i$ is the number of linear segments of the production cost curve. Because $y_{ikt} > 0$ only if $x_{it} = 1$, this constraint is introduced: $0 \le y_{ikt} \le x_{it}$.
The total energy output from generator $i$ at time $t$ is given by
$$m_i x_{it} + \sum_{k=1}^{K_i} M_{ik} y_{ikt} \tag{9.15}$$
$m_i$ = Minimum unit capacity
$M_i$ = Maximum unit capacity
$w_{it}$ = 1 if the unit is started up at time $t$, 0 otherwise
$z_{it}$ = 1 if the unit is shut down in period $t$, 0 otherwise
$c_i$ = Startup cost of generator $i$
$g_i$ = Operating cost of unit $i$ at its minimum capacity for 1 hour
$h_t$ = Number of hours in period $t$.
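The objective function to be minimized for the basic $T$-period scheduling model is
$$\sum_{i=1}^{I} \sum_{t=1}^{T} \Big\{ c_i w_{it} + d_i z_{it} + h_t g_i x_{it} + \sum_{k=1}^{K_i} M_{ik} h_t g_{ik} y_{ikt} \Big\}, \tag{9.16}$$
with $d_i$ the cost coefficient attached to the shut-down variable $z_{it}$ and $g_{ik}$ the cost coefficient of segment $k$ of the production cost curve.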
Let $D_t$ represent the demand level at time $t$. Then the demand constraint is given by
$$\sum_{i=1}^{I} \Big( m_i x_{it} + \sum_{k=1}^{K_i} M_{ik} y_{ikt} \Big) \ge D_t. \tag{9.17}$$
Let $R_t$ be the minimum reserve quantity at time $t$. Then the reserve constraint is given by
$$\sum_{i=1}^{I} M_i x_{it} \ge R_t. \tag{9.18}$$
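By imposing only the constraints
$$w_{it} \ge x_{it} - x_{i,t-1}, \qquad z_{it} \ge x_{i,t-1} - x_{it}, \qquad w_{it} \ge 0, \quad z_{it} \ge 0,$$
and because the objective function is to be minimized, $w_{it}$ and $z_{it}$ must equal either zero or 1. The mixed integer model for the basic power scheduling problem is
Minimize
$$\sum_{i=1}^{I} \sum_{t=1}^{T} \Big\{ c_i w_{it} + d_i z_{it} + h_t g_i x_{it} + \sum_{k=1}^{K_i} M_{ik} h_t g_{ik} y_{ikt} \Big\}$$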
Subject to: constraints (9.20) through (9.27), which collect the demand, reserve, start-up, shut-down, capacity-proportion, and integrality conditions introduced above.
2. Relaxation and Decomposition of the Model
Lagrangian relaxation is used to decompose the problem into I single generator subproblems. The advantage of decomposing by generator is that the constraints and costs that depend on the state of the generator from period to period can easily be considered in the subproblems. The solution of the relaxed problem provides a lower bound on the optimal solution of the original problem. The Lagrangian relaxation model is:
(9.28)
(9.29)
where $u_t$, $v_t$ are nonnegative real numbers (Lagrange multipliers).
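As a hedged illustration of what such a relaxation looks like, suppose the demand and reserve constraints are the ones dualized, with multipliers $u_t$ and $v_t$ respectively, and that both are of the "greater than or equal" form. Under those assumptions (this is a reconstruction for illustration, not a quotation of the model), the relaxed objective and its regrouping by generator would read:

$$
\begin{aligned}
L(u, v) = \min\ & \sum_{i=1}^{I}\sum_{t=1}^{T} \Big\{ c_i w_{it} + d_i z_{it} + h_t g_i x_{it} + \sum_{k=1}^{K_i} M_{ik} h_t g_{ik} y_{ikt} \Big\} \\
& + \sum_{t=1}^{T} u_t \Big( D_t - \sum_{i=1}^{I} \big( m_i x_{it} + \sum_{k=1}^{K_i} M_{ik} y_{ikt} \big) \Big)
 + \sum_{t=1}^{T} v_t \Big( R_t - \sum_{i=1}^{I} M_i x_{it} \Big) \\
= \ & \sum_{t=1}^{T} (u_t D_t + v_t R_t)
 + \sum_{i=1}^{I} \min \sum_{t=1}^{T} \Big\{ c_i w_{it} + d_i z_{it} + (h_t g_i - u_t m_i - v_t M_i) x_{it} + \sum_{k=1}^{K_i} M_{ik} (h_t g_{ik} - u_t) y_{ikt} \Big\}.
\end{aligned}
$$

The second line shows why the problem separates: once the coupling constraints are priced into the objective, each generator $i$ contributes its own independent term, and the sign of $h_t g_{ik} - u_t$ alone decides whether segment $k$ is used.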
The Lagrangian relaxation problem decomposes into I single generator subproblems of the form Minimize
(9.30)
Subject to: the set of constraints $C$.
Figure 9.4 shows a graph that could be used for the solution of the subproblems in the basic model. The upper state in each period represents the ON state and the lower state represents the OFF state for the generator. The transition arcs on the graph represent feasible decisions and the arc lengths are the costs associated with the decisions. For example, if the generator is off in period $t - 1$ then $x_{i,t-1} = 0$, and if it is on in period $t$, $x_{it} = 1$. Consequently, $w_{it} \ge 1$, $z_{it} \ge 0$, and $0 \le y_{ikt} \le 1$. To minimize the cost given these values of $x_{i,t-1}$ and $x_{it}$, we set $w_{it} = 1$, $z_{it} = 0$, $y_{ikt} = 0$ if $g_{ik} h_t - u_t \ge 0$, and $y_{ikt} = 1$ if $g_{ik} h_t - u_t < 0$. The lengths of other arcs can be determined in a similar way. A path through the graph specifies an operating schedule for the generator, and the problem of finding the minimum cost schedule becomes a shortest path problem on an acyclic state graph.
FIGURE 9.4 Basic model state graph (ON and OFF states in each period).
By expanding the state graph, it is also possible to represent certain aspects of the real problem that were omitted from the basic model. The expanded state graph is shown in Figure 9.5. The state model in Figure 9.5 can accommodate many extensions such as
1. Time-dependent start-up cost
2. Minimum up and down constraints
3. Generator availability restriction.
FIGURE 9.5 Graph with time-dependent start-up (states UP, DOWN 1, DOWN 2, DOWN 3).
The time-dependent start-up costs are added to the model by varying the costs on the transition arcs that lead from a down state to an up state. The minimum up- and down-times are enforced by eliminating some of the transition arcs in the state graph. For example, if a generator must be off for at least three time periods, the transition arcs from the DOWN 1 and DOWN 2 states to the UP state in the figure would be eliminated.
3. The Solution Technique
The backbone of the technique is a branch-and-bound procedure that builds an enumeration tree for the zero-one variables $x_{it}$. Each node in the tree is characterized by a set of $x_{it}$ variables with fixed values. For the basic model the problem represented by a node in the enumeration tree has the form of the basic problem with the appropriate set of the $x_{it}$ variables fixed. The Lagrangian relaxation of the problem at each node is solved to obtain a lower bound on the optimal solution for the problem at the node. To obtain the solution at a node, a simple shortest-path algorithm is used to solve each of the $I$ single generator subproblems using dynamic programming.
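The single generator shortest-path computation can be sketched as a two-state dynamic program. The following is a minimal illustration under stated assumptions, not the authors' code: the per-period net cost of being ON (`on_cost`, that is, operating cost less whatever credit the multipliers give) is assumed to be precomputed, the OFF state costs nothing, and the names `min_cost_schedule`, `startup`, and `shutdown` are hypothetical.

```python
def min_cost_schedule(on_cost, startup, shutdown, initially_on=False):
    """Shortest path through the ON/OFF state graph of one generator.

    on_cost[t] is the (assumed, precomputed) net cost of being ON in
    period t; OFF costs nothing. Returns (total cost, schedule), where
    schedule[t] is 1 if the unit is ON in period t and 0 otherwise.
    """
    INF = float("inf")
    # cost[s]: cheapest way to end the periods seen so far in state s
    cost = [INF, 0.0] if initially_on else [0.0, INF]
    back = []                                   # back[t][s] = best predecessor
    for oc in on_cost:
        new_cost, choice = [INF, INF], [0, 0]
        for prev in (0, 1):
            if cost[prev] == INF:
                continue
            for cur in (0, 1):
                arc = oc if cur == 1 else 0.0
                if prev == 0 and cur == 1:
                    arc += startup              # OFF -> ON transition arc
                if prev == 1 and cur == 0:
                    arc += shutdown             # ON -> OFF transition arc
                if cost[prev] + arc < new_cost[cur]:
                    new_cost[cur] = cost[prev] + arc
                    choice[cur] = prev
        back.append(choice)
        cost = new_cost
    state = 0 if cost[0] <= cost[1] else 1
    total, schedule = cost[state], []
    for choice in reversed(back):               # trace the path backwards
        schedule.append(state)
        state = choice[state]
    schedule.reverse()
    return total, schedule


print(min_cost_schedule([5, -3, -4, 2], startup=2, shutdown=1))
```

Minimum up- and down-times of the kind shown in Figure 9.5 would enlarge the state set (for example UP, DOWN 1, DOWN 2, DOWN 3) and remove the forbidden arcs, but the recursion itself would stay the same.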
X. ILLUSTRATIVE EXAMPLES
Example 1
Consider the multidivisional problem
Maximize $z = 10x_1 + 5x_2 + 8x_3 + 7x_4$
Subject to:
$6x_1 + 5x_2 + 4x_3 + 6x_4 \le 40$
$3x_1 + x_2 \le 15$
$x_1 + x_2 \le 10$
$x_3 + 2x_4 \le 10$
$x_j \ge 0$.
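1. Explicitly construct the complete reformulated version of this problem in terms of the $\rho_{jk}$ decision variables that would be generated as needed and used by the decomposition principle.
2. Use the decomposition principle to solve this problem.
Solution
Step 1
In matrix notation, the problem is:
Maximize $Z = cx$
Subject to: $Ax \le b$, $Dx \le e$,
where $x = [x_1, x_2, x_3, x_4]^T$ and $c = [10, 5, 8, 7]$, with the full constraint set
$$\begin{bmatrix} 6 & 5 & 4 & 6 \\ 3 & 1 & 0 & 0 \\ 1 & 1 & 0 & 0 \\ 0 & 0 & 1 & 2 \end{bmatrix} x \le \begin{bmatrix} 40 \\ 15 \\ 10 \\ 10 \end{bmatrix}.$$
By partitioning this set of constraints, we obtain $Ax \le b$ and $Dx \le e$, where
$$A = [\,6 \ \ 5 \ \ 4 \ \ 6\,], \quad b = [\,40\,], \quad D = \begin{bmatrix} 3 & 1 & 0 & 0 \\ 1 & 1 & 0 & 0 \\ 0 & 0 & 1 & 2 \end{bmatrix}, \quad e = \begin{bmatrix} 15 \\ 10 \\ 10 \end{bmatrix}.$$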
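Step 2
The Lagrangian is defined as
$$Z_D(u) = \max\ cx + u(b - Ax)$$
$$\therefore\ Z_D(u) = \max\ 10x_1 + 5x_2 + 8x_3 + 7x_4 + u(40 - 6x_1 - 5x_2 - 4x_3 - 6x_4)$$
Subject to: $Dx \le e$, which is
$3x_1 + x_2 \le 15$, $x_1 + x_2 \le 10$, $x_3 + 2x_4 \le 10$, $x_j \ge 0$.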
Step 3
Assume $u^k = 1$ for $k = 0$. Then
Maximize $Z_D(u^0) = 10x_1 + 5x_2 + 8x_3 + 7x_4 + 40 - 6x_1 - 5x_2 - 4x_3 - 6x_4 = 4x_1 + 4x_3 + x_4 + 40$,
such that $Dx \le e$. The solution to this integer programming problem yields $x^0 = [\ ,\ ,\ ,\ ]^T$ and $Z_D(u^0) = $ .
Step 4 Iterative process.
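One way to carry out Step 3 numerically, and to obtain the subgradient needed to start Step 4, is to enumerate the integer points of $Dx \le e$. The sketch below is only an illustration: it assumes the variables are integer (as Step 3 states), uses brute-force enumeration because the feasible set is tiny, and the names `solve_lagrangian` and `UPPER` are hypothetical.

```python
from itertools import product

C = [10, 5, 8, 7]
A, B = [6, 5, 4, 6], 40                        # dualized constraint Ax <= b
D = [[3, 1, 0, 0], [1, 1, 0, 0], [0, 0, 1, 2]]
E = [15, 10, 10]
UPPER = [5, 10, 10, 5]                         # bounds implied by Dx <= e, x >= 0


def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))


def solve_lagrangian(u):
    """Maximize cx + u(b - Ax) over integer x with Dx <= e, x >= 0."""
    best_x, best_val = None, float("-inf")
    for x in product(*(range(ub + 1) for ub in UPPER)):
        if any(dot(row, x) > rhs for row, rhs in zip(D, E)):
            continue
        val = dot(C, x) + u * (B - dot(A, x))
        if val > best_val:
            best_x, best_val = x, val
    return best_x, best_val


x0, zd0 = solve_lagrangian(1)
residual = B - dot(A, x0)                      # subgradient b - Ax at x0
print(x0, zd0, residual)
```

A negative residual means the dualized constraint is violated at $x^0$, so the update $u^{k+1} = \max\{0, u^k - t_k(b - Ax^k)\}$ would raise the multiplier on the next iteration.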
XI. CONCLUSIONS
This chapter discussed a Lagrangian problem in which the complicated constraints were replaced with a penalty term in the objective function involving the amount of violation of the constraints and their dual variables. The Lagrangian relaxation concept and the methods for setting the multipliers were first discussed in Sections II to IV. A comparison with LP-based bounds was presented in Section V, followed by an improved relaxation in Section VI. Practical applications, such as those for power systems, were presented in Sections VII to IX.
XII. PROBLEM SET
Problem 9.1
Problem 9.2
Maximize $z = 3x_1 + 5x_2$
Subject to:
$3x_1 + 2x_2 \le 18$
$x_1 \le 4$
$x_j \ge 0$.
Problem 9.3
Apply the decomposition principle to the following problem.
Maximize $z = 6x_1 + 7x_2 + 3x_3 + 5x_4 + x_5 + x_6$
Subject to:
$x_1 + x_2 + x_3 + x_4 + x_5 + x_6 \le 50$
$x_1 + x_2 \le 10$
$x_2 \le 8$
$5x_3 + x_4 \le 12$
$x_5 + x_6 \ge 5$
$x_5 + 5x_6 \le 50$
$x_j \ge 0$.
Problem 9.4
Solve the following problem using the decomposition algorithm.
Maximize $z = 10x_1 + 2x_2 + 4x_3 + x_4$
Subject to:
$x_1 + 4x_2 - x_3 \ge 8$
$2x_1 + x_2 + x_3 \ge 2$
$3x_1 + x_4 + x_5 \ge 4$
$x_1 + 2x_4 - x_5 \ge 10$
$x_j \ge 0$.
REFERENCES
1. Fisher, M. L. The Lagrangian Relaxation Method for Solving Integer Programming Problems, Management Science, Vol. 27, no. 1, 1981, pp. 1-18.
2. Geoffrion, A. M. Lagrangian Relaxation and Its Uses in Integer Programming, Mathematical Programming Study, Vol. 2, 1974, pp. 82-114.
3. Graves, S. C. Using Lagrangian Techniques to Solve Hierarchical Production Planning Problems, Management Science, Vol. 28, no. 3, 1982, pp. 260-275.
4. Graves, S. C. and Lamar, B. W. An Integer Programming Procedure for Assembly System Design Problems, Operations Research, Vol. 31, no. 3 (May-June 1983), pp. 522-545.
5. Held, M. H., Wolfe, P., and Crowder, H. D. Validation of Subgradient Optimization, Mathematical Programming, Vol. 6, no. 1, 1974, pp. 62-88.
6. Muckstadt, J. A. and Koenig, S. An Application of Lagrangian Relaxation to Scheduling in Power Generation Systems, Operations Research, Vol. 25, no. 3, May-June 1977.
7. Shepardson, F. and Marsten, R. E. A Lagrangian Relaxation Algorithm for the Two Duty Period Scheduling Problem, Management Science, 1980.
8. Shepardson, F. and Marsten, R. E. Solving a Distribution Problem with Side Constraints, European Journal of Operations Research, Vol. 6, no. 1 (January), pp. 61-66.
9. Mulvey, J. Reducing the US Treasury's Taxpayer Data Base by Optimization, Interfaces, Vol. 10, no. 5 (October), 1980, pp. 101-112.
Chapter 10 Decomposition Method
I. INTRODUCTION
In large-scale systems, a special class of linear programming problems is posed as multidimensional problems and is treated by the decomposition principle, a streamlined version of the LP simplex method. The decomposition principle has special characteristic features in that its formulation exploits certain matrices with distinct structures. These matrices, representing the formulated problems, are generally divided into two parts, namely, one with the “easy” constraints and the other with the “complicated” constraints. The partitioning is done such that the desired diagonal submatrices and identity matrices are obtained in the reformulation of the problem. The decomposition principle (credited to Dantzig and Wolfe [5]) enables large-scale problems to be solved by exploiting these special structures. We note that the decomposition method can be used for any matrix $A$ in the formulation:
Minimize $z = c^T x$
Subject to: $Ax = b$.
However, the method is most effective when the matrix $A$ has a certain structure, as explained in the next section.
II. FORMULATION OF THE DECOMPOSITION PROBLEM
Consider the linear programming problem
Minimize $Z = c^T x$
Subject to: $Ax = b$, $x_i \ge 0$ for all $i \in \{1, \dots, N\}$,
where
$Z$ = Scalar objective function
$c^T$ = Coefficient vector of objective function
$A$ = Coefficient matrix of the equality constraints
$b$ = Right-hand-side vector of the constraints
$x$ = Vector of unknown or decision variables.
If matrix $A$ has the special form of a multidivisional problem, then by applying a revised simplex method, we start with
$$A = \begin{bmatrix} A_1 & A_2 & \cdots & A_N \\ A_{N+1} & 0 & \cdots & 0 \\ 0 & A_{N+2} & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & A_{2N} \end{bmatrix},$$
in which the first block row holds the “complicated” constraints and the remaining block rows hold the “easy” constraints. It should be noted that some of the $A_{N+j}$ blocks may be empty arrays. Vector $b$ is also partitioned accordingly into $N + 1$ vectors such that
$b = [b_0, b_1, \dots, b_N]^T$.
Similarly, vector $c$ is also partitioned into $N$ row vectors to obtain
$c = [c_1, c_2, \dots, c_N]$.
Similarly, we have
$x = [x_1, x_2, \dots, x_N]^T$.
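Therefore, the problem takes the form:
Minimize $Z = \sum_{j=1}^{N} c_j x_j$
Subject to:
$\sum_{j=1}^{N} L_j x_j = b_0$
$A_j x_j = b_j$
$x_j \ge 0 \quad \forall j \in \{1, \dots, N\}$,
where $L_j$ denotes the block of “complicated” (linking) constraints for subproblem $j$, that is, block $A_j$ in the first row of the partitioned matrix above, and the “easy” constraints written here as $A_j x_j = b_j$ correspond to the submatrices $A_{N+j}$.
Notably, each of the constraint sets $A_j x_j = b_j$ defines a convex polytope, which greatly reduces the computational effort. The set of points $\{x_j\}$ such that $x_j \ge 0$ and $A_j x_j \le b_j$ constitutes a convex set with a finite number of extreme points. These points represent the corner-point feasible solutions for the subproblem with these constraints. Then any solution $x_j$ to the “easy” subproblem $j$ that satisfies the constraints $A_j x_j = b_j$, where $x_j \ge 0$, can also be written as
$$x_j = \sum_{i=1}^{S_j} \lambda_{ij} x_{ij}^*,$$
where $x_{ij}^*$ is assumed to be known and $\sum_{i=1}^{S_j} \lambda_{ij} = 1$, with $\lambda_{ij} \ge 0$, $\forall j \in \{1, \dots, N\}$ and $\forall i \in \{1, \dots, S_j\}$. It is further assumed that the polytope formed by the subset of constraints $A_j x_j = b_j$ contains $S_j$ vertices and is bounded. No such representation exists for an $x_j$ that is not a feasible solution of the subproblem. Suppose $x_{ij}^*$ is known; then let
$$L_{ij} \equiv L_j x_{ij}^* \quad \text{and} \quad C_{ij} \equiv c_j x_{ij}^*,$$
such that the problem can be reformulated with fewer constraints as
Minimize $Z = \sum_{j=1}^{N} \sum_{i=1}^{S_j} C_{ij} \lambda_{ij}$
Subject to:
$$\sum_{j=1}^{N} \sum_{i=1}^{S_j} L_{ij} \lambda_{ij} = b_0$$
with
$$\sum_{i=1}^{S_j} \lambda_{ij} = 1, \qquad \lambda_{ij} \ge 0 \quad \forall j \in \{1, \dots, N\} \ \text{and} \ \forall i \in \{1, \dots, S_j\}.$$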
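This formulation is a transformation from the partitioned problem to a revised problem that has reduced the number of rows from $m_0 + \sum_{j=1}^{N} m_j$ to $m_0 + N$ rows, where $m_0$ is the number of linking constraints and $m_j$ is the number of constraints in subproblem $j$. However, it has greatly increased the number of variables from $\sum_{j=1}^{N} n_j$ to $\sum_{j=1}^{N} S_j$, where $n_j$ is the number of variables in subproblem $j$. Fortunately, we do not have to consider all the extreme points $x_{ij}^*$ (and hence all the variables $\lambda_{ij}$) if the revised simplex method is to be used.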
III. ALGORITHM OF THE DECOMPOSITION TECHNIQUE
This section introduces a typical decomposition algorithm. The solution steps are as follows.
Step 1. Reformulate the linear programming problem into $N$ linear programming subproblems and let $A'$ represent the matrix of constraints and $c'$ represent the vector of objective coefficients.
Step 2. Initialization. Assume $x = 0$ is a feasible solution to the original problem. Set $j = 1$ and $x_{j1}^* = 0$, where $j \in \{1, \dots, N\}$, and determine the basis matrix $B$ and the vector of the basic variable coefficients $c_B$ in the objective function.
Step 3. Compute the vector $c_B B^{-1} A' - c'$ and identify the variable $\rho_{jk}$ with the minimum coefficient (use the revised simplex method).
Step 4. Compute the coefficients $(z_{jk} - c_{jk})$ for all $k = 1, 2, \dots, n_j$ that correspond to $\rho_{jk}$ using:
$$z_{jk} - c_{jk} = \left( c_B (B^{-1})_{1;m_0} A_j - c_j \right) x_{jk}^* + c_B (B^{-1})_{m_0+j},$$
where
$m_0$ = Number of elements of $b_0$
$x_{jk}^*$ = Corner-point feasible solution for the set of constraints given by $x_j \ge 0$ and $A_{N+j} x_j \le b_j$
$(B^{-1})_{1;m_0}$ = Matrix of the first $m_0$ columns of $B^{-1}$
$(B^{-1})_{m_0+j}$ = The $(m_0 + j)$th column of $B^{-1}$.
Step 5. Use an LP approach to solve for the optimal $W_j$ in the new problem that is given by
Minimize $W_j = \left( c_B (B^{-1})_{1;m_0} A_j - c_j \right) x_j + c_B (B^{-1})_{m_0+j}$
Subject to: $x_j \ge 0$ and $A_{N+j} x_j \le b_j$.
Step 6. Obtain $W_j^*$, the optimal objective value of $W_j$, which is $W_j^* = \min_k (z_{jk} - c_{jk})$ over all values of $k$. The corresponding optimal solution is $x_{jk}^* = x_j$.
Step 7. Determine the coefficients of the elements of $x_s$ that are nonbasic variables as elements of $c_B (B^{-1})_{1;m_0}$.
Step 8. Optimality Test. IF all of these coefficients are nonnegative, THEN the current solution is optimal. Go to Step 10. Otherwise find the minimum of the coefficients and select the corresponding entering basic variable. IF the minimum of the coefficients corresponds to a variable $\rho_{jk}$, THEN identify the value of $x_{jk}^*$ and the original constraint coefficients of $\rho_{jk}$.
Step 9. Repeat Step 3 for all $j \in \{1, \dots, N\}$.
Step 10. Apply the revised simplex method and obtain the final optimal solution.
Step 11. Print/display the final solution and end.
Notably, under the assumption that $x = 0$ is a feasible solution to the original problem, the initialization step uses this solution as the initial basic feasible solution. That is, we select $x_s$ as the initial set of basic variables along with one variable $\rho_{jk}$ for each of the subproblems $j$, where $j \in \{1, \dots, N\}$, such that $x_{jk}^* = 0$. Finally, successive iterations are performed until the optimal solution is found, and the optimal values of $\rho_{jk}$ are used to recover the value of $x_j$ so that the optimal solution conforms with that of the original problem.
IV. ILLUSTRATIVE EXAMPLE OF THE DECOMPOSITION TECHNIQUE
Consider the problem
Maximize $Z = 4x_1 + 6x_2 + 8x_3 + 5x_4$
Subject to:
and
$x_j \ge 0, \quad j \in \{1, 2, \dots, 4\}$.
Solution
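In the reformulated problem, the partitioned $A$ matrix that reflects the “easy” and “complicated” constraints is:
$$A = \left[ \begin{array}{cc|cc} 1 & 3 & 2 & 4 \\ 2 & 3 & 6 & 4 \\ \hline 1 & 1 & 0 & 0 \\ 1 & 2 & 0 & 0 \\ \hline 0 & 0 & 4 & 3 \end{array} \right]$$
Therefore, $N = 2$ and
$$A_1 = \begin{bmatrix} 1 & 3 \\ 2 & 3 \end{bmatrix}, \quad A_2 = \begin{bmatrix} 2 & 4 \\ 6 & 4 \end{bmatrix}, \quad A_3 = \begin{bmatrix} 1 & 1 \\ 1 & 2 \end{bmatrix}, \quad A_4 = [\,4 \ \ 3\,].$$
In addition, $c_1 = [\,4 \ \ 6\,]$ and $c_2 = [\,8 \ \ 5\,]$.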
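To prepare for demonstrating the solution to this problem, we first examine its two subproblems individually and then the reformulation of the overall problem. In what follows, $\mathbf{x}_1 = [x_1, x_2]^T$ and $\mathbf{x}_2 = [x_3, x_4]^T$ denote the variable vectors of the two subproblems.
Subproblem 1
Maximize $Z_1 = [\,4 \ \ 6\,] \, \mathbf{x}_1$
Subject to: $A_3 \mathbf{x}_1 \le b_1$ and $\mathbf{x}_1 \ge 0$.
It can be seen that this subproblem has four extreme points ($n_1 = 4$). One of these is the origin, considered the “first” of these extreme points, so that $x_{11}^* = [0, 0]^T$ and
$$\mathbf{x}_1 = \sum_{k=1}^{4} \rho_{1k} x_{1k}^*,$$
where $\rho_{11}, \rho_{12}, \rho_{13}, \rho_{14}$ are the respective weights on these points.
Subproblem 2
Maximize $Z_2 = [\,8 \ \ 5\,] \, \mathbf{x}_2$
Subject to: $A_4 \mathbf{x}_2 \le b_2$ and $\mathbf{x}_2 \ge 0$.
Its set of feasible solutions is:
$$\mathbf{x}_2 = \sum_{k=1}^{3} \rho_{2k} x_{2k}^*,$$
where $\rho_{21}, \rho_{22}, \rho_{23}$ are the respective weights on these points. By performing the $c_j x_{jk}^*$ vector multiplications and the $A_j x_{jk}^*$ matrix multiplications, the following reformulated version of the overall problem can be obtained.
Maximize $Z = \sum_{k=1}^{4} (c_1 x_{1k}^*) \rho_{1k} + \sum_{k=1}^{3} (c_2 x_{2k}^*) \rho_{2k}$
Subject to:
$$\sum_{k=1}^{4} (A_1 x_{1k}^*) \rho_{1k} + \sum_{k=1}^{3} (A_2 x_{2k}^*) \rho_{2k} + x_s = b_0, \qquad \sum_{k=1}^{4} \rho_{1k} = 1, \qquad \sum_{k=1}^{3} \rho_{2k} = 1,$$
and
$\rho_{1k} \ge 0$ for $k = 1, 2, 3, 4$
$\rho_{2k} \ge 0$ for $k = 1, 2, 3$
$x_{si} \ge 0$ for $i = 1, 2$.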
However, we should emphasize that thecomplete reformulation normally is not constructed explicitly; rather, just parts of it are generated as needed during the progress of the revised simplex method. To begin solving thisproblem the initialization step selects x s l ,xS2,p11, and pI2 to be the initial basic variables, so that
'1
,I!'[
Therefore, since AlxTI = 0, A2x;I = 0, clxT1= 0, and c2x;I = 0, then 1 0 0 0 B = [ O0 0 1 0
=B",
XB=b'=
0 0 0 1
c B = [ O 0 0 01,
for the initial basic solution. To begin testing for optimality, let $j = 1$, and solve the linear programming problem

Minimize $W_1 = (0 - c_1)\mathbf{x}_1 + 0 = -4x_1 - 6x_2$
Subject to: $A_3\mathbf{x}_1 \le b_1$ and $\mathbf{x}_1 \ge 0$.

The optimal solution of this problem is $\mathbf{x}_1 = [2, 3]^T$, such that $W_1^* = -26$. Next, let $j = 2$ and solve the linear programming problem

Minimize $W_2 = (0 - c_2)\mathbf{x}_2 + 0 = -8x_3 - 5x_4$
Subject to: $A_4\mathbf{x}_2 \le b_2$ and $\mathbf{x}_2 \ge 0$.

The solution of this problem is $\mathbf{x}_2 = [3, 0]^T$, such that $W_2^* = -24$. Finally, since none of the slack variables are nonbasic, no more coefficients need to be calculated. It can now be concluded that because both $W_1^* < 0$ and $W_2^* < 0$, the current basic solution is not optimal. Furthermore, since $W_1^*$ is the smaller of these, $p_{13}$ is the new entering basic variable.
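The two pricing problems of this example are small enough to check by enumerating extreme points. The Python sketch below does that under stated assumptions: the extreme-point lists, the linking blocks `A1` and `A2`, and the helper name `price_subproblem` are illustrative and inferred from the example data, and the dual values used in the second call (0 and 4/3, with constant terms 26/3 and 0) are the ones that arise at the last iteration of this example. A practical implementation would solve each subproblem with an LP solver rather than by enumeration.

```python
from fractions import Fraction

# Extreme points of the two subproblem regions (assumed from the example data).
SUB1_POINTS = [(0, 0), (5, 0), (2, 3), (0, 4)]   # x1 + x2 <= 5, x1 + 2*x2 <= 8
SUB2_POINTS = [(0, 0), (3, 0), (0, 4)]           # 4*x3 + 3*x4 <= 12
A1 = [(1, 3), (2, 3)]   # linking-constraint block for subproblem 1
A2 = [(2, 4), (6, 4)]   # linking-constraint block for subproblem 2


def price_subproblem(points, cost, linking_rows, duals, sigma):
    """Minimize W = (duals * A_j - c_j) x + sigma over the listed extreme points."""
    reduced = [
        sum(d * row[i] for d, row in zip(duals, linking_rows)) - cost[i]
        for i in range(len(cost))
    ]
    best_w, best_x = None, None
    for x in points:
        w = sum(r * xi for r, xi in zip(reduced, x)) + sigma
        if best_w is None or w < best_w:
            best_w, best_x = w, x
    return best_w, best_x


# First optimality test: all duals are zero because c_B = 0.
first_pass = [
    price_subproblem(SUB1_POINTS, (4, 6), A1, (0, 0), 0),
    price_subproblem(SUB2_POINTS, (8, 5), A2, (0, 0), 0),
]

# Last optimality test of the example (dual values assumed as stated above).
final_duals = (Fraction(0), Fraction(4, 3))
final_pass = [
    price_subproblem(SUB1_POINTS, (4, 6), A1, final_duals, Fraction(26, 3)),
    price_subproblem(SUB2_POINTS, (8, 5), A2, final_duals, Fraction(0)),
]

print(first_pass)
print(final_pass)
```

A negative minimum signals that the corresponding extreme point should enter the basis; once both minima are nonnegative, the current basic solution passes the optimality test.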
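For the revised simplex method to now determine the leaving basic variable, it is first necessary to calculate the column of $A'$ giving the original coefficients of $p_{13}$. This column is

$$
\begin{bmatrix} A_1\mathbf{x}_{13}^* \\ 1 \\ 0 \end{bmatrix} = \begin{bmatrix} 11 \\ 13 \\ 1 \\ 0 \end{bmatrix}.
$$

Proceed in the usual way to calculate the current coefficient of $p_{13}$ and the right-side column,

$$
B^{-1}\begin{bmatrix} 11 \\ 13 \\ 1 \\ 0 \end{bmatrix} = \begin{bmatrix} 11 \\ 13 \\ 1 \\ 0 \end{bmatrix}, \qquad
B^{-1}b' = \begin{bmatrix} 20 \\ 25 \\ 1 \\ 1 \end{bmatrix}.
$$

By considering only the strictly positive coefficients, the minimum ratio of the right side to the coefficient is the (1/1) in the third row, so that $r = 3$; that is, $p_{11}$ is the new leaving basic variable. Thus the new values of $\mathbf{x}_B$ and $c_B$ are:

$$
\mathbf{x}_B = [x_{s1},\ x_{s2},\ p_{13},\ p_{21}]^T, \qquad c_B = [0 \;\; 0 \;\; 26 \;\; 0].
$$

By using a matrix inversion technique to find the value of $B^{-1}$, we obtain

$$
B^{-1}_{\text{new}} = \begin{bmatrix} 1 & 0 & -11 & 0 \\ 0 & 1 & -13 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}
\quad \text{such that} \quad
B^{-1}_{\text{new}}\, b' = \begin{bmatrix} 9 \\ 12 \\ 1 \\ 1 \end{bmatrix}.
$$

The current basic feasible solution is now tested for optimality using the optimality conditions of the revised simplex method. In this case $W_1 = (0 - c_1)\mathbf{x}_1 + 26 = -4x_1 - 6x_2 + 26$, and the minimum feasible solution of this problem is $\mathbf{x}_1 = [2, 3]^T$. This yields $W_1^* = 0.0$. Similarly, $W_2 = (0 - c_2)\mathbf{x}_2 + 0 = -8x_3 - 5x_4$, such that the minimum solution of this problem is $\mathbf{x}_2 = [3, 0]^T$.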
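This yields $W_2^* = -24$. Finally, since none of the slack variables are nonbasic, no more coefficients need to be calculated. It can now be concluded that because $W_2^* < 0$, the current basic solution is not optimal, and $p_{22}$ is the new entering basic variable. Proceeding with the revised simplex method, the column of original coefficients of $p_{22}$ is $[A_2\mathbf{x}_{22}^*;\ 0;\ 1] = [6, 18, 0, 1]^T$. This implies that

$$
B^{-1}\begin{bmatrix} 6 \\ 18 \\ 0 \\ 1 \end{bmatrix} = \begin{bmatrix} 6 \\ 18 \\ 0 \\ 1 \end{bmatrix}, \qquad
B^{-1}b' = \begin{bmatrix} 9 \\ 12 \\ 1 \\ 1 \end{bmatrix}.
$$

Therefore, the minimum positive ratio is (12/18) in the second row, so that $r = 2$; that is, $x_{s2}$ is the new leaving basic variable. The new inverse of the B matrix is now:

$$
B^{-1}_{\text{new}} = \begin{bmatrix}
1 & -\tfrac{1}{3} & -\tfrac{20}{3} & 0 \\
0 & \tfrac{1}{18} & -\tfrac{13}{18} & 0 \\
0 & 0 & 1 & 0 \\
0 & -\tfrac{1}{18} & \tfrac{13}{18} & 1
\end{bmatrix},
\qquad
\mathbf{x}_B = [x_{s1},\ p_{22},\ p_{13},\ p_{21}]^T,
$$

and $c_B = [0, 24, 26, 0]$.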
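Now test whether the new basic feasible solution is optimal. Since

$$
c_B B^{-1} = [0,\ 24,\ 26,\ 0]
\begin{bmatrix}
1 & -\tfrac{1}{3} & -\tfrac{20}{3} & 0 \\
0 & \tfrac{1}{18} & -\tfrac{13}{18} & 0 \\
0 & 0 & 1 & 0 \\
0 & -\tfrac{1}{18} & \tfrac{13}{18} & 1
\end{bmatrix}
= \left[0,\ \tfrac{4}{3},\ \tfrac{26}{3},\ 0\right],
$$

it follows that

$$
W_1 = \left(\left[0,\ \tfrac{4}{3}\right] A_1 - c_1\right)\mathbf{x}_1 + \tfrac{26}{3} = -\tfrac{4}{3}x_1 - 2x_2 + \tfrac{26}{3}.
$$

Therefore, the feasible solution is $\mathbf{x}_1 = [2, 3]^T$ with $W_1^* = 0$.

Similarly,

$$
W_2 = \left(\left[0,\ \tfrac{4}{3}\right] A_2 - c_2\right)\mathbf{x}_2 + 0 = \tfrac{1}{3}x_4,
$$

and the minimum solution now is $\mathbf{x}_2 = [3, 0]^T$, and the corresponding objective value is $W_2^* = 0.0$. Finally, $W_1^* \ge 0$ and $W_2^* \ge 0$, which means that the feasible solution is optimal. To identify this solution, set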
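$$
\mathbf{x}_B = \begin{bmatrix} x_{s1} \\ p_{22} \\ p_{13} \\ p_{21} \end{bmatrix} = B^{-1}b' = \begin{bmatrix} 5 \\ \tfrac{2}{3} \\ 1 \\ \tfrac{1}{3} \end{bmatrix},
$$

so that

$$
\mathbf{x}_1 = \sum_{k=1}^{4} p_{1k}\,\mathbf{x}_{1k}^* = \mathbf{x}_{13}^* = \begin{bmatrix} 2 \\ 3 \end{bmatrix}
$$

and

$$
\mathbf{x}_2 = \sum_{k=1}^{3} p_{2k}\,\mathbf{x}_{2k}^* = \tfrac{1}{3}\begin{bmatrix} 0 \\ 0 \end{bmatrix} + \tfrac{2}{3}\begin{bmatrix} 3 \\ 0 \end{bmatrix} = \begin{bmatrix} 2 \\ 0 \end{bmatrix}.
$$

Thus an optimal solution for the problem is $x_1 = 2$, $x_2 = 3$, $x_3 = 2$, and $x_4 = 0$. The corresponding value of the objective function is $Z = 42$.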
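As a sanity check on the worked example, the original variables can be recovered from the extreme-point weights and substituted back into the objective and constraints. The sketch below is only an illustration: the constraint data, the final weights (1 on the point (2, 3), 1/3 and 2/3 on the points (0, 0) and (3, 0)), and the helper name `recover_solution` are assumptions taken from the example as worked here.

```python
from fractions import Fraction as Fr

# Extreme points carried by the final basis (assumed from the worked example).
X13 = (2, 3)                # subproblem 1 point weighted by p13
X21, X22 = (0, 0), (3, 0)   # subproblem 2 points weighted by p21 and p22

COST = (4, 6, 8, 5)
# (coefficients, right-hand side) for each "<=" constraint of the example.
ROWS = [
    ((1, 3, 2, 4), 20),
    ((2, 3, 6, 4), 25),
    ((1, 1, 0, 0), 5),
    ((1, 2, 0, 0), 8),
    ((0, 0, 4, 3), 12),
]


def recover_solution(p13, p21, p22):
    """Map extreme-point weights back to the original variables x1..x4."""
    x12 = [p13 * v for v in X13]
    x34 = [p21 * a + p22 * b for a, b in zip(X21, X22)]
    return x12 + x34


x = recover_solution(Fr(1), Fr(1, 3), Fr(2, 3))
z = sum(c * v for c, v in zip(COST, x))
slacks = [rhs - sum(a * v for a, v in zip(row, x)) for row, rhs in ROWS]

print("x =", [str(v) for v in x])
print("Z =", z)
print("slacks =", [str(s) for s in slacks])
```

All slacks are nonnegative, so the recovered point is feasible, and the slack on the first linking constraint matches the basic value of $x_{s1}$.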
V.
CONCLUSIONS
This chapter discussed the decomposition method for a special class of multidimensional LP problems. The formulation of the decomposition problem was shown in Section II. The algorithm of the decomposition technique and an illustrative example of the decomposition method were given in Sections III and IV.
VI.
PROBLEM SET
Problem 10.1
Problem 10.2
Consider the multidivisional problem

Maximize $z = 10x_1 + 5x_2 + 8x_3 + 7x_4$

Subject to:

$$
\begin{aligned}
6x_1 + 5x_2 + 4x_3 + 6x_4 &\le 40 \\
3x_1 + x_2 &\le 15 \\
x_1 + x_2 &\le 10 \\
x_3 + 2x_4 &\le 10 \\
x_j &\ge 0.
\end{aligned}
$$

Use the decomposition principle to solve this problem.

Problem 10.3

Maximize $z = 3x_1 + 5x_2$

Subject to:

$$
\begin{aligned}
3x_1 + 2x_2 &\le 18 \\
x_1 &\le 4 \\
x_j &\ge 0.
\end{aligned}
$$

Problem 10.4
Apply the decomposition principle to the following problem.

Problem 10.5
Indicate the necessary changes in the decomposition algorithm in order to apply it to minimization problems. Then solve the problem:

Maximize $z = 5x_1 + 3x_2 + 8x_3 - 5x_4$

Subject to:

$$
x_1 + x_2 + x_3 + x_4 \ge 25
$$
Problem 10.6
Solve the following problem using the decomposition algorithm.

Maximize $z = 10x_1 + 2x_2 + 4x_3 + x_4$

Subject to:

$$
\begin{aligned}
x_1 + 4x_2 - x_3 &\ge 8 \\
2x_1 + x_2 + x_3 &\ge 2 \\
3x_1 + x_4 + x_5 &\ge 4 \\
x_1 + 2x_4 - x_5 &\ge 10 \\
x_j &\ge 0.
\end{aligned}
$$
Chapter 11 Optimal Power Flow
I. INTRODUCTION

The idea of optimal power flow (OPF) was introduced in the early 1960s as an extension of conventional economic dispatch to determine the optimal settings for control variables while respecting various constraints. The term is used as a generic name for a large series of related network optimization problems. The development of OPF in the last two decades has closely tracked progress in numerical optimization techniques and advances in computer technology. Current commercial OPF programs are able to solve very large and complex power system optimization problems in a relatively short time. Many different solution approaches have been proposed to solve OPF problems.

For OPF studies, the power system network is typically modeled at the main transmission level, including generating units. The model may also include other auxiliary generating units, and representations of internal or external parts of the system are used in deciding the optimum state of the system. In a conventional power flow, the values of the control variables are prespecified. In an OPF, the values of some or all of the control variables need to be found so as to optimize (minimize or maximize) a predefined objective. The OPF calculation has many applications in power systems, including real-time control, operational planning, and planning. OPF is available in most of today's energy management systems (EMS).
OPF continues to gain importance due to the increase in power system size and complex interconnections. For example, OPF must support deregulation transactions or provide information on what reinforcement is required. The trade-offs between reinforcements and control options are decided by carrying out OPF studies. These studies clarify when a control option maximizes utilization of an existing asset (e.g., generation or transmission), or when a control option is a cheaper alternative to installing new facilities. Issues of priority of transmission access and VAr pricing or auxiliary costing to afford fair prices and purchases can also be addressed by OPF.

The general OPF problem is posed as minimizing the general objective function $F(x, u)$ while satisfying the constraints $g(x, u) = 0$ and $h(x, u) \le 0$, where $g(x, u)$ represents nonlinear equality constraints (power flow equations) and $h(x, u)$ represents nonlinear inequality constraints on the vectors $x$ and $u$. The vector $x$ contains the dependent variables, including bus voltage magnitudes and phase angles and the MVAr output of generators designed for bus voltage control. The vector $x$ also includes fixed parameters, such as reference bus angles, noncontrolled generator MW and MVAr outputs, noncontrolled load on fixed voltage, line parameters, and so on. The vector $u$ consists of control variables involving:

   Active and reactive power generation
   Phase-shifter angles
   Net interchange
   Load MW and MVAr (load shedding)
   DC transmission line flows
   Control voltage settings
   LTC transformer tap settings
   Line switching.

Table 11.1 shows a selection of objectives and constraints commonly found in OPF formulations. The time constants of the control process are relatively long, allowing the OPF implementation to achieve optimality adequately. The quality of the solution depends on the accuracy of the model studied. It is also important that a proper problem definition with clearly stated objectives be given at the outset.
No two power system companies have the same types of devices and operating requirements. The model form presented here allows OPF development to easily customize its solution to different cases under study.

II. OPF-FUEL COST MINIMIZATION
Fuel cost minimization is primarily an operational planning problem. It is also a useful tool in planning functions. This is usually referred to as economic dispatch, the aim of which is to obtain the active power generation of the units committed for operation, such that the total fuel cost is minimized while satisfying operational feasibility constraints.

TABLE 11.1 Some objectives and constraints commonly found in OPF

Objectives

1. Active Power Objectives
   Economic dispatch (minimum cost losses, MW generation or transmission losses)
   Environmental dispatch
   Maximum power transfer
2. Reactive Power Objectives (MW and MVAr Loss Minimization)
3. General Goals
   Minimum deviation from a target schedule
   Minimum control shifts to alleviate violations
   Least absolute shift approximation of control shift

Constraints

1. Limits on Control Variables
   Generator output in MW
   Transformer taps limits
   Shunt capacitor range
2. Operating Limits on
   Line and transformer flows (MVA, Amps, MW, MVAr)
   MW and MVAr interchanges
   MW and MVAr reserve margins (fixed/dynamic)
   Voltage, angle (magnitude, difference)
3. Control parameters
   Use of engineering rules to offer more controls for handling violation
   Control effectiveness (more control with sufficient effects)
   Limit priorities engineering preferable operating limit enforcement (cost benefit)
   Control rates change and trajectories
   Voltage stability
4. Local and nonoptimized controls (generator voltage, generator real power, transformer output voltage, MVAr, shunt/SVC controls)
5. Equipment ganging and sharing
   Tap changing
   Generator MVAr sharing
   Control ordering

A.
Modeling Issues
Fuel cost minimization requires knowledge of the fuel cost curves for each of the generating units. An accurate representation of the cost curves may require a piecewise polynomial form, or can be approximated in several ways, with common ones being:

1. piecewise linear
2. quadratic
3. cubic
4. piecewise quadratic.

A linear approximation is not commonly used, while the piecewise linear form is used in many production-grade linear programming applications. A quadratic approximation is used in most nonlinear programming applications. Control variables are usually the independent variables in an OPF, including:

1. active power generation
2. generator bus voltages
3. transformer tap ratios
4. phase-shifter angles
5. values of switchable shunt capacitors and inductors.

The use of all of the above as control variables should give the best (least expensive) solution. However, this may not be the most desired solution depending on certain other factors such as additional constraints. For a regular OPF, the usual constraints are the:

1. network power balance equations at each node,
2. bounds on all variables,
3. line-flow constraints, and
4. others such as transformer tap ratios of parallel transformers.

The following assumptions are made in modeling the objectives and constraints.

1. Fuel cost curves are smooth and quadratic in nature;
2. only active power generations are controlled for cost minimization. Transformer tap ratios, generation voltages, shunt capacitors, and inductor positions are held at their nominal set values throughout the optimization;
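3. current flows are controlled approximately using voltage and phase angle restrictions across the lines; and
4. contingency constraints are neglected.

B. Mathematical Description of the Objective Functions and Constraints for Cost Minimization

The objective function is given by the following fuel cost model.

$$
F(P_g) = \sum_{i=1}^{N_g} \left(\alpha_i + \beta_i P_{gi} + \gamma_i P_{gi}^2\right), \tag{11.1}
$$

subject to equality constraints representing the active and reactive electric network balance,

$$
P_i - P_{gi} + P_{di} = 0, \qquad i = 1, \ldots, N_b \tag{11.2}
$$

$$
Q_i - \underbrace{Q_{gi}}_{i \,\in\, \text{gen/synch}} + Q_{di} = 0, \qquad i = 1, \ldots, N_b \tag{11.3}
$$

where

$$
P_i = V_i \sum_{j=1}^{N_b} V_j Y_{ij} \cos(\theta_i - \theta_j - \psi_{ij}), \qquad i = 1, \ldots, N_b \tag{11.4}
$$

$$
Q_i = V_i \sum_{j=1}^{N_b} V_j Y_{ij} \sin(\theta_i - \theta_j - \psi_{ij}), \qquad i = 1, \ldots, N_b, \tag{11.5}
$$

together with the inequality constraints,

$$
V_{i,\min} \le V_i \le V_{i,\max}, \qquad i = 1, \ldots, N_b \tag{11.6}
$$

$$
P_{gi,\min} \le P_{gi} \le P_{gi,\max}, \qquad i = 1, \ldots, N_g \tag{11.7}
$$

$$
Q_{gi,\min} \le Q_{gi} \le Q_{gi,\max}, \qquad i = 1, \ldots, N_{gq} \tag{11.8}
$$

$$
-k_v I_{l,\max} \le V_i - V_j \le k_v I_{l,\max}, \qquad l = 1, \ldots, N_l \tag{11.9}
$$

$$
-k_\theta I_{l,\max} \le \theta_i - \theta_j \le k_\theta I_{l,\max}, \qquad l = 1, \ldots, N_l, \quad i, j \in l. \tag{11.10}
$$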
$P_{gi}$ = Active power generation at unit $i$
$\alpha_i, \beta_i, \gamma_i$ = Fuel cost parameters of unit $i$
$N_g$ = Number of dispatchable generation units
$N_{gq}$ = Number of PV buses, including generators and synchronous condensers
$N_b$ = Total number of buses
$N_l$ = Total number of lines
$V_i, V_j$ = Voltage magnitude at buses $i$ and $j$
$\theta_i, \theta_j$ = Phase angles at buses $i$ and $j$
$P_i$ = Net active power injection at node $i$
$Q_i$ = Net reactive power injection at node $i$
$Y_{ij}$ = Magnitude of the complex admittance matrix element at the $i$th row and $j$th column
$\psi_{ij}$ = Phase angle of the complex admittance matrix element at position $i, j$
$I_{l,\max}$ = Maximum allowable current flow in branch $l$
$V_{i,\min}, V_{i,\max}$ = Lower and upper bounds on the voltage magnitude at bus $i$
$k_v, k_\theta$ = Conversion factors to convert the maximum allowable current flow to an appropriate maximum allowable voltage and phase angle difference across the ends of the line $l$
$Q_{gi,\min}, Q_{gi,\max}$ = Lower and upper bounds on the reactive generation at bus $i$.
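The quadratic fuel cost model of equation 11.1 and polar power-injection expressions of the form $P_i = V_i \sum_j V_j Y_{ij}\cos(\theta_i - \theta_j - \psi_{ij})$ and $Q_i = V_i \sum_j V_j Y_{ij}\sin(\theta_i - \theta_j - \psi_{ij})$ can be evaluated directly. The following Python sketch is a minimal illustration; the function names, the two-bus network, and all numerical values are hypothetical and are not taken from the text.

```python
import cmath
import math


def fuel_cost(pg, alpha, beta, gamma):
    """Total fuel cost: sum of alpha_i + beta_i*P_gi + gamma_i*P_gi^2 over units."""
    return sum(a + b * p + c * p * p for p, a, b, c in zip(pg, alpha, beta, gamma))


def polar_injections(v, theta, y_mag, y_ang):
    """Net active and reactive injections at every bus in polar form."""
    n = len(v)
    p, q = [], []
    for i in range(n):
        p.append(v[i] * sum(
            v[j] * y_mag[i][j] * math.cos(theta[i] - theta[j] - y_ang[i][j])
            for j in range(n)))
        q.append(v[i] * sum(
            v[j] * y_mag[i][j] * math.sin(theta[i] - theta[j] - y_ang[i][j])
            for j in range(n)))
    return p, q


# Hypothetical two-bus system: a single line with series admittance y (per unit).
y = complex(2.0, -6.0)
ybus = [[y, -y], [-y, y]]
y_mag = [[abs(e) for e in row] for row in ybus]
y_ang = [[cmath.phase(e) for e in row] for row in ybus]

v = [1.05, 1.00]
theta = [0.0, -0.1]
p_inj, q_inj = polar_injections(v, theta, y_mag, y_ang)

# Hypothetical cost coefficients for two dispatchable units.
total_cost = fuel_cost([100.0, 50.0], [10.0, 20.0], [2.0, 3.0], [0.01, 0.02])

print(p_inj, q_inj)
print(total_cost)
```

The injections agree with the complex-power identity $S_i = V_i I_i^*$, which is a convenient cross-check when the admittance matrix is held in polar form.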
III. OPF-ACTIVE POWER LOSS MINIMIZATION
Active power loss minimization (referred to as loss minimization) is usually required when cost minimization is the main goal with control variables being active generator power outputs. When all control variables are utilized in a cost minimization (such as is reasonable when contingency constraints are included), a subsequent loss minimization will not yield further improvements. When cost minimization is performed using only the active power generations as control variables, a subsequent loss minimization computation using a different set of control variables can be useful in obtaining a better voltage profile and lower current flow along the lines. This will involve less risk of low voltage insecurities during contingencies as well as a lower risk of current flow constraint violations during contingencies. The primary application of loss minimization is in operations, similar to cost minimization. In planning, loss minimization can be a useful tool in conjunction with a
planning objective, providing more secure optimal solutions for planning purposes. This is especially useful in studies that neglect contingency constraints. Loss minimization can be graphically represented as shown in Figure 11.1, which demonstrates that the process attempts to minimize the square of the distance between two voltage vectors connected across a transmission line. We see in the figure that in loss minimization, the differences in both magnitude and phase angle of the voltage vectors across each line are minimized.

There are two basic approaches to loss minimization, namely, the slack bus approach and the summation of losses on individual lines. The slack bus approach is by far the least complicated approach, where the slack bus generation is minimized. The objective function is linear in this case and can be handled by any linear or nonlinear programming method. The disadvantage of this approach is that it can only minimize the total active power loss of the system. It is sometimes desirable to minimize the losses in a specific area only, and the above approach may not be applicable to this type of situation.

The second approach does not have the disadvantage mentioned earlier, but is more involved computationally. The objective function turns out to be more complicated when expressing voltages in polar form. When using the rectangular form, the objective function is simplified to a quadratic form. The polar form can be used in NLP methods. A quadratic form is preferred and the rectangular form of voltage representation is utilized. Due to the need for optimizing certain geographic areas, the summation approach is implemented.
FIGURE 11.1 Graphical representation of loss minimization (voltage vectors plotted against real and imaginary axes).
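The geometric statement that loss minimization shrinks the squared distance between the two voltage vectors of a line can be checked numerically. The sketch below assumes the standard series-conductance loss expression, $g\,(V_i^2 + V_j^2 - 2V_iV_j\cos(\theta_i - \theta_j))$ in polar form and $g\,[(e_i - e_j)^2 + (f_i - f_j)^2]$ in rectangular form; these formulas, the function names, and the sample values are assumptions made for illustration and are not quoted from the text.

```python
import math


def line_loss_polar(g, v_i, theta_i, v_j, theta_j):
    """Active power loss of one line from voltage magnitudes and angles (assumed form)."""
    return g * (v_i ** 2 + v_j ** 2 - 2.0 * v_i * v_j * math.cos(theta_i - theta_j))


def line_loss_rectangular(g, e_i, f_i, e_j, f_j):
    """Same loss as g times the squared distance between the two voltage vectors."""
    return g * ((e_i - e_j) ** 2 + (f_i - f_j) ** 2)


def to_rectangular(v, theta):
    """Convert a voltage phasor to its real and imaginary components."""
    return v * math.cos(theta), v * math.sin(theta)


# Hypothetical per-unit values for a single line.
g = 4.0
v_i, theta_i = 1.04, 0.05
v_j, theta_j = 0.98, -0.03

e_i, f_i = to_rectangular(v_i, theta_i)
e_j, f_j = to_rectangular(v_j, theta_j)

loss_polar = line_loss_polar(g, v_i, theta_i, v_j, theta_j)
loss_rect = line_loss_rectangular(g, e_i, f_i, e_j, f_j)
print(loss_polar, loss_rect)
```

The rectangular expression contains no trigonometric terms, which is why it leads to a quadratic objective when summed over the lines of interest.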
A.
Modeling Issues for Loss Minimization
In loss minimization, the usual control variables are:

1. Generator bus voltage magnitudes
2. Transformer tap ratios
3. Switchable shunt capacitors and inductors
4. Phase-shifter angles.

Out of these, a great deal of control can be achieved by using generator bus voltages and transformer tap ratios as control variables. Phase-shifter angles are normally used to alleviate line overloads. Since loss minimization indirectly takes care of line flows via the objective, line overloads are expected to be at a minimum. Active power generations are usually not employed as control variables in order to minimize changes to the economic dispatch solution for an integrated implementation.

In the formulation for loss minimization, generator voltages and transformer tap ratios are used as control variables. Transformer tap ratios are treated as continuous variables during the optimization, after which they are adjusted to the nearest physical tap position and the problem is reiterated, holding the taps at the adjusted values. This approximation is justified based on the small step size usually found in transformers.

The constraints for loss minimization, as well as for other objectives described, are similar to those discussed earlier for cost minimization. The following assumptions are made in the formulation of the loss objective.

1. Loss minimization is done following a cost minimization, and thus the active power generations excluding the slack bus generation are held at their optimal values.
2. Generator bus voltages and transformer tap ratios are used as control variables. Shunt reactances and phase-shifter angles where available are held at nominal values.
3. Transformer tap ratios are treated as continuous variables during the optimization, after which they are adjusted to the nearest physical tap position and reiterated.
4. Current flows are controlled approximately, using restrictions on the real and imaginary components of the complex voltage across the lines.
5. Contingency constraints are neglected.
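The treatment of tap ratios as continuous variables followed by adjustment to the nearest physical position can be sketched as follows. The function name `snap_to_tap`, the tap range of 0.9 to 1.1, and the step of 0.00625 are hypothetical values chosen only for illustration.

```python
def snap_to_tap(t_opt, t_min, t_max, step):
    """Round a continuous tap ratio to the nearest physical tap position.

    Positions are assumed to be t_min, t_min + step, ..., t_max.
    """
    n_steps = round((t_max - t_min) / step)   # index of the highest tap position
    k = round((t_opt - t_min) / step)         # nearest position index
    k = max(0, min(n_steps, k))               # keep the index inside the tap range
    return t_min + k * step


# Continuous optimum from a hypothetical loss-minimization run.
t_continuous = 1.0137
t_physical = snap_to_tap(t_continuous, t_min=0.9, t_max=1.1, step=0.00625)
print(t_physical)
```

After snapping, the optimization would be repeated with the taps held at the adjusted values, as the assumptions above describe.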
B. Mathematical Description of the Objective Functions and Constraints for Loss Minimization
The objective function to be minimized is given by the sum of line losses
Individual line losses $P_{l_k}$ can be expressed in terms of voltages and phase angles as

This expression involves transcendental functions. Transforming the above to equivalent rectangular form, we have:

(11.13)

which simplifies to

The objective function can now be written as

(11.15)

This is a quadratic form and is suitable for implementation using the quadratic interior point method, where

$P_L$ = Total active power loss
$P_{l_k}$ = Active power loss in branch $k$
$g_k$ = Series conductance of line $k$
$e_i, f_i$ = Real and imaginary components of the complex voltage at node $i$
$e_j, f_j$ = Same as $e_i, f_i$ for node $j$
$G_{ij}, B_{ij}$ = Real and imaginary components of the complex admittance matrix elements
$t_{kp1}, t_{kp2}$ = Transformer tap ratios for transformers in parallel
$N_{pi}$ = Number of such transformers in the $i$th set of parallel transformers
$t_{i,\min}, t_{i,\max}$ = Lower and upper bounds on the transformer tap ratio at the $i$th transformer
$k_{el}, k_{fl}$ = Equivalent of $k_v$ and $k_\theta$ in rectangular form
$K_{ei,\min}, K_{fi,\min}$ = Conversion factors, to convert the lower voltage bound to an equivalent rectangular form
$K_{ei,\max}, K_{fi,\max}$ = Same as above for upper bound
$g_{l_i}$ = Shunt conductance of line $l$ on side $i$ (i.e., half total)
$g_l$ = Series conductance of line $l$ (connected to node $i$)
$t_m$ = Transformer tap ratio of the $m$th transformer
$b_{l_i}, b_l$ = Equivalent susceptance of $g_{l_i}$ and $g_l$
$b_i$ = Shunt reactance at node $i$
$l_i$ = Lines connected to node $i$
$N_l$ = Total number of lines
$g_m, b_m$ = Series conductance and susceptance of transformer $m$
$N_t$ = Total number of transformers.

The constraints are equivalent to those specified in Section II for cost minimization, with voltage and phase angle expressed in rectangular form. The equality constraints are given by
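$$
P_i - P_{gi} + P_{di} = 0, \qquad i = 1, \ldots, N_b \tag{11.16}
$$

$$
Q_i - Q_{gi} + Q_{di} = 0, \qquad i = 1, \ldots, N_b \tag{11.17}
$$

where

$$
P_i = e_i \sum_{j=1}^{N_b} (G_{ij} e_j - B_{ij} f_j) + f_i \sum_{j=1}^{N_b} (G_{ij} f_j + B_{ij} e_j), \qquad i = 1, \ldots, N_b \tag{11.18}
$$

$$
Q_i = f_i \sum_{j=1}^{N_b} (G_{ij} e_j - B_{ij} f_j) - e_i \sum_{j=1}^{N_b} (G_{ij} f_j + B_{ij} e_j), \qquad i = 1, \ldots, N_b \tag{11.19}
$$

and also,

$$
t_{kp1} - t_{kp2} = 0, \qquad kp1 = 1, \ldots, N_{pi}, \quad kp2 = 1, \ldots, N_{pi}, \quad kp1 \ne kp2, \quad i \in \text{sets of parallel transformers}, \tag{11.20}
$$

for parallel transformers. The inequality constraints are given by

$$
V_{i,\min} \le V_i \le V_{i,\max}, \qquad i = 1, \ldots, N_b \tag{11.21}
$$

$$
P_{gi,\min} \le P_{gi} \le P_{gi,\max}, \qquad i \in \text{slack bus} \tag{11.22}
$$

$$
Q_{gi,\min} \le Q_{gi} \le Q_{gi,\max}, \qquad i = 1, \ldots, N_{gq} \tag{11.23}
$$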
(11.26)

Note that equation 11.21 is not linear in the rectangular formulation. It may be linearized or an approximate linear form may be used as given below.

$$
K_{ei,\min} V_{i,\min} \le e_i \le K_{ei,\max} V_{i,\max}, \qquad i = 1, \ldots, N_b \tag{11.27}
$$

(11.28)

$$
V_i^2 = e_i^2 + f_i^2, \qquad i = 1, \ldots, N_b. \tag{11.29}
$$
The transformer tap ratio $t$ controls the optimization via the admittance matrix. The relationship is as follows.

(11.30)
IV. OPF-VAr PLANNING

The application of VAr planning, as the name implies, is in power system planning. It is aimed at minimizing the installation cost of additional reactive support necessary to maintain the system in a secure manner. The planning priority is to minimize cost and also to minimize future operations costs. This is necessary since the cost of equipment and apparatus can be prohibitive in achieving an overall cost-effective planning scenario. VAr planning involves identification of accurate VAr sites and measurable quantities of reactive sources to achieve system security. The analysis involves modeling to account for the discrete nature of the reactive power. This must generally be done using curve fitting or planning experience before it can simulate an optimum decision process. The results obtained from VAr planning allow indirect sites, sizing, and costing of components. The extensive calculations required of the siting and sizing elements are normally done using OPF concepts. The traditional computations necessary for this purpose involve mathematical programming techniques, namely linear and nonlinear programming. In the VAr planning process, we conduct voltage optimization which, being similar to loss minimization, is handled in much the same way. Loss
minimization can be classified as a vector form of voltage optimization. In voltage optimization, the aim is to maintain the system voltage magnitudes as close as possible to a nominal voltage, such as one per unit. This is pictorially represented in Figure 11.2. At the optimal solution, the voltage vectors stay within a narrow band close to nominal voltage values as shown by the shaded area in the figure. Hence we see that the endpoints of two voltage vectors are brought closer to one another in a radial sense only, as opposed to loss minimization, where the endpoints are brought closer together in a vectorial sense. The application is mainly for operations with potential for use in planning when combined with a suitable planning objective. We present here a model, its formulation, and the associated algorithms for VAr planning.
A.
Modeling Issues for VAr Planning Type I Problem
VAr installation costs are usually modeled as linear functions. The inductive and capacitive components of the VArs may be combined and modeled as one piecewise linear function as shown in Figure 11.3. With minimum and maximum values given, the existing VAr at a given site can be:

1. All capacitive
2. All inductive
3. Both inductive and capacitive
4. None.

The capacitive and inductive VArs can also be modeled as separate variables as shown in Figure 11.4.

FIGURE 11.2 Description of voltage optimization.
FIGURE 11.3 Cost curve for VAr support (cost versus VAr, with inductive and capacitive ranges marked by their minimum and maximum existing and installed values).
The modeling of capacitive and inductive VArs as separate variables is useful when including contingency constraints in the formulation. For example, a contingency involving a loss of load may require inductive compensation at a site while a certain line causing low voltage situations may require capacitive compensation at the same site. This is not facilitated in the representation of Figure 11.3. Combining planning and operations objectives in VAr planning results in a two-objective optimization problem. The units of the planning objective may be in dollars. The units of the operations objective (such as fuel cost) may be in dollars/hour. Hence, if one objective is not scaled with respect to the other, the result will be meaningless.

FIGURE 11.4 (a) Cost curve for capacitive VAr, and (b) cost curve for inductive VAr.

Some existing practical approaches are:
1. Convert the planning costs to a comparable operations cost using life cycle costing. This is commonly used by the industry.
2. Convert the operations cost to a long-term cost comparable to the life of the installed equipment.

Voltage optimization can be performed by minimizing the sum of absolute voltage deviations from a nominal voltage, or by minimizing the sum of squares of the voltage deviations from a nominal voltage. The former is a piecewise linear objective, while the latter is a quadratic form, ideally suited for implementation using a quadratic OPF method such as the quadratic interior point.

B. Mathematical Description of the Objective and Constraints for Type I Problem for VAr Planning

The objective function and constraints for voltage optimization are described in mathematical form following a brief description of the notation not already discussed.
1. Mathematical Notation

$F(V)$ = Objective function to be minimized
$V_{i,\text{nom}}$ = Nominal voltage (could be one per unit).
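The two voltage-optimization objectives described in the text, the sum of absolute deviations and the sum of squared deviations from a nominal voltage, can be written in a few lines. The function names and the sample voltage profile below are hypothetical and serve only as an illustration.

```python
def sum_abs_deviation(voltages, v_nom=1.0):
    """Piecewise linear objective: sum of |V_i - V_nom| over all buses."""
    return sum(abs(v - v_nom) for v in voltages)


def sum_squared_deviation(voltages, v_nom=1.0):
    """Quadratic objective: sum of (V_i - V_nom)^2 over all buses."""
    return sum((v - v_nom) ** 2 for v in voltages)


# Hypothetical per-unit bus voltage magnitudes.
profile = [1.02, 0.97, 1.00]
f_abs = sum_abs_deviation(profile)
f_sq = sum_squared_deviation(profile)
print(f_abs, f_sq)
```

The quadratic form penalizes large individual deviations more heavily than the absolute form, which is one reason it pairs naturally with a quadratic OPF method.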
2. Mathematical Description (Voltage Optimization)

The objective function to be minimized is given by

(11.31)

subject to the equality constraints given by equations 11.31 and 11.32, repeated below with slight modifications.

(11.33)

where,

(11.34)

(11.35)

(11.36)

(11.37)

(11.38)

(11.39)

(11.40)

(11.41)

C. Type II Problem for VAr Planning

1. Control Variables
In designing the VAr/OPF problem, we start with the definition of the control variables. The control variables to be used depend on the objective specified. For objective (1), the control variables are:

1. Generator bus voltages
2. Transformer tap ratios
3. Shunt capacitors (existing and additional)
4. Shunt inductors (existing and additional).

For objective (2), the control variables are:

1. Active power generations
2. Generator bus voltages
3. Transformer tap ratios
4. Shunt capacitors (existing and additional)
5. Shunt inductors (existing and additional).
For objective (2), generator bus voltages and transformer tap ratios will be controlled only when contingency constraints are employed.

2. Constraints

The constraints for VAr planning usually include contingency constraints, since security during contingencies is a primary objective in VAr planning. In some applications, contingencies are considered on a case-by-case basis. The advantage of this approach is the reduction of the problem to several smaller subproblems. The disadvantage is that the total amount of VAr support deemed necessary by the common set contained in the individual solutions will not necessarily be the optimal VAr support required. In order to guarantee a true optimal solution, it is necessary to consider the contingency cases jointly and solve them as one composite problem, at the expense of an increased problem size, especially for large systems.

3. Assumptions

A major concern in VAr planning is the nature of the variables being optimized. The shunt inductances and capacitances come in discrete form, and the inclusion of integer variables in the optimization requires special mixed-integer programming techniques that do not perform well in nonlinear power system applications. An approximation commonly adopted is to assume the variables as continuous during the optimization and clamp them to the nearest physical value at the optimal point. The solution may require reiteration with clamped quantities for a more accurate solution. An alternative method is to move the variable closer to a physically available value during optimization using penalty functions. The penalty parameter is controlled during optimization in such a way as to avoid forcing the variable to a physical value far from the optimal point assuming continuity. The disadvantage of this method is that the penalty function changes the function's convexity, and convergence can occur at a local rather than a global minimum. The first approximation is used in this study. It is assumed that the voltages at generator buses and the transformer tap ratios do not change during a contingency.

D. Mathematical Description of the Objective and Constraints for Type II Problem for VAr Planning

The objective function and constraints for VAr planning are described in mathematical form, following a brief description of notation not already described.
1. Mathematical Notation

$F(q)$ = VAr objective function to be minimized
$q_{ci,k}$ = Total capacitive VAr required at bus $i$ during contingency $k$
$q_{ri,k}$ = Total inductive VAr required at bus $i$ during contingency $k$
Existing capacitive VAr sites
Existing inductive VAr sites
$N_c$ = Number of capacitive VAr sites
$N_r$ = Number of inductive VAr sites
$N_k$ = Number of contingency cases
$S_{ci}$ = Unit cost of capacitive VAr in dollars
$S_{ri}$ = Unit cost of inductive VAr in dollars
$Q_{gi,k}$ = Reactive power generation at bus $i$ during contingency $k$
$V_{i,k}$ = Voltage at PQ buses during contingency $k$
$\theta_{i,k}$ = Phase angle at all but the slack bus during contingency $k$
$\bar{q}_{ci}$ = Mean of the optimal $q_{ci,k}$ averaged over all $k$

2. Mathematical Description of VAr Planning

The objective function to be minimized for VAr planning can be mathematically expressed as

(11.43)

where a negative value for the expressions inside braces is treated as zero. This formulation, although providing a true mathematical description of the problem (based on the assumptions), has the disadvantage of including discrete variables in the optimization, even when $q_c$ and $q_r$ are assumed continuous. An alternate solution is to use the average over all $k$ instead of the maximum as shown below.

(11.44)
The scalar quantity $1/N_k$ can be removed from the objective. The disadvantage of this method is shown in Figure 11.5, where we see that the average VAr requirement over all $k$ is kept low, but the maximum is very much larger. The amount of VAr to be installed at bus $i$ is determined by the maximum. Another possible alternative is to minimize the squared deviations from the average over all values of $k$. This will minimize large individual deviations from the average. An objection might arise here that the objective is no longer cost, but cost squared. The answer to this is that, unless equation 11.43 is minimized as specified, any other alternative is an approximation, and any approximation that gives the least expensive solution should be the choice for implementation. The objective function for the latter is given by

(11.45)

Equation 11.44 is linear while equation 11.45 is quadratic.

(11.46)

The equality constraints are given by
FIGURE 11.5 Optimal solution for site $i$ (VAr requirement plotted against contingency case $k = 1, \ldots, 6$).
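The difference between sizing a site for the worst contingency, for the average over contingencies, and penalizing squared deviations from the average can be illustrated with a few numbers. The sketch below is not the book's objective function; the helper names, the existing VAr of 5, and the six contingency requirements are hypothetical values chosen to mimic the situation of Figure 11.5, with negative differences treated as zero as described in the text.

```python
def max_based_requirement(q_by_contingency, q_existing):
    """Additional VAr if the site is sized for the worst contingency."""
    return max(0.0, max(q_by_contingency) - q_existing)


def average_based_requirement(q_by_contingency, q_existing):
    """Additional VAr if the average over contingencies is used instead."""
    mean_q = sum(q_by_contingency) / len(q_by_contingency)
    return max(0.0, mean_q - q_existing)


def squared_deviation_from_mean(q_by_contingency):
    """Sum of squared deviations from the mean, large when one case dominates."""
    mean_q = sum(q_by_contingency) / len(q_by_contingency)
    return sum((q - mean_q) ** 2 for q in q_by_contingency)


# Hypothetical capacitive VAr needs (MVAr) at one site for six contingency cases.
q_site = [10.0, 12.0, 11.0, 60.0, 9.0, 12.0]
worst = max_based_requirement(q_site, q_existing=5.0)
average = average_based_requirement(q_site, q_existing=5.0)
spread = squared_deviation_from_mean(q_site)
print(worst, average, spread)
```

In this example the average-based figure is far below what the worst case would actually require, while the squared-deviation measure is dominated by the single large contingency, which is the behavior the alternatives in the text are meant to address.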
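$$
Q_{i,k} - \underbrace{Q_{gi,k}}_{i \,\in\, \text{gen/synch}} + Q_{di} + \underbrace{q_{i,k}}_{i \,\in\, \text{VAr}} = 0, \qquad i = 1, \ldots, N_b, \quad k = 0, \ldots, N_k, \tag{11.47}
$$

where (note that $k = 0$ specifies the intact system. In cases where $k$ is not specified, $k = 0$ is assumed.)

$$
t_{kp1} - t_{kp2} = 0, \qquad kp1 = 1, \ldots, N_{pi}, \quad kp2 = 1, \ldots, N_{pi}, \quad i \in \text{sets of parallel t/f}, \tag{11.50}
$$

with the inclusion of parallel transformers.

The inequality constraints are given by

$$
\begin{aligned}
V_{i,\min} \le V_i \le V_{i,\max}, &\qquad i \in \text{PV buses} \\
V_{i,\min} \le V_{i,0} \le V_{i,\max}, &\qquad i \in \text{PQ buses} \\
V_{i,\min}^{c} \le V_{i,k} \le V_{i,\max}^{c}, &\qquad i \in \text{PQ buses}, \quad k = 1, \ldots, N_k \\
P_{gi,\min} \le P_{gi} \le P_{gi,\max} & \\
Q_{gi,\min} \le Q_{gi,k} \le Q_{gi,\max}, &\qquad i \in \text{PV buses}, \quad k = 1, \ldots, N_k \\
t_{i,\min} \le t_i \le t_{i,\max} & \\
-k_{vl} I_{l,\max} \le V_{i,k} - V_{j,k} \le k_{vl} I_{l,\max}, &\qquad l = 1, \ldots, N_l, \quad k = 1, \ldots, N_k \\
-k_{\theta l} I_{l,\max} \le \theta_{i,k} - \theta_{j,k} \le k_{\theta l} I_{l,\max}, &\qquad i, j \in l, \quad k = 1, \ldots, N_k
\end{aligned}
$$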
Note that generator bus voltages, transformer tap ratios, and active power generations are unchanged during contingencies. The transformer tap ratios control the optimization via the admittance matrix as before. We also have
i = 1, ..., $N_c$ (11.51)
A minimum VAr installation is desired to avoid installation of very small quantities at sites. This can be achieved by clamping the optimal VAr to zero or to the minimum VAr (whichever is closer), in the event that the optimal VAr is lower than the minimum allowed. After clamping, the solution may require reiteration for a more accurate solution by using
$$0 \le q_{ri,k} \le q_{ri,\max} \tag{11.52}$$
$$q_{ri,\min} \le q_{ri,k} \le q_{ri,\max}. \tag{11.54}$$
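The clamping rule described above can be sketched in a few lines. This is only an illustration: the function name `clamp_var` and the tie-breaking choice (a value exactly halfway is rounded up to the minimum) are assumptions, not part of the original formulation.

```python
def clamp_var(q_opt, q_min):
    """Snap a VAr value below the minimum installable size to 0 or q_min.

    q_opt: optimal VAr found by the optimization (hypothetical input)
    q_min: minimum VAr installation allowed at the site
    """
    if q_opt >= q_min:
        return q_opt  # large enough to install as computed
    # Below the minimum: pick whichever of 0 and q_min is closer.
    # Ties go to q_min (an arbitrary choice made for this sketch).
    return 0.0 if q_opt < q_min - q_opt else q_min


print(clamp_var(0.2, 1.0))  # closer to zero
print(clamp_var(0.8, 1.0))  # closer to the minimum
print(clamp_var(2.5, 1.0))  # unchanged
```

After clamping, the clamped sites would be held fixed (or bounded as in the inequalities above) and the optimization rerun.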
The control variables used in VAr planning additionally include
$V_i$, $i = 1, \ldots, N_g$
$P_{gi}$, $i = 1, \ldots, N_g$.
The second objective function is given as
$$F(q, P_g) = F^*(q) + \sum_{i=1}^{N_g}\left(\alpha_i + \beta_i P_{gi} + \gamma_i P_{gi}^2\right) \tag{11.55}$$
where $F^*(q)$ is a modification of $F(q)$ described earlier. The modifications consist of changes to $S_{ci}$ and $S_{ri}$ to convert the planning cost to an equivalent operations cost. Assuming that money is borrowed at interest r, and the life of the equipment is n years, we can write:
$$s_{ci}\ (\$/\text{VAr})/\text{hr} = \frac{(1 + r)^n\, S_{ci}\ (\$/\text{VAr})}{n \times 365 \times 24} \tag{11.56}$$
(11.57)
$P_{gi}$, $i = 1, \ldots, N_g$, $i \in$ slack bus.
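As a rough numerical illustration of converting a planning cost to an hourly operations cost, the sketch below assumes the compound-interest reading of equation 11.56: the installed cost grows by $(1 + r)^n$ over the equipment life and is spread evenly over $n \times 365 \times 24$ hours. The function name and the sample numbers are hypothetical.

```python
def hourly_equivalent_cost(unit_cost, r, n):
    """Equivalent hourly cost of a one-time installation cost.

    unit_cost: installation cost in $/VAr
    r:         annual interest rate (0.08 means 8%)
    n:         equipment life in years
    Assumes the form (1 + r)**n * unit_cost / (n * 365 * 24).
    """
    hours_in_service = n * 365 * 24
    return (1.0 + r) ** n * unit_cost / hours_in_service


# Hypothetical figures: $10 per VAr, 8% interest, 20-year life.
print(hourly_equivalent_cost(10.0, 0.08, 20))
```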
V. OPF - ADDING ENVIRONMENTAL CONSTRAINTS
The Clean Air Act Amendments (CAAA) of 1990 require the power industry to reduce its SO2 emissions level by 10 million tons per year from the 1980 level, and its NOx levels by about 2 million tons per year. The SO2 provisions of the Act are to be implemented in phases: Phase I, which began in 1995, required 262 generating units from 110 power plants to limit their SO2 emissions to 2.5 lb/MBTU, and Phase II, beginning in 2000, requires all units to emit under 1.2 lb/MBTU. To prevent utilities from shifting emissions from Phase I to Phase II units, an underutilization (or "burn") provision mandates a minimum generation level at the Phase I units. If this provision is not met, the utility concerned has to either surrender allowances proportionally, or designate one or more Phase II units as compensating units subject to the same restrictions as Phase I units. Energy conservation measures and unexpected demand reductions are taken into account for the underutilization constraint.
A. Modeling Issues for Environmental Constraint
There are various ways to model environmental constraints. The model adopted here assumes that both SO2 and NOx emissions can be expressed as separable quadratic functions of the real power output of the individual generating units. More specifically, the same heat-rate functions are used for calculating the fuel and each of the emission types. For configuration i, the SO2 emission constraints can be expressed as
$$S_i(u_i) \le E_{S\max},$$
where
$$S_i = \sum_{j \in \Theta_i} \alpha_j H_j(P_j) \tag{11.60}$$
and
$E_{S\max}$ = SO2 upper limit for the power system being analyzed
$\Theta_i$ = Set of all committed Phase I units, for the ith configuration
$\alpha_j$ = Appropriate conversion coefficient for the jth unit
$H_j(P_j)$ = Heat rate for unit j, expressed as a quadratic form $a_j P_j^2 + b_j P_j + c_j$
$P_j$ = Real power output of unit j for the ith configuration.
The NOx can be similarly expressed as
$$N_i(u_i) \le E_{N\max}, \tag{11.61}$$
where $N_i$ is defined in the same way as $S_i$ (11.62). The underutilization ("burn") constraints are of the form:
$$\sum_{j \in \Theta_i} r_j H_j(P_j) \ge B_{\min}, \tag{11.63}$$
where $B_{\min}$ is the required minimum generation at the Phase I units. An important feature of the Act is a provision that allows utilities to trade and bank emission allowances (granted to all Phase I units) with the proviso that the national upper limit of 10 million tons be met. An annual auction is held to promote trading of allowances. One way to model allowance trading is to add an extra penalty term to the objective function reflecting the current market price of allowances, and to relax the maximum number of allowances that can be purchased. Since the emission constraints are all specific to a given configuration, they can be written in the generic form:
$$E_i(Z_i) \le 0. \tag{11.64}$$
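To make the emission model concrete, the sketch below evaluates a total SO2 emission as a sum of conversion coefficients times quadratic heat-rate functions and compares it with an upper limit. All unit data, the limit, and the names (`heat_rate`, `so2_emission`, the dictionary keys) are made up for illustration; only the functional form comes from the text.

```python
def heat_rate(p, a, b, c):
    """Quadratic heat-rate function H(P) = a*P**2 + b*P + c."""
    return a * p * p + b * p + c


def so2_emission(units):
    """Sum of alpha_j * H_j(P_j) over the committed Phase I units."""
    return sum(u["alpha"] * heat_rate(u["p"], u["a"], u["b"], u["c"])
               for u in units)


# Hypothetical data for two committed units.
units = [
    {"p": 100.0, "a": 0.001, "b": 8.0, "c": 100.0, "alpha": 0.002},
    {"p": 50.0, "a": 0.002, "b": 9.0, "c": 50.0, "alpha": 0.001},
]
limit = 3.0  # hypothetical SO2 upper limit

total = so2_emission(units)
# Generic form E(Z) <= 0: emission minus limit must not be positive.
print(total, total - limit <= 0.0)
```

Because the heat rate is quadratic in the unit output, the emission constraint stays quadratic and separable, which is what allows it to be added to the OPF alongside the fuel cost.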
VI. COMMONLY USED OPTIMIZATION TECHNIQUE (LP)
The following requirements need to be met by any contemplated solution technique for the OPF problem.
Reliability. The performance of OPF calculations must be reliable for application in real time. They must converge to realistic answers and, if not, adequate justifications must be provided. The more operationally stressed the power system is, the more mathematically difficult the OPF problem is to solve. Acceptance of OPF by the industry depends on reliable performance at all times; without it, OPF will not gain acceptance.
Speed. The OPF calculations involve computation of nonlinear objective functions and nonlinear constraints with tens of thousands of variables. This requires solution methods that converge fast.
Flexibility. The OPF solution methods simulate real-life power system operation and control situations, and new requirements are continually being defined for calculations. Therefore, robust and flexible OPF algorithms must accommodate and adapt to a wide range of objectives and constraint models.
Maintainability. Due to new knowledge of system models, and perceived priorities of objectives and constraints, an OPF algorithm must include a rule-based scheme and easy-to-maintain features for real-time application.
A. Linear Programming
The LP-based algorithm solves OPF problems as a succession of linear approximations:
Minimize
$$F(x^0 + \Delta x,\; u^0 + \Delta u) \tag{11.65}$$
Subject to:
$$g'(x^0 + \Delta x,\; u^0 + \Delta u) = 0 \tag{11.66a}$$
$$h'(x^0 + \Delta x,\; u^0 + \Delta u) \le 0, \tag{11.66b}$$
where
$x^0, u^0$ = Initial values of x and u
$\Delta x, \Delta u$ = Shift about this initial point
$g', h'$ = Linear approximations to the original nonlinear constraints.
The basic steps required in the LP-based OPF algorithm are as follows.
Step 1. Solve the power flow problem for nominal operating conditions.
Step 2. Linearize the OPF problem (express it in terms of changes about the current exact system operating point) by
1. Treating the limits of the monitored constraints as changes with respect to the values of these quantities, accurately calculated from the power flow.
2. Treating the incremental control variables $\Delta u$ as changes about the current control variable values (affected by shifting the cost curves).
Step 3. Linearize the incremental network model by
1. Constructing and factoring the network admittance matrix (unless it has not changed since the last time this was performed).
2. Expressing the incremental limits obtained in Step 2.2 in terms of the incremental control variables $\Delta u$.
Step 4. Solve the linearly constrained OPF problem by a special dual, piecewise linear relaxation LP algorithm, computing the incremental control variables.
Step 5. Update the control variables $u = u + \Delta u$ and solve the exact nonlinear power flow problem.
Step 6. If the changes in the control variables in Step 4 are below user-defined tolerances, the solution has been reached. If not, go to Step 4 and continue the cycle.
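The cycle of linearizing, solving an LP, updating, and checking the control changes can be illustrated on a toy problem. The sketch below is not an OPF: it minimizes a smooth function of two "controls" subject only to box limits, re-linearizes at every pass (a choice made for this sketch), and restricts each LP to a step window around the current point. With only bounds present, each LP is separable and can be solved by inspection, so no LP library is needed. The function name, the window-halving rule, and the test function are all assumptions.

```python
def successive_lp(f, grad, u0, lower, upper, step=1.0, tol=1e-6, max_iter=200):
    """Toy successive-LP loop for min f(u) subject to lower <= u <= upper."""
    u = list(u0)
    for _ in range(max_iter):
        g = grad(u)  # linearize the objective about the current point
        du = []
        for ui, gi, lo, hi in zip(u, g, lower, upper):
            d_min = max(-step, lo - ui)  # incremental limits on the change
            d_max = min(step, hi - ui)
            # LP "min gi * d" over [d_min, d_max]: go to the cheaper end.
            du.append(d_min if gi > 0 else d_max if gi < 0 else 0.0)
        if max(abs(d) for d in du) < tol:
            break  # control changes below tolerance: stop
        trial = [ui + di for ui, di in zip(u, du)]
        if f(trial) < f(u):
            u = trial      # accept the update
        else:
            step *= 0.5    # linearization was poor: shrink the window
    return u


f = lambda u: (u[0] - 1.0) ** 2 + (u[1] + 2.0) ** 2
grad = lambda u: [2.0 * (u[0] - 1.0), 2.0 * (u[1] + 2.0)]
print(successive_lp(f, grad, [0.0, 0.0], [-5.0, -5.0], [5.0, 5.0]))
```

In a real LP-based OPF the inner problem also carries the linearized network constraints, and an exact power flow is solved after every update.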
Notably, Step 4 is considered the key step since it determines the computational efficiency of the algorithm. The algorithm solves the network and tests operating limits in sparse form while performing minimization in the nonsparse part. For Steps 1 and 5, solving the exact nonlinear power flow problem $g(x, u) = 0$ is required to provide an accurate operating point $x^0$. This offers either a starting point for the optimization process or a new operating point following the rescheduling of control variables. The power flow solution may be performed using either the Newton-Raphson (N-R) power flow or the fast decoupled power flow (FDPF) technique. As shown in equation 11.65, the optimization problem that is solved at each iteration is a linear approximation of the actual optimization problem. Steps 2 and 3 in the linear programming-based OPF algorithm correspond to forming the linear network model and expressing it in terms of changes about the operating point. Linearized network constraint models may be derived using either a Jacobian-based coupled formulation given by
$$\begin{bmatrix} \Delta P \\ \Delta Q \end{bmatrix} = J \begin{bmatrix} \Delta\delta \\ \Delta V \end{bmatrix} \tag{11.67}$$
or
$$\Delta u_{PQ} = J\,\Delta x, \tag{11.68}$$
or a decoupled formulation based on the modified fast decoupled power flow equations expressed by
$$B'\Delta\delta = \Delta P, \qquad B''\Delta V = \Delta Q. \tag{11.69}$$
The latter is used in most applications of linear programming-based OPF. The linear coupled and decoupled network models are considered separately.
1. Definition of LP Problem Structure
Let m represent the number of constraints and n represent the number of decision variables. If m > n, the problem is overdetermined. If m = n, the linearized problem has a unique solution and cannot be optimized. For the case when n > m, values may arbitrarily be chosen for n − m of the variables, while values for the remaining variables are then uniquely defined. The m variables are termed the basis variables. The remaining n − m variables are termed nonbasis variables. During the course of the solution, variables are exchanged between the basis and nonbasis sets. At any given time, however, exactly m variables must reside in the basis for a problem with m equality constraints. The objective is to choose values for the n − m nonbasis variables that minimize $\Delta z$. The values for the remaining m variables are determined by the solution.
The linear programming algorithm changes only a single variable in the nonbasis set at a time, and this variable is given the subscript i. The remaining n − m − 1 nonbasis variables, denoted by j, remain constant. Hence, the $\Delta z$ equations can now be reformulated in terms of the basis set of variables b and a given nonbasis variable i:
$$\Delta z = c_b^T \Delta x_b + c_i \Delta x_i \tag{11.70}$$
$$B\,\Delta x_b + p_i \Delta x_i = 0 \tag{11.71}$$
$$\Delta x_j = 0, \tag{11.72}$$
where the subscript b represents the subset of variables termed the basis variables. The variables in this subset change during the solution procedure. A variable in $x_b$ may be from x , or x,~. The number of variables in $x_b$ is equal to the number of constraints.
$c_b$ = Subset of incremental costs associated with $x_b$
$x_i$ = Single nonbasis variable chosen to enter the basis
$p_i$ = Column of matrix A associated with $x_i$ that enters the basis
$c_i$ = Incremental cost associated with $x_i$.
Similar to the vector x, q can include elements fromboth Therefore
c, and c,.
Initially, xb = x,.
2. Linear Programming Iteration
The LP solution procedure is iterative (see Figure 11.6). Each iteration involves selecting a variable to enter the basis in exchange for a variable leaving the basis. The variable entering the basis is the one that achieves the greatest reduction in $\Delta z$ per unit of movement. The variable leaving the basis is the one that first reaches a limit or breakpoint. The basis matrix is then updated to reflect the exchange in variables along with any changes in sensitivities.
3. Selection of Variable to Enter Basis
To select a variable i to enter the basis, it is necessary to derive an equivalent cost and sensitivity that represent not only the movement of i itself, but also the simultaneous movement of all the basis variables. From equation 11.71,
FIGURE 11.6 Algorithmic steps for implementing the LP algorithm: compute $\pi$ from equation (11.77); compute $\bar{c}$ from equation (11.76); select the most negative $\bar{c}$; select a basis variable to leave the basis in exchange for the one entering the basis; update the basis matrix.
$$\Delta x_b = -B^{-1} p_i \Delta x_i = -\bar{p}_i \Delta x_i \tag{11.73}$$
$$\bar{p}_i = B^{-1} p_i \tag{11.74a}$$
or
$$B \bar{p}_i = p_i. \tag{11.74b}$$
Substituting equation 11.74b into equation 11.70 gives the equivalent cost sensitivity $\bar{c}_i$ with respect to variable i:
$$\Delta z = (c_i - c_b^T B^{-1} p_i)\Delta x_i = (c_i - \pi^T p_i)\Delta x_i = \bar{c}_i \Delta x_i \tag{11.75}$$
$$\bar{c}_i = c_i - \pi^T p_i \tag{11.76}$$
$$\pi = (B^{-1})^T c_b \tag{11.77a}$$
or
$$B^T \pi = c_b, \tag{11.77b}$$
where
$\bar{p}_i$ = Negative of the vector of sensitivities between $x_i$ and $x_b$
$\bar{c}_i$ = Composite incremental cost associated with $x_i$ and $x_b$
$\pi$ = Vector of sensitivities between the constraint limits and the objective function.
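The selection of the entering variable can be sketched numerically: solve $B^T \pi = c_b$ for $\pi$, form $\bar{c}_i = c_i - \pi^T p_i$ for each nonbasis column, and pick the most negative value. The small dense solver, the function names, and the 2 x 2 data below are assumptions made for illustration; a production code would reuse sparse factors of B.

```python
def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= factor * m[col][k]
    x = [0.0] * n
    for r in reversed(range(n)):
        s = sum(m[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def reduced_costs(B, c_b, columns, costs):
    """Return pi and the composite costs c_i - pi^T p_i."""
    B_transpose = [list(col) for col in zip(*B)]
    pi = solve(B_transpose, c_b)  # B^T pi = c_b
    c_bar = [c_i - sum(p * q for p, q in zip(pi, p_i))
             for p_i, c_i in zip(columns, costs)]
    return pi, c_bar


# Hypothetical basis matrix, basis costs, and two nonbasis columns.
B = [[2.0, 1.0], [0.0, 3.0]]
c_b = [4.0, 5.0]
columns = [[1.0, 1.0], [1.0, 0.0]]
costs = [2.0, 3.0]

pi, c_bar = reduced_costs(B, c_b, columns, costs)
entering = min(range(len(c_bar)), key=lambda i: c_bar[i])
print(pi, c_bar, entering)
```

If no composite cost is negative, no nonbasis variable can reduce $\Delta z$ and the current basis is optimal for the linearized problem.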
B. Linear Programming Applications in OPF
Example 1
The linear programming-based OPF problem formulation requires the objective function to be expressed as a set of separable, convex, and continuous cost curves:
In the LP formulation the active power and the reactive power problems are solved separately. In the active power problems the only controls are active power controls, and correspondingly in the reactive power problems the only controls are reactive power controls. The most common optimal power flow objective function is the active power production cost. The cost curve of a generator is obtained from the corresponding incremental heat rate (IHR) curve. For each segment of the piecewise linear IHR curve, the corresponding quadratic cost curve coefficients are found. This gives a piecewise quadratic cost curve for the unit. A piecewise linear cost curve is obtained by evaluating the piecewise quadratic cost curve at various MW values and creating linear segments between these points. This piecewise linear cost curve is what the optimization process uses during processing.
The cost curve for interchange with an external company represents the actual cost of exchanging power with that company. It is obtained from the transaction data by arranging the available blocks of power according to increasing cost.
No direct economic cost is associated with phase-shifters or load shedding. Thus, artificial cost curves have to be assigned to these controls. The cost curves may be thought of as penalty functions; that is, there is a cost penalty to be paid for moving these controls away from their initial value. The basic shape of the cost curve for a phase-shifter has been represented by
(11.78)
where $u_i$ is the variable (the phase-shifter angle), $u_i^0$ is the initial value of the control variable, and $k_i$ is the cost curve weighting factor.
Illustrative Example 2
The active power minimum control-shift minimization objective aims to limit the rescheduling of active power controls to the minimum amount necessary to relieve all constraint violations. If the initial power flow solution does not involve constraint violations, then no rescheduling is required. This problem is similar in many ways to the cost optimization problem. The control variables that may be used and the constraints that are observed are identical to those in the cost optimization problem. The difference is in the cost representing the generator MW outputs and the interchange control variables. These cost curves are now defined as piecewise linear approximations to the quadratic penalty terms. None of the control variables has an actual economic cost. They are all artificial costs. The minimum number of active controls minimization objective uses a linear V-shaped curve for each control, with zero value of cost at the target initial control value.
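Both examples above rely on piecewise linear approximations of quadratic curves. A minimal sketch of that construction follows: a quadratic cost curve is evaluated at equally spaced MW breakpoints and the slope of each connecting segment is recorded. The equal spacing, the function name, and the coefficients are assumptions chosen for illustration.

```python
def piecewise_linear_segments(a, b, c, p_min, p_max, n_segments):
    """Approximate cost(P) = a*P**2 + b*P + c by straight segments.

    Returns a list of (p_start, p_end, slope) tuples, one per segment.
    """
    width = (p_max - p_min) / n_segments
    points = [p_min + k * width for k in range(n_segments + 1)]
    costs = [a * p * p + b * p + c for p in points]
    return [(points[k], points[k + 1], (costs[k + 1] - costs[k]) / width)
            for k in range(n_segments)]


# Hypothetical unit: cost = 0.01 P^2 + 10 P + 100 between 100 and 300 MW.
segments = piecewise_linear_segments(0.01, 10.0, 100.0, 100.0, 300.0, 2)
for seg in segments:
    print(seg)
```

For a convex quadratic (a > 0) the segment slopes increase from left to right, which keeps the piecewise linear curve convex, as the LP formulation requires.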
In practice, the most sensitive controls are moved one at a time within their full available control range in order to eliminate constraint violations. The result is a minimum number of the controls being rescheduled.
Illustrative Example 3
The minimum control-shift and minimum number of control objectives can be used for reactive power optimization. The cost curves for the reactive power minimum-shift minimization objective are obtained from penalty functions. The minimum number of reactive controls minimization objective is the same as for the active power counterpart except that the V-shaped cost curves are used for the reactive power controls.
Illustrative Example 4
In the active power loss minimization problem the active power generation profile is held fixed and the reactive power profile is varied in order to achieve the minimum loss solution. In addition, phase-shifters can be used to change the MW flow pattern in such a way as to reduce losses. The objective function can be written as
$$F = \sum I^2 R. \tag{11.79}$$
This objective function is nonseparable. Since the LP program requires a linearized formulation, the approach then is to minimize the changes in system power losses. The change in system power losses $\Delta P_L$ is related to the control variable changes by
(11.80)
The sensitivities in this equation are obtained from a loss penalty factor calculation, using a transpose solution with the Jacobian matrix factors of the coupled model. In this coupled formulation all control variables are represented explicitly and the corresponding rows and columns are prebordered to the Jacobian matrix. This linear approximation is valid over a small region, which is established by imposing limits on the changes in control variables from their current values. This separable linearized objective is subject to the usual linearized network constraints. The linearized region must be sufficiently small relative to the local curvature of the nonlinear transmission loss hypersurface in order to achieve appreciable loss reduction. However, if the region is too small, the solution will require an excessive number of iterations. To cope with this problem a heuristic approach for contracting or expanding the linearized region can be invoked. Another difficulty is that the number of binding constraints may be significantly larger than for any other previously mentioned OPF problem, resulting in a longer solution time. Although not best suited for loss minimization, a well-tuned LP algorithm can successfully solve the problem in a reasonable time.
C. Interior Point
Since we discussed interior point methods in detail before, we restrict ourselves here to some observations pertinent to the preceding discussion of other general nonlinear programming methods. The current interest in interior point algorithms was sparked by Karmarkar's projective scaling algorithm for linear programming, which is based on two key ideas:
1. The steepest descent direction is much more effective in improving the iterate $X^k$ if the iterate is at the center of the polytope formed by the linear constraints rather than if it were closer to the boundary.
2. A transformation of the decision space can be found that places the iterate at the center of the polytope without altering the problem.
Under certain conditions, one can show that the projective scaling algorithm is equivalent to logarithmic barrier methods, which have a long history in linear and nonlinear programming. This led to the development of Mehrotra's primal-dual predictor-corrector method, an effective interior point approach. The main ideas in all these barrier methods are:
1. Convert functional inequalities to equalities and bound constraints using slack variables.
2. Replace bound constraints by adding them as additional terms in the objective function using logarithmic barriers.
3. Use Lagrange multipliers to add the equalities to the objective and thus transform the problem into an unconstrained optimization problem.
4. Use Newton's method to solve the first-order conditions for the stationary points of the unconstrained problem.
Interior point methods can be applied to OPFs by using a successive linear programming technique, and employing an interior point method for solving the linear programs. The other way is to directly apply interior point methods to the NLP formulation using the relation to barrier methods as outlined above.
1. OPF Formulation (Method II)
Objective Function (Loss Minimization)
Minimize $P_L$
Subject to:
$$P_{Gi} - P_{Di} - F_i(V, \theta, T) = 0, \qquad i = 1, 2, \ldots, N_{bus}, \quad i \ne \text{slack}.$$
For this type of OPF (min loss), a control variable vector $U$ and a dependent variable vector $X$ are defined. Write the OPF problem as the following mathematical programming problem.
$$\text{Min } F = \tfrac{1}{2} U^T G U + R^T X + C$$
Subject to:
$$h(U, X) = 0$$
$$g(U, X) \le 0$$
$$X_{\min} \le X \le X_{\max}$$
$$U_{\min} \le U \le U_{\max}.$$
Solve the OPF problem using quadratic programming and/or the quadratic interior point method (see Figure 11.7). Thus it is necessary to linearize the nonlinear constraints around the base load flow solution for small disturbances. The dependent variable of load flow $X$ can also be eliminated using the implicit functions of the control variable $U$. First linearize the equality constraint $h(U, X)$,
$$\Delta X = -\left(h_x^T\right)^{-1}\left(h_u^T \Delta U + h\right), \tag{a}$$
and linearize the inequality constraint $g(U, X)$,
FIGURE 11.7 OPF implementation flowchart by quadratic interior point method. The flowchart proceeds as follows: input power system data, including power flow and OPF data; run the initial power flow to obtain the base case solution; select the objective function of OPF (e.g., loss minimization) and convert it to quadratic form; determine the control variable U and the dependent variable X (e.g., for economic dispatch, $U = [P_g, P_L, T]^T$ and $X = [Q_G, V_D, \theta]^T$); select the OPF constraints and linearize them about the base case power flow solution; establish the quadratic programming (QP) OPF model; obtain the QP parameters A, R, G, b, and C; compute $B^k = \bar{A} D^k$; calculate the optimal step sizes $\beta_1$ and $\beta_2$ using Steps 4 and 5; update $U^{k+1} = U^k + \beta D^k dp^k$; run the power flow and check the constraints; print/display the optimal power flow solution.
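$$g_u^T \Delta U + g_x^T \Delta X + g \le 0. \tag{b}$$
Combine (a) and (b),
$$\left(g_u^T - g_x^T \left(h_x^T\right)^{-1} h_u^T\right)\Delta U + \left(g - g_x^T \left(h_x^T\right)^{-1} h\right) \le 0.$$
Thus, the quadratic form of the OPF problem is
$$F = \tfrac{1}{2} \Delta U^T G\, \Delta U + R^T \Delta U + C$$
Subject to: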
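The quadratic form of the OPF can be expressed as follows.
Minimize
$$F = \tfrac{1}{2} U^T G U + R^T U + C$$
Subject to:
$$b_{\min} \le A U \le b_{\max}$$
or
Minimize
$$F = \tfrac{1}{2} \bar{U}^T \bar{G} \bar{U} + \bar{R}^T \bar{U} + C$$
Subject to:
$$\bar{A}\bar{U} = \bar{b},$$
where $\bar{R} = [R, 0]^T$, $\bar{U} = [U, S_1, S_2]$, $\bar{b} = [b_{\min}, b_{\max}]$, and $\bar{G}$ and $\bar{A}$ are the correspondingly augmented block matrices built from G, A, zero blocks, and I, where I is an identity matrix. Solve this problem using the interior point method, starting at an initial feasible point $\bar{U}^0$ at k = 0.
Step 1. $D^k = \text{diag}[\bar{u}_1^k, \ldots, \bar{u}_n^k]$.
Step 2. $B^k = \bar{A} D^k$.
Step 3. $dp^k = \{(B^k)^T (B^k (B^k)^T)^{-1} B^k - I\} D^k (\bar{G}\bar{U}^k + \bar{R})$.
Step 4. $\beta_1 = -1/r$ if $r < 0$, and $\beta_1 = 10^6$ if $r \ge 0$, where $r = \min\{dp_j^k,\ j \in (s_1, s_2)\}$.
Step 5. Compute $\beta_2$, where $T = (D^k dp^k)^T \bar{G} (D^k dp^k)$.
Step 6. $\bar{U}^{k+1} = \bar{U}^k + \beta D^k dp^k$, where $\beta = \min\{\beta_1, \beta_2\}$. Set k := k + 1, and go to Step 2. End when $dp < \varepsilon$.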
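The logarithmic barrier idea that underlies the interior point methods of this section can be shown on a one-variable problem: minimize $c\,x$ subject to $lo \le x \le hi$. The sketch below is not the quadratic interior point algorithm listed above; it simply replaces the bounds by log-barrier terms, applies Newton's method to the barrier function, and shrinks the barrier weight. The function name, the shrink factor, and the rule that keeps each step strictly inside the bounds are assumptions.

```python
def barrier_minimize(c, lo, hi, mu=1.0, shrink=0.2, mu_min=1e-8, newton_iters=50):
    """Minimize c*x on lo <= x <= hi with a log barrier and Newton steps."""
    x = 0.5 * (lo + hi)  # start at the center of the interval
    while mu > mu_min:
        for _ in range(newton_iters):
            # Barrier function: c*x - mu*log(x - lo) - mu*log(hi - x)
            grad = c - mu / (x - lo) + mu / (hi - x)
            hess = mu / (x - lo) ** 2 + mu / (hi - x) ** 2
            step = -grad / hess
            t = 1.0
            # Keep the iterate interior: never give up more than 99% of
            # the current distance to either bound in a single step.
            while (x + t * step <= lo + 0.01 * (x - lo)
                   or x + t * step >= hi - 0.01 * (hi - x)):
                t *= 0.5
            x += t * step
            if abs(t * step) < 1e-12:
                break
        mu *= shrink  # reduce the barrier weight and re-solve
    return x


print(barrier_minimize(1.0, 1.0, 3.0))   # approaches the lower bound
print(barrier_minimize(-1.0, 1.0, 3.0))  # approaches the upper bound
```

The iterates approach the optimal bound from the interior as the barrier weight goes to zero, which is the behavior the slack-variable and barrier steps above are designed to produce.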
VII. COMMONLY USED OPTIMIZATION TECHNIQUE (NLP)
A. Nonlinear Programming
Consider an objective function $f(X)$; the negative gradient of $f(X)$, $-\nabla f(X)$, is a direction vector that points towards decreasing values of $f(X)$. This direction is a descent direction for $f(X)$. Disregard the constraints for the moment. That is, assume that the problem is unconstrained. Then, the optimal solution can be obtained using the following algorithm.
Step 1. Assume an initial guess $X^0$.
Step 2. Find a descent direction $D^k$.
Step 3. Find a step length $\alpha^k$ to be taken.
Step 4. Set $X^{k+1} = X^k + \alpha^k D^k$.
Step 5. If $\|X^{k+1} - X^k\| \le \varepsilon$, stop. $X^{k+1}$ is declared to be the solution, where $\varepsilon$ is a tolerance parameter.
Step 6. Increment k. Go to Step 2.
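The six steps above can be sketched directly. The version below uses the steepest descent direction for Step 2 and a simple backtracking rule for Step 3; both are choices made for this sketch, as are the function name, the sufficient-decrease constant, and the test function.

```python
import math


def steepest_descent(f, grad, x0, eps=1e-8, max_iter=10000):
    """Unconstrained descent loop following Steps 1 to 6."""
    x = list(x0)                                   # Step 1: initial guess
    for _ in range(max_iter):
        d = [-g for g in grad(x)]                  # Step 2: descent direction
        fx = f(x)
        slope = sum(di * di for di in d)
        alpha = 1.0                                # Step 3: backtracking
        while alpha > 1e-16:
            x_new = [xi + alpha * di for xi, di in zip(x, d)]
            if f(x_new) <= fx - 1e-4 * alpha * slope:
                break                              # sufficient decrease
            alpha *= 0.5
        moved = math.sqrt(sum((a - b) ** 2 for a, b in zip(x_new, x)))
        x = x_new                                  # Step 4: update
        if moved <= eps:                           # Step 5: stopping test
            break
    return x                                       # Step 6 is the loop


f = lambda x: (x[0] - 3.0) ** 2 + 2.0 * (x[1] + 1.0) ** 2
grad = lambda x: [2.0 * (x[0] - 3.0), 4.0 * (x[1] + 1.0)]
print(steepest_descent(f, grad, [0.0, 0.0]))
```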
1. Finding the Descent Direction
There are several ways to obtain $D^k$, with the simplest being to set $D^k$ equal to $-\nabla f(X^k)$. This is the steepest descent direction. A more efficient approach is the Newton method, which obtains $D^k$ by solving the following system of equations.
$$\nabla^2 f(X^k) D^k = -\nabla f(X^k), \tag{11.81}$$
where $\nabla^2 f(X^k)$ is the Hessian matrix evaluated at $X^k$. In Newton's approach, there are two alternatives to evaluate expressions for the Hessian. One is to approximate it using the finite difference method. Another involves using automatic differentiation software. The exact computation of the Hessian can be time-consuming or difficult. Also, the Hessian may not be positive-definite as required by Newton's method. Quasi-Newton methods build up an approximation to the Hessian at a given point $X^k$ using the gradient information from previous iterations. $D^k$ is obtained by solving:
$$B^k D^k = -\nabla f(X^k), \tag{11.82}$$
where $B^k$ is the approximate Hessian for the point $X^k$.
2. Finding the Step Length
The step length $\alpha^k$ is required to be positive and such that $f(X^{k+1}) < f(X^k)$. The value of $\alpha^k$ can be obtained by solving the following one-dimensional optimization problem.
$$\min_{\alpha}\; f(X^k + \alpha D^k). \tag{11.83}$$
The problem is typically solved by a fast procedure (such as quadratic or cubic interpolation) which is very approximate, since a precise solution is not needed. The above discussion has used a type of minimization technique called the line-search method. A less common technique, referred to as the trust-region approach, calculates the next iterate using $X^{k+1} = X^k + s^k$, where $s^k$ is chosen so as to obtain "sufficient decrease" in $f(X)$. Trust-region methods are useful when the Hessian is indefinite.
3. Treatment of the Constraints
Now we consider how the equality and inequality constraints can be satisfied while minimizing the objective. The Lagrangian function plays a central role in constrained optimization.
(11.84)
where $\lambda \in \mathbb{R}^{a+b}$ is the vector of Lagrange multipliers, and $g_i$ and $h_j$ are the elements of the constraint vectors. The Lagrange multipliers measure the sensitivity of the objective function to the corresponding constraints. Estimating the proper value of these multipliers is an important issue in constrained optimization.
Another important (and in many ways, an alternative) function is the augmented Lagrangian function. Before discussing this function, it is useful to consider an equivalent formulation of the optimization problem. Each functional inequality can be replaced by an equality and a bound constraint. As a simple example of how this can be done, consider an inequality $h_1(z_1) \le 0$. This can be written as $h_1(z_1) + z_2 = 0$, and $z_2 \ge 0$, where $z_2$ is referred to as a slack variable. Thus, a number of new slack variables are introduced corresponding to the number of functional inequalities, resulting in the following formulation.
$$\min_{X}\; f(X) \tag{11.85}$$
Subject to:
$$C(X) = 0 \tag{11.86}$$
(11.87)
where $C$, defined on $\mathbb{R}^{m+n+b}$, is the vector function of equalities, and $X$ is the vector of decision variables obtained by augmenting the original vector with the slack variables. The augmented Lagrangian function can now be defined as
$$L_A(X, \lambda, \rho) = f(X) - \lambda^T C(X) + \tfrac{1}{2}\rho\, C(X)^T C(X), \tag{11.88}$$
where $\rho$ is some positive penalty constant. Choosing a proper value for the penalty parameter $\rho$ becomes an important issue. If $\rho$ is too large, the efficiency of the solution approach will be impaired. A large $\rho$ is also likely to lead to ill-conditioning of the Hessian of the augmented Lagrangian function and, consequently, cause difficulties for methods that rely on such a Hessian or a suitable approximation. If $\rho$ is too small, the solution approach may not be able to converge to a solution that satisfies equation 11.86. Depending on the solution approach to be discussed, either the original formulation or the formulation given in equations 11.85 to 11.87 is used. A concept that is used in many approaches is that of active or binding constraints. An inequality constraint is said to be active or binding if it is satisfied as an equality. Consider the inequality constraints. We can define a set such that $H_a(X) = 0$ is the active subset of the set of inequalities. Note that the composition of this subset varies with the iterate $X^k$. Thus $H_a(X^k)$ is the subset of active inequalities corresponding to $X^k$. By definition, all equality constraints are always active. So, the overall set of active constraints $A(X^k)$ is the union of the sets $G(X^k)$ and $H_a(X^k)$.
B. Sequential Quadratic Programming
This algorithm is an extension of the quasi-Newton method for constrained optimization. The method solves the original problem by repeatedly solving a quadratic programming approximation. A quadratic programming problem is a special case of an NLP problem wherein the objective function is quadratic and the constraints are linear. Both the quadratic approximation of the objective and the linear approximation of the constraints are based on Taylor series expansion of the nonlinear functions around the current iterate $X^k$. The objective function $f(X)$ is replaced by a quadratic approximation; thus:
$$q^k(D) = \nabla f(X^k)^T D + \tfrac{1}{2} D^T \nabla_{xx}^2 L(X^k, \lambda^k) D. \tag{11.89}$$
The step $D^k$ is calculated by solving the following quadratic programming subproblem.
$$\min_{D}\; q^k(D) \tag{11.90}$$
Subject to:
$$H(X^k) + I(X^k) D \le 0, \tag{11.91}$$
$$G(X^k) + J(X^k) D = 0 \tag{11.92}$$
where J and I are the Jacobian matrices corresponding to the constraint vectors G and H, respectively. The Hessian of the Lagrangian $\nabla_{xx}^2 L(X^k, \lambda^k)$ that appears in the objective function, equation 11.90, is computed using a quasi-Newton approximation. Once $D^k$ is computed by solving equations 11.90 to 11.92, X is updated using
$$X^{k+1} = X^k + \alpha^k D^k, \tag{11.93}$$
where $\alpha^k$ is the step length. Finding $\alpha^k$ is more complicated in the constrained case. This is because $\alpha^k$ must be chosen to minimize constraint violations in addition to minimizing the objective in the chosen direction $D^k$. These two criteria are often conflicting and thus a merit function is employed to reflect the relative importance of these two aims. There are several ways to choose a merit function with one choice being:
where $\nu \in \mathbb{R}^{a+b}$ is the vector of positive penalty parameters, and $g_i$ and $h_i$ are elements of the constraint vectors $G(X)$ and $H(X)$, respectively. For the merit function $P_1(X)$ as defined in equation 11.94, the choice of $\nu$ is defined by the following criterion,
$$\nu_i \ge |\lambda_i|, \qquad i = 1, 2, \ldots, a, a+1, \ldots, b,$$
where the $\lambda_i$ are Lagrange multipliers from the solution of the quadratic programming subproblem of equations 11.90 to 11.92 that defines $D^k$. Furthermore, the step length $\alpha^k$ is chosen so as to approximately minimize the function given by
PI ( X k + aDk,v). A different merit function that can be used isknown as the augmented Lagrangian merit function: a
LA(x,A, v) =f ( x ) -
Akgi i= 1
+ 5l
b j= I
+
b @ j - a ( Xv,j ,
A?),
j=a
(1 1.95) where
and $g_i$ and $h_i$ are elements of the constraint functions $G(X)$ and $H(X)$, respectively, $\nu$ is the vector of positive penalty parameters, and $\lambda_i$ are Lagrange multipliers from the solution of the quadratic programming subproblem given by equations 11.90 to 11.92 that defines $D^k$. If equation 11.95 is used as the merit function, the step length $\alpha^k$ is chosen to approximately minimize
$$L_A\left(X^k + \alpha D^k,\; \lambda^k + \alpha(\lambda^{k+1} - \lambda^k),\; \nu\right),$$
where $D^k$ is the solution of the quadratic programming subproblem given by equations 11.90 to 11.92 and $\lambda^{k+1}$ is the associated Lagrange multiplier.
C. Augmented Lagrangian Methods
These methods are based on successive minimization of the augmented Lagrangian function in equation 11.88 corresponding to the NLP formulation. Therefore these methods solve the following subproblem successively.
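$$\min_{X}\; L_A(X, \lambda^k, \rho^k) \tag{11.96}$$
Subject to:
$$X^{\min} \le X \le X^{\max}, \tag{11.97}$$
where $L_A$ is the augmented Lagrangian function, $\lambda^k$ is the vector of the Lagrange multipliers that is updated every iteration, and $\rho^k$ is the positive penalty parameter that is updated heuristically. By solving equations 11.96 and 11.97, we get $X^{k+1}$. Then $\lambda^{k+1}$ is obtained using
$$\lambda^{k+1} = \lambda^k - \rho^k C(X^{k+1}), \tag{11.98}$$
where the sign of the update follows from the sign convention of equation 11.88.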
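A minimal sketch of successive augmented Lagrangian minimization is given below for a two-variable problem with a single equality constraint and no bounds. It assumes the sign convention $L_A = f - \lambda C + \tfrac{1}{2}\rho C^2$, for which the multiplier update is $\lambda \leftarrow \lambda - \rho\, C(X)$. The inner solver is a plain fixed-step gradient descent whose step size was chosen by hand for this particular example; the function names, the fixed penalty, and the test problem are all assumptions.

```python
def minimize_gd(grad, x, step, iters=2000):
    """Crude inner solver: fixed-step gradient descent."""
    for _ in range(iters):
        x = [xi - step * gi for xi, gi in zip(x, grad(x))]
    return x


def method_of_multipliers(grad_f, c, grad_c, x0, rho=10.0, step=0.04, outer=20):
    """Repeatedly minimize L_A = f - lam*c + 0.5*rho*c**2, then update lam."""
    x, lam = list(x0), 0.0
    for _ in range(outer):
        def grad_la(z, lam=lam):
            cz = c(z)
            # Gradient of L_A: grad f - lam*grad c + rho*c*grad c
            return [gf + (rho * cz - lam) * gc
                    for gf, gc in zip(grad_f(z), grad_c(z))]
        x = minimize_gd(grad_la, x, step)
        lam -= rho * c(x)  # multiplier update for this sign convention
    return x, lam


# Example: minimize x^2 + y^2 subject to x + y - 1 = 0.
x, lam = method_of_multipliers(
    grad_f=lambda z: [2.0 * z[0], 2.0 * z[1]],
    c=lambda z: z[0] + z[1] - 1.0,
    grad_c=lambda z: [1.0, 1.0],
    x0=[0.0, 0.0],
)
print(x, lam)
```

With a moderate fixed penalty the constraint violation shrinks at every outer pass because the multiplier estimate improves, which is why the penalty does not have to be driven to very large values.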
This method is relatively unexplored for the OPF problem.
D. Generalized Reduced Gradients
The generalized reduced gradients (GRG) class uses the equality constraints to eliminate a subset of decision variables to obtain a simpler problem. We partition the decision vector $X$ into two vectors $X_B$ and $X_N$. $X_B$ is the vector of basic variables that we want to eliminate using the equality constraints. $X_N$ is the vector of the remaining variables, called nonbasic variables. Then,
$$X_B = W(X_N), \tag{11.99}$$
where $W(\cdot)$ is chosen such that:
$$C(W(X_N), X_N) = 0. \tag{11.100}$$
The mapping $W(\cdot)$ as in equation 11.99 is usually defined by the implicit relation in equation 11.100. Updates of $X_B$ are obtained by solving equation 11.100 using an appropriate procedure such as Newton's method. For example, a Newton update of $X_B$ is of the form:
(11.101)
The problem can be formulated as the reduced problem:
$$\min_{X_N}\; f(W(X_N), X_N) \tag{11.102a}$$
Subject to:
$$X_N^{\min} \le X_N \le X_N^{\max}. \tag{11.102b}$$
The vector $X_N$ can in turn be partitioned into two sets: $X_F$ and $X_S$. The fixed variables $X_F$ are held at either their lower or upper bounds during the current iteration. The superbasic variables $X_S$ are free to move within their bounds. Thus, the reduced problem given by equations 11.102a and 11.102b is solved through successive minimization of the following subproblem.

Minimize over $X_S$: $f(W(X_S, X_F), X_S, X_F)$    (11.103a)

Subject to the bound constraints of equations 11.103b and 11.103c.

Since the constraints in equations 11.103b and 11.103c are simple bound constraints, the subproblem can be solved by using the negative gradient of the objective function in equation 11.103a as the descent direction. This gradient, $-\nabla_{X_S} f(W(X_S, X_F), X_S, X_F)$, is referred to as the reduced gradient since it involves only a subset of the original decision variables. If a superbasic variable violates one of its bounds, it is converted into a fixed variable held at that bound. Note that to solve equations 11.103a to 11.103c, some GRG methods use the reduced Hessian of the objective function to obtain a search direction. Instead of computing the reduced Hessian directly, quasi-Newton schemes are employed in the space of the superbasic variables.

Each solution of equations 11.103a to 11.103c is called a minor iteration. At the end of each minor iteration, the basic variable vector $X_B$ is updated using equation 11.101. At this point, a check is made to see whether certain elements can be moved from the fixed variable set $X_F$ into the superbasic set $X_S$. The composition of all three vectors $X_B$, $X_F$, and $X_S$ is usually altered at the end of each minor iteration. In other words, the decision vector $X$ is repartitioned between the minor iterations.

The GRG scheme was applied to the OPF problem with the main motivation being the existence of the concept of state and control variables, with the load flow equations providing a natural basis for the elimination of the state variables. The availability of good load flow packages provides the needed sensitivity information for obtaining a reduced problem in the space of the control variables with the load flow equations and the associated state variables eliminated.

1. OPF Formulation Using Quadratic Programming Reduced Gradient Method
The objective function is cost minimization.
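Minimize $F(P_G) = \sum_i \left(\alpha_i P_{Gi}^2 + \beta_i P_{Gi} + \gamma_i\right)$

Subject to the system constraints on the control variables and the dependent variables.

Writing the OPF problem as a mathematical programming problem, we have:

Minimize $F = \frac{1}{2} U^T G U + R^T X + C$

Subject to the equality constraints $h(U, X) = 0$ and the inequality constraints $g(U, X) \le 0$.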
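Solve the OPF Problem Using QP (See Figure 11.8).

It is necessary to linearize the nonlinear constraints around the base load flow solution for small disturbances. The dependent variable of load flow $X$ can also be eliminated using the implicit function of the control variable $U$. First linearize the equality constraint $h(u, x)$:

$$\Delta X = -h_x^{-T}\left(h_u^T \Delta U + h\right). \qquad (a)$$

Now linearize the inequality constraint $g(u, x)$:

$$g_u^T \Delta U + g_x^T \Delta X + g \le 0. \qquad (b)$$

Combining (a) and (b), we get

$$\left(g_u^T - g_x^T h_x^{-T} h_u^T\right)\Delta U + \left(g - g_x^T h_x^{-T} h\right) \le 0.$$

Thus the quadratic form of the OPF problem is

Minimize $F = \frac{1}{2}\Delta U^T G \Delta U + R^T \Delta U + C$

Subject to:

$$\left(g_u^T - g_x^T h_x^{-T} h_u^T\right)\Delta U + \left(g - g_x^T h_x^{-T} h\right) \le 0$$
$$\Delta U_{\min} \le \Delta U \le \Delta U_{\max}$$
$$\Delta X_{\min} \le -h_x^{-T} h_u^T \Delta U \le \Delta X_{\max}$$

where

$$\Delta X_{\min} = X_{\min} - X_{\text{base}} + h_x^{-T} h$$
$$\Delta X_{\max} = X_{\max} - X_{\text{base}} + h_x^{-T} h$$
$$\Delta U_{\min} = U_{\min} - U_{\text{base}}$$
$$\Delta U_{\max} = U_{\max} - U_{\text{base}}.$$

The quadratic form of OPF can be expressed as follows.

Minimize $F = \frac{1}{2} U^T G U + R^T U + C$

Subject to: $b_{\min} \le AU \le b_{\max}$

or

Minimize $F = \frac{1}{2}\bar{U}^T \bar{G}\bar{U} + \bar{R}^T\bar{U} + C$

Subject to: $\bar{A}\bar{U} = \bar{b}$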
FIGURE 11.8 OPF implementation flowchart by quadratic programming method. The flowchart steps are: input the power system data, including power flow and OPF data; run an initial power flow to obtain the base case solution; select the objective function of the OPF (e.g., economic dispatch) and convert it to quadratic form; determine the control variable $U$ and the dependent variable $X$; select the OPF constraints and linearize them about the base case power flow solution; establish the quadratic programming (QP) OPF model and obtain the QP parameters; update $U$; run a power flow and check the constraints; print or display the optimal power flow solution.
where $\bar{G}$ is built from $G$ padded with zero blocks and $I$ is an identity matrix. Solve the OPF problem

Minimize $F = \frac{1}{2}\bar{U}^T \bar{G}\bar{U} + \bar{R}^T\bar{U} + C$

Subject to: $\bar{A}\bar{U} = \bar{b}$.
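Use quadratic programming with the reduced gradient method, and assume initial values of $\bar{U}^0$ at $k = 0$.

Step 1. Compute the Lagrange multiplier vector $\lambda^k$.
Step 2. Compute $\partial L/\partial \bar{U} = (\bar{G}\bar{U}^k + \bar{R}) - \bar{A}^T\lambda^k$.
Step 3. Let $\Delta\bar{U} = -(\partial L/\partial \bar{U})$.
Step 4. If $|\Delta\bar{U}| < \varepsilon$, stop; the optimal solution has been obtained. Otherwise go to the next step.
Step 5. Compute the optimal step size $\alpha$: $\alpha = -\left[(\bar{G}\bar{U}^k)^T\Delta\bar{U} + \bar{R}^T\Delta\bar{U}\right] / \left[(\Delta\bar{U})^T\bar{G}(\Delta\bar{U})\right]$.
Step 6. Update $\bar{U}^{k+1} = \bar{U}^k + \alpha\Delta\bar{U}$.
Step 7. Set $k = k + 1$ and go to Step 2.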
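The iteration above can be sketched for a small equality-constrained quadratic program. This is a minimal illustration under explicit assumptions, not the book's implementation: $\bar{G}$ is taken as diagonal, there is a single equality constraint $a^T U = b$, the multiplier in Step 1 is estimated by a least-squares projection of the cost gradient onto $a$, and the starting point is assumed feasible. The data reuse the cost coefficients of Illustrative Example 3 with an assumed lossless demand of 165 MW (the sum of the loads in Table 11.4) and no generation limits; the function name `reduced_gradient_qp` is hypothetical.

```python
def reduced_gradient_qp(g_diag, r, a, b, u0, tol=1e-9, max_iter=500):
    """Minimize 0.5*u'Gu + r'u subject to a'u = b, with G = diag(g_diag)."""
    if abs(sum(ai * ui for ai, ui in zip(a, u0)) - b) > 1e-9:
        raise ValueError("starting point must satisfy the equality constraint")
    u = list(u0)
    lam = 0.0
    for k in range(max_iter):
        # Gradient of the objective, G u + r.
        grad = [gi * ui + ri for gi, ui, ri in zip(g_diag, u, r)]
        # Step 1 (assumed form): least-squares multiplier estimate.
        lam = sum(ai * gr for ai, gr in zip(a, grad)) / sum(ai * ai for ai in a)
        # Steps 2 and 3: negative gradient of the Lagrangian; it lies in the
        # null space of a', so each step keeps a'u = b.
        du = [-(gr - ai * lam) for gr, ai in zip(grad, a)]
        # Step 4: convergence test.
        if max(abs(d) for d in du) < tol:
            return u, lam, k
        # Step 5: exact line search along du for the quadratic objective.
        num = sum(gr * d for gr, d in zip(grad, du))
        den = sum(gi * d * d for gi, d in zip(g_diag, du))
        alpha = -num / den
        # Step 6: update the controls.
        u = [ui + alpha * d for ui, d in zip(u, du)]
    return u, lam, max_iter


# Cost F_i = c2_i * P_i^2 + c1_i * P_i + c0_i, so G_ii = 2 * c2_i and r_i = c1_i.
g_diag = [2 * 0.0060, 2 * 0.0075, 2 * 0.0070]
r = [2.0, 1.5, 1.8]
p_opt, lam_opt, iters = reduced_gradient_qp(g_diag, r, [1.0, 1.0, 1.0], 165.0,
                                            [55.0, 55.0, 55.0])
```

At convergence every unit operates at the same incremental cost, which equals the multiplier returned by the routine. With inequality constraints, the slack variables of the augmented problem would add bound handling that this sketch leaves out.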
E. Projected Augmented Lagrangian

The projected augmented Lagrangian method successively solves subproblems of the form:

Minimize over $X$: $L_A(X, X^k, \lambda^k, \rho)$    (11.104a)

Subject to: $C^k(X, X^k) = 0$    (11.104b)

$X_{\min} \le X \le X_{\max}$,    (11.104c)

where

$$L_A(X, X^k, \lambda^k, \rho) = f(X) - (\lambda^k)^T (C - C^k) + \tfrac{1}{2}\rho\,(C - C^k)^T (C - C^k),$$
$$C^k(X, X^k) = C(X^k) + J(X^k)\,[X - X^k],$$

with

$X^k$ = Solution obtained from the kth iteration
$J(X^k)$ = Jacobian matrix of $C(X)$ evaluated at $X^k$
$\lambda^k$ = Vector of Lagrange multipliers corresponding to $X^k$
$\rho$ = Penalty parameter, adjusted heuristically.

The procedure used to solve each subproblem of equations 11.104a to 11.104c is similar to the GRG scheme. The variables are partitioned into basic, superbasic, and nonbasic variables. The nonbasic variables are held at one of their bounds. The basic variables are eliminated using the linearized constraints, equation 11.104b, and the reduced problem is solved to obtain a new value for the superbasic variables. If a superbasic variable reaches one of its bounds, it is converted to a nonbasic variable.
The gradient of the augmented Lagrangian is thus projected into the space of the active constraints. The active constraints can be written in terms of

$$A = \begin{bmatrix} B & S & N \\ 0 & 0 & I \end{bmatrix},$$

where $A$ is the active constraint matrix. Consider the operator

$$W = \begin{bmatrix} -B^{-1}S \\ I \\ 0 \end{bmatrix},$$

which has the property that $AW = 0$. That is, this operator $W$ effects a transformation into the null space of the active constraints. Let $-\nabla L_A$ be the negative gradient of the objective function in equation 11.104a. Then the vector $-W^T \nabla L_A$ is the negative reduced gradient of the objective function, and points in the direction that lowers the objective without violating the active constraints. Let $\nabla^2 L_A$ be the Hessian of the objective function. Then the matrix $W^T(\nabla^2 L_A)W$ is the reduced Hessian. If the reduced Hessian is positive definite and if the reduced gradient is nonzero, then a feasible descent direction $D^k$ can be obtained using

$$W^T(\nabla^2 L_A)W\,D_S^k = -W^T\nabla L_A \quad\text{and}\quad D^k = W D_S^k.$$
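The null-space property $AW = 0$ and the reduced Newton direction can be checked numerically on a tiny case. The sketch below assumes two basic variables, one superbasic variable, and one nonbasic variable; the matrix entries, the gradient, the diagonal Hessian, and all function names are illustrative assumptions rather than data from the text.

```python
def solve_2x2(B, rhs):
    """Solve B y = rhs for a 2x2 matrix B by Cramer's rule."""
    det = B[0][0] * B[1][1] - B[0][1] * B[1][0]
    y0 = (rhs[0] * B[1][1] - B[0][1] * rhs[1]) / det
    y1 = (B[0][0] * rhs[1] - B[1][0] * rhs[0]) / det
    return [y0, y1]


def null_space_vector(B, S):
    """Return W = [-B^{-1} S; 1; 0] for one superbasic and one nonbasic variable."""
    y = solve_2x2(B, S)
    return [-y[0], -y[1], 1.0, 0.0]


def mat_vec(A, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]


def reduced_newton_direction(W, grad, hess_diag):
    """Solve W'HW d_s = -W'g for a diagonal Hessian H, then return D = W d_s."""
    reduced_grad = sum(w * g for w, g in zip(W, grad))
    reduced_hess = sum(h * w * w for h, w in zip(hess_diag, W))
    d_s = -reduced_grad / reduced_hess
    return [w * d_s for w in W]


# Illustrative blocks of the active constraint matrix A = [[B, S, N], [0, 0, I]].
B = [[2.0, 1.0], [1.0, 3.0]]
S = [1.0, 2.0]
N = [0.0, 1.0]
A = [B[0] + [S[0], N[0]], B[1] + [S[1], N[1]], [0.0, 0.0, 0.0, 1.0]]

W = null_space_vector(B, S)
grad = [1.0, -2.0, 0.5, 3.0]          # assumed gradient of the objective
D = reduced_newton_direction(W, grad, [2.0, 2.0, 2.0, 2.0])
```

Here `mat_vec(A, W)` and `mat_vec(A, D)` are zero to rounding error, so a step along `D` leaves the active constraints satisfied, and the inner product of `grad` with `D` is negative, confirming a descent direction.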
A popular implementation uses a quasi-Newton approach to finding the search direction $D^k$, by replacing the Hessian with a positive definite approximation. Given $D^k$, a new estimate of $X$ can be obtained as $X^{k+1} = X^k + \alpha_k D^k$, where $\alpha_k$ is obtained using a line search such that it lowers the value of the objective in equation 11.104a.
F. Discussion on Nonlinear OPF Algorithms

The GRG method was one of the first to be used in OPF packages. Its main attraction is its ability to use standard load flow methods to eliminate the power flow equalities and obtain a reduced problem that is easier to solve. The sequential quadratic programming (SQP) method is better able to handle nonlinear objectives and constraints present in the OPF. However, sequential quadratic programming is currently not competitive for large-scale systems. The same is true of the projected augmented Lagrangian approach, which does not seem to have a future for OPF applications. The interior point method is presently the most favored method for OPFs, because of the robustness of the underlying approach.

1. Decomposition Strategies

We present some decomposition strategies that can be instrumental in solving extended optimal power flow formulations such as security-constrained OPF. All decomposition strategies aim to solve $N + 1$ subproblems independently. First, we discuss adding security constraints to the OPF formulation.

2. Adding Security Constraints
The traditional notion of security has relied almost exclusively on preventive control. That is, the requirement has been that the current operating point be feasible in the event of the occurrence of a given subset of the set of all possible contingencies. In other words, the base-case control variables are adjusted to satisfy postcontingency constraints that are added to the original formulation:

Minimize $f(X_0)$    (11.105a)

Subject to: $G_i(U_0, Z_i) = 0, \quad i = 0, 1, 2, \ldots, N$    (11.105b)

$H_i(U_0, Z_i) \le 0, \quad i = 0, 1, 2, \ldots, N,$    (11.105c)

where

$i = 0$ = Base case
$i > 0$ = ith postcontingency configuration
$N$ = Total number of contingencies considered (i.e., those selected by a security assessment procedure)
$U_i \in \mathbb{R}^m$ = Vector of control variables for configuration i
$Z_i \in \mathbb{R}^n$ = Vector of state variables for configuration i
$X_i \in \mathbb{R}^{m+n} = [U_i\ Z_i]^T$ = Decision vector for the ith configuration
$f: \mathbb{R}^{m+n} \to \mathbb{R}^1$ = Base-case objective function representing operating costs
$G_i$ = Vector function representing the load flow constraints for the ith configuration
$H_i: \mathbb{R}^{m+n} \to \mathbb{R}^b$ = Vector function representing operating constraints for the ith configuration.
The formulation shown in equations 11.105a through 11.105c is very conservative in that it allows no room for postcontingency corrective actions. It places much more emphasis on maximizing security than on minimizing operating cost. In today's competitive environment, such a formulation is not easily justifiable, given that there is a small but nonzero correction time (about 15 to 30 minutes) available for implementing postcontingency changes to the control variables. So, it is preferable to use a corrective control formulation as follows.

Minimize over $X_0, \ldots, X_N$: $f(X_0)$    (11.106a)

Subject to: $G_i(X_i) = 0, \quad i = 0, 1, 2, \ldots, N$    (11.106b)

$H_i(X_i) \le 0, \quad i = 0, 1, 2, \ldots, N$    (11.106c)

$\phi_i(U_i - U_0) \le \Theta_i, \quad i = 0, 1, 2, \ldots, N,$    (11.106d)

where

$\phi_i(\cdot)$ = Distance metric (say, Euclidean norm)
$\Theta_i$ = Vector of upper bounds reflecting ramp-rate limits.

The last set of constraints, equation 11.106d, called the coupling constraints, reflects the fact that the rate of change in the control variables of the base case (like the real power output of generators) is constrained by upper bounds (which are, typically, system specific). Note that without the coupling constraints, the constraints 11.106b and 11.106c are separable into $N + 1$ disjoint configuration-specific sets, which indicates the decomposability of the above formulation into $N + 1$ subproblems. The decomposition strategies are based on the corrective control scheme of equations 11.106a to 11.106c, and differ mainly in the manner in which they handle the coupling constraints of equation 11.106d that impede independent solutions of the subproblems. Decomposition strategies are indispensable in handling security and environmental constraints.
VIII. ILLUSTRATIVE EXAMPLES

Illustrative Example 1

A loss minimization problem for the given 3-bus power system is shown in Figure 11.9. The generator voltage magnitudes and cost functions are specified, and the loss formulation is given by

$$P_L = 0.06 - 0.30P_{G1} + 0.004P_{G1}^2 + 0.006P_{G2}^2.$$

Step 1. Express the loss minimization problem.

Objective: Minimize $P_L = 0.06 - 0.30P_{G1} + 0.004P_{G1}^2 + 0.006P_{G2}^2$

Subject to:

$P_{G1} + P_{G2} + 0.06 - 0.30P_{G1} + 0.004P_{G1}^2 + 0.006P_{G2}^2 = 6.0$ pu
$0 \le P_{G1} \le 4.0$ pu
$0 \le P_{G2} \le 3.0$ pu.
FIGURE 11.9 Single line diagram for Illustrative Example 1 (3-bus system; load at bus 3: $P_{D3} = 4.0 + j2.5$).
general
vector X = [X,, x21T = LPG1
9
PG21.
Then, the problem is restated as:
+
+
Minimize f ( x ) = 0.06 - 0.30~1 0.004~: 0.006~; Subject to: XI
+ x2 + 0.06 - 0.30X1 + 0.004~:+ 0.006~;= 6.0
0 5 XI 5 4.0 0 5 ~2 5 3.0.
Step 3. Solve this mathematical problem using quadraticinterior point method. Case 1
VAr planning problem. Assume that bus 2 is a PQ bus for the given 3-bus network shown in Figure 11.9. Let there exist a low voltage in the power system such that $|V_2| = 0.90$ pu and $|V_3| = 0.88$ pu. Assume shunt compensators are to be installed at buses 2 and 3 such that the voltage at each bus is raised to 0.95 pu. It is given that the sensitivities $\Delta V_2$ and $\Delta V_3$ with respect to $\Delta q_2$ and $\Delta q_3$ are

$\Delta V_2 = 0.02\Delta q_2 + 0.010\Delta q_3$
$\Delta V_3 = 0.07\Delta q_2 + 0.045\Delta q_3.$

Step 1. Express the VAr planning problem.

Minimize $Q = (\Delta q_2 + \Delta q_3)$

Subject to:

$\Delta V_2 = 0.02\Delta q_2 + 0.010\Delta q_3 \ge 0.95 - 0.90$
$\Delta V_3 = 0.07\Delta q_2 + 0.045\Delta q_3 \ge 0.95 - 0.88.$

Step 2. Convert the VAr planning problem into a general mathematical expression. Let vector $X = [x_1, x_2]^T = [\Delta q_2, \Delta q_3]^T$. Then the problem is restated as

Minimize $f(x) = x_1 + x_2$

Subject to:

$0.02x_1 + 0.010x_2 \ge 0.05$
$0.07x_1 + 0.045x_2 \ge 0.07$
$x_1 \ge 0$ and $x_2 \ge 0.$

Step 3. Solve this mathematical problem using a linear programming method.

Case 2
Voltage optimization using the quadratic interior point method. The objective function to be minimized is:

$$F(V) = \sum_{i=1}^{N_b} \left(V_i - V_i^{\text{nom}}\right)^2$$

Subject to:

$P_i - P_{gi} + P_{di} = 0 \quad \forall\, i \in \{1, N_b\}$
$Q_i - Q_{gi} + Q_{di} = 0 \quad \forall\, i \in \{1, N_b\},$

where $P_i$ and $Q_i$ follow from the standard power flow equations expressed in rectangular form.

Thus, using the 3-bus test system, the optimization problem becomes:

Minimize $F(V) = (V_1 - V_1^{\text{nom}})^2 + (V_2 - V_2^{\text{nom}})^2 + (V_3 - V_3^{\text{nom}})^2$

Subject to:

$P_1 = P_{g1}$, $Q_1 = Q_{g1}$
$P_2 = P_{g2} - 2.0$, $Q_2 = Q_{g2} - 1.0$
$P_3 = 0 - 4.0$, $Q_3 = 0 - 2.5.$

Note that $P_{g1}$ and $P_{g2}$ are obtained from the minimal cost calculation, and $Q_{g1}$ and $Q_{g2}$ are specified from the data. Therefore, $Q_{g1} = 1.2$ pu and $Q_{g2} = 2.7$ pu.
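Returning to Case 1, the restated VAr planning problem is a two-variable linear program, so it can be checked by enumerating the vertices of the feasible region. The sketch below is one simple way to do this and is not the method used in the text; it assumes the optimum is bounded and attained at a vertex (true here, since the costs are positive and the variables are nonnegative), and the function name `solve_lp_2d` is hypothetical.

```python
from itertools import combinations


def solve_lp_2d(cost, constraints, tol=1e-9):
    """Minimize cost[0]*x1 + cost[1]*x2 subject to p*x1 + q*x2 >= r.

    Each constraint is a tuple (p, q, r). Every pair of constraint lines is
    intersected, infeasible points are discarded, and the cheapest feasible
    vertex is returned as (value, x1, x2), or None if no vertex is feasible.
    """
    best = None
    for (a1, a2, b1), (c1, c2, b2) in combinations(constraints, 2):
        det = a1 * c2 - a2 * c1
        if abs(det) < tol:
            continue  # parallel lines, no unique intersection
        x1 = (b1 * c2 - a2 * b2) / det
        x2 = (a1 * b2 - b1 * c1) / det
        if all(p * x1 + q * x2 >= r - tol for p, q, r in constraints):
            value = cost[0] * x1 + cost[1] * x2
            if best is None or value < best[0]:
                best = (value, x1, x2)
    return best


# Case 1 data with x1 = delta q2 and x2 = delta q3.
var_constraints = [
    (0.02, 0.010, 0.05),   # voltage rise needed at bus 2
    (0.07, 0.045, 0.07),   # voltage rise needed at bus 3
    (1.0, 0.0, 0.0),       # x1 >= 0
    (0.0, 1.0, 0.0),       # x2 >= 0
]
var_solution = solve_lp_2d((1.0, 1.0), var_constraints)
```

For these data the bus 2 constraint is the binding one, and the enumeration returns $\Delta q_2 = 2.5$, $\Delta q_3 = 0$ with an objective value of 2.5, in the units implied by the sensitivity coefficients.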
Illustrative Example 2

A power system is shown in Figure 11.10. All three transmission lines are assumed identical and can be electrically described by a $\pi$ representation. System data are shown in Table 11.2. The generator cost functions are:

$F_1 = 0.015P_1^2 + 2.0P_1$
$F_2 = 0.01P_2^2 + 3P_2.$

Find the optimum generation schedule (real power).

FIGURE 11.10 Power system diagram for Illustrative Example 2 (transmission line data: $R = 35$ ohms, $\omega L = 150$ ohms).

TABLE 11.2

Bus   Real P_G (MW)   Reactive Q_G (MVAr)   Load P_D (MW)   Load Q_D (MVAr)   Voltage (kV)
1     Optimal         Unspecified           140             45                220
2     Optimal         Unspecified           50              25                220
3     0               Unspecified           140             50                220
The total system load is

$$P_D = P_{D1} + P_{D2} + P_{D3} = 140 + 50 + 140 \text{ MW} = 330 \text{ MW}.$$

Calculating the optimal generation schedule and forming the Lagrangian, we obtain

$$L = 0.015P_1^2 + 2.0P_1 + 0.01P_2^2 + 3P_2 + \lambda(330 - P_1 - P_2).$$

(The transmission losses are neglected.) By applying the optimality condition,

$$\frac{\partial L}{\partial P_1} = 0.03P_1 + 2 - \lambda = 0$$
$$\frac{\partial L}{\partial P_2} = 0.02P_2 + 3 - \lambda = 0$$
$$\frac{\partial L}{\partial \lambda} = 330 - P_1 - P_2 = 0.$$

Hence, we obtain $\lambda = 0.03P_1 + 2 = 0.02P_2 + 3$, so that $0.03P_1 - 0.02P_2 - 1 = 0$. But $P_2 = 330 - P_1$, therefore

$0.03P_1 - 0.02(330 - P_1) - 1 = 0$
$0.03P_1 - 6.6 + 0.02P_1 - 1 = 0$
$0.05P_1 - 7.6 = 0$
$P_1 = 152$ MW.

Hence, $P_2 = 330 - 152 = 178$ MW.
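The hand calculation above can be reproduced with a few lines of code. The sketch below solves the same optimality conditions in closed form for quadratic costs $F_i = a_i P_i^2 + b_i P_i$, neglecting losses and generation limits as in the example; the function name `equal_lambda_dispatch` is a hypothetical label.

```python
def equal_lambda_dispatch(a, b, demand):
    """Lossless economic dispatch without limits for F_i = a_i*P_i^2 + b_i*P_i.

    The optimality conditions 2*a_i*P_i + b_i = lambda and sum(P_i) = demand
    give lambda = (demand + sum(b_i / (2*a_i))) / sum(1 / (2*a_i)).
    """
    inv = [1.0 / (2.0 * ai) for ai in a]
    lam = (demand + sum(bi * w for bi, w in zip(b, inv))) / sum(inv)
    p = [(lam - bi) * w for bi, w in zip(b, inv)]
    return lam, p


# Illustrative Example 2: F1 = 0.015 P1^2 + 2.0 P1, F2 = 0.01 P2^2 + 3 P2, load 330 MW.
lam_ex2, p_ex2 = equal_lambda_dispatch([0.015, 0.01], [2.0, 3.0], 330.0)
```

The routine returns $P_1 = 152$ MW and $P_2 = 178$ MW, matching the result above, with a system incremental cost of $\lambda = 6.56$ in the cost units per MWh implied by the coefficients.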
Illustrative Example 3

A 5-bus system is shown in Table 11.3. With P in MW, the cost functions in dollars per hour are as follows.

$F_1 = 0.0060P_1^2 + 2.0P_1 + 140$
$F_2 = 0.0075P_2^2 + 1.5P_2 + 120$
$F_3 = 0.0070P_3^2 + 1.8P_3 + 80.$

Assuming that the voltage limits at all buses vary between 0.95 pu and 1.05 pu, and all generators are rated at 200 MW (see Table 11.4), then:

1. Use the OPF program to obtain the absolute minimum cost of this system and the real and reactive generation schedule.
2. Use the OPF program to obtain the loss minimum of this system, the reactive power of generation, and the optimal voltage profile.

TABLE 11.3 Five-bus system impedance and line charging data.

From bus i   To bus j   Line impedance (pu)   1/2 line charging susceptance (pu)   Line limit (MW)
1            2          0.02 + j0.06          j0.030                               30
1            3          0.08 + j0.24          j0.025                               40
2            3          0.06 + j0.18          j0.020                               50
2            4          0.06 + j0.18          j0.020                               80
2            5          0.04 + j0.12          j0.015                               180
3            4          0.01 + j0.03          j0.010                               120
4            5          0.08 + j0.24          j0.025                               40

TABLE 11.4 Initial generation schedule.

Bus i   Voltage magnitude (pu)   Voltage angle (degrees)   P_gi (MW)   Q_gi (MVAr)   P_load (MW)   Q_load (MVAr)
1       1.060                    0.0                       98.4        23.2          0             0
2       1.056                    -2.27                     40.0        30.0          20            10
3       1.044                    -3.69                     30.0        10.0          45            15
4       1.041                    -4.16                     0.0         0.0           40            5
5       1.030                    -5.35                     0.0         0.0           60            10

Solution

From the OPF program, we obtain:

1. Absolute minimum cost = 2.7403. $P_{g1} = 97.48$ MW, $P_{g2} = 40.00$ MW, and $P_{g3} = 30.00$ MW; $Q_{g1} = -17.86$ MVAr, $Q_{g2} = -0.260$ MVAr, and $Q_{g3} = 33.94$ MVAr.
2. Loss minimum = 0.024763, $|V_1| = 1.04535$ pu, $|V_2| = 1.02052$ pu, $Q_{g1} = -18.87$ MVAr, and $Q_{g2} = 1.38$ MVAr.
IX. CONCLUSIONS

In this chapter, we discussed general linear and nonlinear programming approaches to the OPF problem. We also presented an extended formulation of the problem to accommodate constraints pertaining to system security aspects. We discussed decomposition strategies that can be used to solve the extended OPF problem. The OPF problem is in general nonconvex. This implies that multiple minima may exist that can differ substantially. Very little work has been done toward exploring this particular aspect of the problem. Furthermore, we have only considered a smooth formulation with continuous controls. However, many effective control actions are in fact discrete. Examples include capacitor switching (for voltage violations) and line switching (for line overload violations). Also, the generator cost curves are in reality fairly discontinuous although they are often modeled as smooth polynomials. Handling these discontinuities and nonconvexities is a challenge for existing OPF methods.

X. PROBLEM SET
Problem 11.1

Two generators supply the total load of 800 MW; the generator cost functions and limits are given as follows.

$f_1(P_{G1}) = 850 + 50P_{G1} + 0.01P_{G1}^2$, $\quad 50 \le P_{G1} \le 300$ MW
$f_2(P_{G2}) = 2450 + 48P_{G2} + 0.003P_{G2}^2$, $\quad 50 \le P_{G2} \le 650$ MW.

Find the optimum schedule using nonlinear programming:

1. neglecting generation limits, and
2. considering generation limits.

Problem 11.2

A power system with two plants has the transmission loss equation:
+
pL = 0.3 x IO-~P; 0.5 x ~ o - ~ P ; . The fuel-cost functions are
$f_1 = 8.5P_1 + 0.00045P_1^2$
$f_2 = 8.2P_2 + 0.0012P_2^2$

and the system load is 600 MW. Use the Newton-Raphson method to find the optimal generation schedule.

Problem 11.3

A power system with two plants has a power demand of 1000 MW. The loss equation is given by
+
pL= 4.5 X 1O - ~ P 2.0 ~ X IO-~P~.
The fuel cost functions are
$f_1 = 0.00214P_1 + 7.74$, $\quad 100 \le P_1 \le 600$ MW
$f_2 = 0.00144P_2 + 7.72$, $\quad 100 \le P_2 \le 600$ MW.

Use linear programming to solve the optimal schedule:

1. neglecting generation limits, and
2. considering generation limits.

Problem 11.4

A subtransmission system has three buses which undergo low voltage; that is, $V_1 = 0.91$, $V_2 = 0.89$, and $V_3 = 0.90$. It is necessary to install shunt capacitors at buses 2 and 8, so that the voltages of these three buses can be raised to 0.95 pu. The sensitivities $\Delta V_1$, $\Delta V_2$, and $\Delta V_3$, with respect to $\Delta q_2$ and $\Delta q_8$, are given as
Find the optimal VAr planning for this subsystem.

Problem 11.5

Consider the following quadratic equation for the voltage deviation optimization problem.

Minimize $f(V) = (V_1 - V_1^{\text{nom}})^2 + (V_2 - V_2^{\text{nom}})^2$

Subject to:

If $V_1^{\text{nom}} = V_2^{\text{nom}} = 1.0$ pu, solve the QP problem.

Problem 11.6

A system has three plants. The system loss equation is given by

$$P_L = B_{11}P_1^2 + B_{22}P_2^2 + B_{33}P_3^2$$
(Bll, B22, B33)T = (1.6 x
1.2 x
2.2 x 10-4)T.
The system load is 300 MW. Loss minimization is selected as the objective function. Solve this optimization problem using any nonlinear programming technique.
FIGURE 11.11 Figure for Problem 11.7.

Problem 11.7

With P in MW, the cost equations, in dollars per hour, of a large power system are as seen in Figure 11.11. The five generator data are as follows.

$F_1 = 0.0015P_1^2 + 1.80P_1 + 40$
$F_2 = 0.0030P_2^2 + 1.70P_2 + 60$
$F_3 = 0.0012P_3^2 + 2.10P_3 + 100$
$F_4 = 0.0080P_4^2 + 2.00P_4 + 25$
$F_5 = 0.0010P_5^2 + 1.80P_5 + 120.$

The total system real power load is 730 MW. Obtain the absolute minimum cost of this system using the nonlinear programming method.
Chapter 12
Unit Commitment

I. INTRODUCTION

Unit commitment is an operation scheduling function, which is sometimes called predispatch. In the overall hierarchy of generation resources management, the unit commitment function fits between economic dispatch and maintenance and production scheduling. In terms of time scales involved, unit commitment scheduling covers the scope of hourly power system operation decisions with a one-day to one-week horizon.

Unit commitment schedules the on and off times of the generating units, and calculates the minimum cost hourly generation schedule while ensuring that start-up and shut-down rates, and minimum up- and down-times are considered. The function sometimes includes deciding the practicality of interregional power exchanges, and meeting daily or weekly quotas for consumption of fixed-batch energies, such as nuclear, restricted natural gas contracts, and other fuels that may be in short supply.

The unit commitment decisions are coupled or iteratively solved in conjunction with coordinating the use of hydro, including pumped storage capabilities, and ensuring system reliability using probabilistic measures. The function may also include labor constraints due to crew policy and costs, that is, the normal times that a full operating crew will be available without committing overtime costs. A foremost consideration is to adequately adopt environmental controls, such as fuel switching.

Systems with hydro storage capability, either with pumped hydro stations or with reservoirs on rivers, usually require one-week horizon times. On the other hand, a system with no "memory" devices and few dynamic components can use much shorter horizon times.
Most unit commitment programs operate discretely in time, at one-hour intervals. Systems with short horizon times can successfully deal with time increments as small as a few minutes. There is sometimes no clear distinction between the minute-by-minute dispatch techniques and some of the unit commitment programs with small time increments.

Unit commitment has grown in importance recently, not only to promote system economy but also for the following reasons.

1. Start-up, shut-down, and dynamic considerations in restarting modern generating facilities are much more complex and costly than they were for smaller older units.
2. Growth in system size to the point where even small percentage gains have become economically very important.
3. The increase in variation between the peak and off-peak power demands.
4. System planning requires automated computerized schedulers to simulate the effect of unit selection methods on the choice of new generation.
5. The scheduling problem has grown out of the effective reach of the "earlier" techniques because of the large variety of efficiencies and types of power sources.

The generation resource mix includes fossil-fueled units, peaking units such as combustion turbines, stored and run-of-river hydro, pumped storage hydro, nuclear units, purchases and sales over tie-lines, and partial entitlements to units.

The application of computer-based unit commitment programs in electric utilities has been slow due to the following reasons.

1. Unit commitment programs are not readily transferred between systems. The problem is so large and complex that only the most important features can be included, and these vary a great deal among systems, thus requiring customized applications.
2. There are political problems, constraints, and peculiarities of systems that are not easily amenable to mathematical solutions and may be very hard to model in the first place.
3. The operating situation changes so quickly and there is so much objective and subjective information about the system that the input requirements of sophisticated computerized schedulers are discouraging.
4. As in other computer applications areas, developing fully workable systems has been difficult, as has been the building of operators' confidence.

The unit commitment schedule takes many factors into consideration including:

Unit operating constraints and costs;
Generation and reserve constraints;
Plant start-up constraints; and
Network constraints.

II. FORMULATION OF UNIT COMMITMENT

A. Reserve Constraints
There are various classifications for reserve, and these include units on spinning reserve and units on cold reserve under the conditions of banked boiler or cold start. The first constraint that must be met is that the net generation must be greater than or equal to the sum of total system demand and required system reserve. That is,

$$\sum_{i=1}^{N} P_{gi}(t) \ge (\text{Net Demand} + \text{Reserve}). \qquad (12.1)$$

In the case where units should maintain a given amount of reserve, the upper bounds must be modified accordingly. Therefore, we have:

$$\bar{P}_{gi}^{\max} = P_{gi}^{\max} - P_{gi}^{\text{reserve}}, \qquad (12.2)$$

$$\text{Demand} + \text{Losses} \le \sum_{i=1}^{N} P_{gi} - \sum_{i=1}^{N} P_{gi}^{\text{reserve}}. \qquad (12.3)$$

(12.4)

where

$C_{\text{cold}}$ = Cost to start an offline boiler
$\alpha$ = Unit's thermal time constant
$t$ = Time in seconds
$C_L$ = Labor cost to start up the units
$C_0$ = Cost to start up a cold boiler,

and

(12.5)

where

$C_B$ = Cost to start up a banked boiler
$t$ = Time in seconds.
B. Modeling in Unit Commitment

The nomenclature that is used in this chapter for the unit commitment problem is summarized below. Additional terminologies are defined when necessary.

$F$ = Total operation cost on the power system
$E_i(t)$ = Energy output of the ith unit at hour t
$F_i(E_i(t))$ = Fuel cost of the ith unit at hour t when the generated power is equivalent to $E_i(t)$
$N$ = Total number of units in the power system
$T$ = Total time under which unit commitment is performed
$P_{gi}(t)$ = Power output of the ith unit at hour t
$\bar{P}_{gi}(t)$ = Constrained generating capability of the ith unit at hour t
$P_{gi}^{\max}$ = Maximum power output of the ith unit
$P_{gi}^{\min}$ = Minimum power output of the ith unit
$S_i(t)$ = Start-up cost of the ith unit at hour t
$f_i(t)$ = Ramping cost of the ith unit at hour t
$P_D(t)$ = Net system power demand at hour t
$P_R(t)$ = Net system spinning reserve at hour t
$\lambda(t)$ = Lagrangian multiplier for the system power balance constraint at hour t
$\mu(t)$ = Lagrangian multiplier for the system reserve constraint at hour t
$u_i(t)$ = Commitment state of the ith unit at hour t

Now, the objective function of the unit commitment problem is the sum of the fuel costs of all the units over time. This objective can be represented mathematically as equation 12.6.

The constraint models for the unit commitment optimization problem are as follows.

1. System energy balance

$$\sum_{i=1}^{N} \tfrac{1}{2}\left[u_i(t)P_{gi}(t) + u_i(t-1)P_{gi}(t-1)\right] = P_D(t) \quad \text{for } t \in \{1, T\}. \qquad (12.7)$$

2. System spinning reserve requirements

(12.8)

3. Unit generation limits

$$P_{gi}^{\min} \le P_{gi}(t) \le \bar{P}_{gi}(t) \quad \text{for } t \in \{1, T\} \text{ and } i \in \{1, N\}. \qquad (12.9)$$

4. Energy and power exchange

$$E_i(t) = \tfrac{1}{2}\left[P_{gi}(t) + P_{gi}(t-1)\right] \quad \text{for } t \in \{1, T\}. \qquad (12.10)$$

C. Lagrangian Function for Unit Commitment
This is expressed as the Lagrangian $L\{P_{gi}(t), u_i(t), \lambda(t), \mu(t)\}$. Therefore, the objective function in the optimization problem can be stated as

Minimize $L\{P_{gi}(t), u_i(t), \lambda(t), \mu(t)\}$,    (12.12)

where

$\tilde{\lambda}(0) = 0.5\lambda(1)$    (12.13)
$\tilde{\lambda}(1) = 0.5(\lambda(1) + \lambda(2))$    (12.14)
$\tilde{\lambda}(T-1) = 0.5(\lambda(T-1) + \lambda(T))$    (12.15)
$\tilde{\lambda}(T) = 0.5\lambda(T).$    (12.16)

Here we note that $\tilde{\lambda}(0)P_{gi}(0)$ for $i \in \{1, N\}$ are given by the initial conditions and thus can be ignored in searching for the optimal unit commitment scheme.

Minimize $L\{P_{gi}(t), u_i(t), \lambda(t), \mu(t)\}$    (12.17)

III. OPTIMIZATION METHODS

A. Priority List Unit Commitment Schemes
Economic scheduling techniques used until 1958 required that each unit operate at the same incremental operating cost. The unit commitment strategy was to drop a unit entirely from the system whenever the incremental cost calculation left it operating below 10 to 25% of its rated maximum capacity. The reasoning is that below some point the fixed operating costs make the unit too expensive to operate. The other early techniques derive mainly from a method introduced by Baldwin et al. [2] in 1959, in which the start-up costs and the minimum down-time requirements were considered for the first time. These methods are commonly called priority list or merit order procedures, and they are now in common use.

The unit commitment techniques developed in the 1950s extended the previous incremental cost techniques to include minimum down-time and start-up costs. They built up strict priority-of-shut-down rules for different reasons, that is, for different daily load shapes. In a number of simulations with the load decreasing, the least efficient unit among those that were on was tested for possible shut-down as follows.

1. Was it possible to restart this unit by the time the load reached its present level again?
2. Did the restart cost exceed the potential operating savings?

These priority lists were developed well ahead of time, and in actual operation the units were dropped from the system if:

1. they were next in line on the priority list, and
2. there was not less than a predetermined critical interval before system loads would rise again to the same level (not necessarily the minimum down-time).
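The two shut-down tests above can be sketched as a small predicate. This is only an illustration under stated assumptions: the function name, the argument names, and the sample numbers are hypothetical and do not come from the original programs.

```python
def should_shut_down(hours_until_load_returns, restart_lead_time_h,
                     restart_cost, operating_savings):
    """Apply the two historical shut-down tests to the least efficient online unit."""
    # Test 1: can the unit be restarted by the time the load is back at its present level?
    restart_possible = restart_lead_time_h <= hours_until_load_returns
    # Test 2: does the restart cost exceed the potential operating savings?
    restart_too_costly = restart_cost > operating_savings
    return restart_possible and not restart_too_costly


# Hypothetical figures: load returns in 6 h, the unit needs 4 h to come back.
print(should_shut_down(6, 4, restart_cost=500.0, operating_savings=800.0))
print(should_shut_down(3, 4, restart_cost=500.0, operating_savings=800.0))
```

The unit is dropped only when both tests pass, which mirrors the order in which the questions are asked in the text.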
A number of refinements have been made to this original priority list method. One improvement was the addition of checks of the system spinning reserve at each trial scheduling [13]. Pseudoincremental costs were also introduced and iteratively adjusted to encourage the consumption of the appropriate quotas of fixed-batch energy supplies. In the priority list unit commitment programs developed later, including those currently in use, the actual priority lists are, in effect, developed especially for the situation at hand in an iterative procedure that is generally deeply embedded in complex heuristic algorithms. Additional features incorporated in these different heuristic techniques are energy interchange modeling, different start-up and shut-down orderings, unit response rates, minimum up- and down-times, transmission penalty factors, and local area "must run" considerations, as well as others. These features are not particularly difficult to model, but the demanding task is to take all the custom-selected features for a particular system and construct a manageable scheduling program.

Although the concept of the priority list was introduced only as a cursory first attempt at scheduling, it has remained one of the primary methods for using an approximation to reduce the dimensionality and complexity of the most sophisticated scheduling mechanisms. This method appears many times in dynamic techniques and in actual industrial applications.

1. Priority Criteria
The core of the priority list-based unit commitment scheduling program is a mechanism for ordering the units of the system according to a certain economic criterion so that the least expensive units are placed at the top of the list, followed by the more expensive ones. A number of variants have been used in the literature.
Type I. Fuel Cost-Based Lists. Static priority lists are obtained based on the average fuel cost of each unit operating at a certain fixed fraction of the maximum output. Thus the criterion is:

(12.18)

where

$M_i$ = Priority index for the ith unit based on its average fuel cost
$H_i$ = Heat rate curve of the ith unit in MBTU/h
$F_i$ = Fuel cost of the ith unit in $/MBTU
$P_i^{max}$ = Maximum power of the ith unit in MW
$x$ = Fixed fraction of the maximum output of the ith unit.

In many instances, the full load values are used. The full load average cost is simply the net heat rate at full load times the fuel cost per BTU. An alternative static ranking procedure, called the equal lambda list, is based on the assumption that units are operating at the same incremental cost and finding the cost of fuel per unit output of the unit.

(12.19)
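A static fuel cost-based list can be sketched as follows. The index used here, heat input at the operating point times fuel cost divided by the output, is an assumption chosen to agree with the symbol definitions above and with the full load description; it is not a transcription of equation 12.18. The three units and their quadratic heat rate curves are hypothetical.

```python
def average_fuel_cost(heat_rate, fuel_cost, p_max, fraction=1.0):
    """Average fuel cost in $/MWh at a fixed fraction of maximum output."""
    p = fraction * p_max                     # operating point in MW
    return heat_rate(p) * fuel_cost / p      # (MBTU/h * $/MBTU) / MW


# Hypothetical units: heat rate curve H(P) in MBTU/h, fuel cost in $/MBTU, P_max in MW.
UNITS = {
    "A": {"heat_rate": lambda p: 500 + 7.0 * p + 0.002 * p * p, "fuel_cost": 1.1, "p_max": 600},
    "B": {"heat_rate": lambda p: 300 + 8.0 * p + 0.003 * p * p, "fuel_cost": 1.2, "p_max": 400},
    "C": {"heat_rate": lambda p: 80 + 8.0 * p + 0.005 * p * p, "fuel_cost": 1.0, "p_max": 200},
}


def static_priority_list(units, fraction=1.0):
    """Rank units from least to most expensive average fuel cost."""
    index = {name: average_fuel_cost(u["heat_rate"], u["fuel_cost"], u["p_max"], fraction)
             for name, u in units.items()}
    return sorted(index, key=index.get), index


order, index = static_priority_list(UNITS)
print(order)
print({name: round(value, 3) for name, value in index.items()})
```

Passing a `fraction` below 1.0 ranks the units at a part-load operating point instead of full load.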
Type II. Incremental Fuel Cost-Based List. In Pang and Chen [19] the criterion for placing individual units at various priority levels is the average incremental fuel cost.

(12.20)

This is generally equivalent to using a one-constant step unit incremental fuel cost, or unit full load fuel cost.

Type III. Incremental Fuel Cost with Start-up Cost-Based List. In this case, the priority order criterion is the incremental fuel cost at the unit's point of maximum efficiency plus the ratio of the unit start-up cost to the expected energy produced by the unit operating at maximum efficiency output for the minimum expected run-time of the unit before cycling (in hours).
(12.21)

where

$H_i$ = Heat rate curve of the ith unit (MBTU/h)
$F_i$ = Fuel cost of the ith unit ($/MBTU)
$S_i$ = Start-up fuel cost of the ith unit
$T_i^{min}$ = Minimum expected run-time of the ith unit before cycling (h).

The two terms of equation 12.21 are, respectively, the incremental fuel cost of the ith unit at maximum efficiency and the start-up cost component of the ith unit.
The output power is computed at the maximum efficiency, $\eta_{max}$.

Type IV. Dynamic Priority Lists. In this approach, the ordering is based on economic dispatch results rather than the raw cost data. To obtain a dynamic priority list, one begins with the load level equal to a preselected percentage of the total capacity of the units to be ranked and obtains an economic dispatch solution including losses. Based on the optimum generation level for each unit, a measure of the unit's fuel cost is obtained as

Index of Fuel Cost = (12.22)

where $L_i$ is the loss penalty factor of the ith unit. The highest cost unit is determined along with the optimal cost. The highest cost unit is removed and placed in the ranking as the most expensive. The system load is reduced by the capacity of the removed unit and a second dispatch is calculated; the highest cost unit for this load is removed and included in the list as the second most expensive. The process is continued until the system load is at a level approximating the base load. The remaining units are ranked using a static priority list.

B. A Simple Merit Order Scheme

Most priority list-based unit commitment schemes embody the following logic.
1. During each hour where the load is decreasing, determine whether shutting down the next unit on the priority list will leave sufficient generation to meet the load plus spinning reserve requirements. If not, the unit commitment is not changed for the hour considered.
2. If the answer to Part 1 is yes, determine the number of hours h before the unit will be needed again when the load increases to its present level. Determine whether h is greater than the minimum shut-down time for the unit. If not, the unit commitment is not changed for the hour considered.
3. If the answer to Part 2 is yes, calculate the following.
   3.1 Sum of hourly production costs for the next h hours with the candidate unit up.
   3.2 Sum of hourly production costs for the next h hours with the candidate unit down plus the start-up cost for either cooling the unit or banking it.
4. If the costs in Part 3.2 are less than those of Part 3.1, then the unit is shut down; otherwise, the unit is kept on.

IV. ILLUSTRATIVE EXAMPLE

A. Lagrangian Relaxation Approach to Unit Commitment
The Lagrangian relaxation methodology has been demonstrated to handle systems consisting of hundreds of generating units effectively. The approach is claimed to be more efficient than other methods in solving large-scale problems, and can handle various constraints more easily. The Lagrangian relaxation approaches acknowledge that the unit commitment problem consists of several ingredients.

1. The cost function, which is the sum of terms each of which involves a single unit. The unit commitment problem seeks to minimize:

$$\sum_{t=1}^{T}\sum_{i=1}^{N} F_i(P_{g_i}(t), x_i(t), u_i(t)) \tag{12.23}$$

where the individual terms are given by:

$$F_i(P_{g_i}(t), x_i(t), u_i(t)) = C_i(P_{g_i}(t)) + u_i(t)\left[1 - u_i(t-1)\right]S_i(x_i(t)). \tag{12.24}$$
2. A set of coupling constraints (the generation and reserve requirements) involving all the units, one for each hour in the optimization period,

(12.25)

for all $t \in \{1, T\}$ and requirements n. The first requirement is the power balance constraint:

(12.26)

where D(t) is the generation requirement at time t. As a result in equation 12.8, with n = 1, we have:

(12.27)
(12.28)

The second requirement is the spinning reserve requirement written as

(12.29)

In equation 12.12, $P_R(t)$ is the MW spinning reserve requirement during hour t. Therefore in equation 12.8, for n = 2, we have:

(12.30)
(12.31)

3. A set of constraints involving a single unit

$$L_i(P_{g_i}, u_i, t) \le 0 \quad \text{for all } i \in \{1, N\}. \tag{12.32}$$
The constraints of equation 12.15 involve minimum up- and down-times, as well as unit loading constraints. In nonlinear programming language, the posed problem is referred to as the primal problem. The Lagrangian relaxation approach is based on Everett's work [7], which showed that an approximate solution to the primal problem can be obtained by joining the coupling constraints to the cost function using the Lagrange multipliers $\lambda_n$ to form the Lagrangian function.

(12.33)

The multipliers associated with the nth requirement for time t are denoted $\lambda_n(t)$. In expanded form the Lagrangian function is:

(12.34)

where

$\lambda_1$ = Lagrangian multiplier without spinning reserve constraints
$\lambda_2$ = Lagrangian multiplier with spinning reserve constraints.
The resulting relaxed problem is to minimize the Lagrangian function, subject to $\lambda_1(t) \ge 0$ and $\lambda_2(t) \ge 0$. In addition, $P_{g_i}(t)$, $u_i(t)$, and $x_i(t)$ should satisfy equation 12.15, which is given by

(12.35)

The dual problem is:

Maximize $L_{dual}(\lambda_1(t), \lambda_2(t))$. (12.36)

The formulation involves the maximization of a minimum, and the solution of the dual problem is an iterative process. For fixed $\lambda_1$ and $\lambda_2$, $L_{dual}(\lambda_1(t), \lambda_2(t))$ is determined by minimizing the right-hand side of equation 12.15. The global procedure is represented by the update of the multipliers, and its objective is to maximize $L_{dual}(\lambda_1(t), \lambda_2(t))$. Subsequent to finding the dual optimum, a search for a feasible suboptimal solution is conducted. The flowchart of the Lagrangian relaxation method is shown in Figure 12.1. Determining $L_{dual}(\lambda_1(t), \lambda_2(t))$ is much simpler than the solution of the primal problem for the following reasons.
FIGURE 12.1 Flowchart of Lagrangian relaxation algorithm for unit commitment.
1. The cost function can be written as a sum of terms each involving only one unit.
2. The coupling constraints between the units have already been relaxed.
3. Since each of the constraints involves one unit, the operation of each unit can be considered independently.
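The separability described in these three points can be illustrated with a deliberately reduced, single-hour sketch. The assumptions are loud ones: only the power balance multiplier is kept, start-up costs and minimum up- and down-times are ignored, and each unit has a hypothetical quadratic cost C(P) = a + bP + cP^2. Under those assumptions the dual function is the multiplier times the demand plus a sum of independent single-unit minima.

```python
def unit_subproblem(a, b, c, p_min, p_max, lam):
    """Minimise u * (a + b*P + c*P**2 - lam*P) over u in {0, 1} and P in [p_min, p_max]."""
    p = min(max((lam - b) / (2.0 * c), p_min), p_max)   # clipped stationary point
    value = a + b * p + c * p * p - lam * p
    if value < 0.0:
        return 1, p, value      # committing the unit lowers the Lagrangian
    return 0, 0.0, 0.0          # otherwise the unit stays off


def dual_value(units, lam, demand):
    """L_dual(lam) = lam * demand + sum of independent single-unit minima."""
    return lam * demand + sum(unit_subproblem(*u, lam)[2] for u in units)


UNITS = [(100.0, 10.0, 0.01, 50.0, 200.0)]   # hypothetical (a, b, c, p_min, p_max)
print(unit_subproblem(*UNITS[0], 8.0))       # low price: unit stays off
print(unit_subproblem(*UNITS[0], 15.0))      # high price: unit runs at its upper limit
print(dual_value(UNITS, 15.0, 150.0))
```

For this single unit, serving the 150 MW demand directly costs 1825, so the dual value at a multiplier of 15 is a lower bound on the primal cost, as duality theory requires.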
B. Single Unit Relaxed Problem

The minimization of the right-hand side of equation 12.13 can be separated into subproblems, each of which deals with one generating unit only. Based on equation 12.16, the single-unit relaxed problem is stated as

Minimize $L(P_{g_i}(t), u_i(t))$ (12.37)

Alternatively, based on equation 12.14, the problem is restated as

(12.38)

subject to the constraints written as

$$P_i^{min} \le P_{g_i}(t) \le P_i^{max} \quad \text{if } u_i(t) = 1 \tag{12.39}$$
$$P_{g_i}(t) = 0 \quad \text{if } u_i(t) = 0 \tag{12.40}$$

and the minimum up- and down-time constraints

$$u_i(t) = 1 \quad \text{if } 1 \le x_i(t) < x_i^{up} \tag{12.41}$$
$$u_i(t) = 0 \quad \text{if } -x_i^{down} < x_i(t) \le -1, \tag{12.42}$$

where

$x_i(t)$ = Cumulative up-time if $x_i(t) > 0$ and cumulative down-time if $x_i(t) < 0$
$x_i^{up}$, $x_i^{down}$ = Minimum up- and down-times of the ith unit, respectively.
Furthermore, $x_i(t)$ can be related to $u_i(t)$ by the difference equations in Table 12.1. This problem can be solved easily by dynamic programming or any other method. The state variables are just $x_i(t)$. The number of required up states is $x_i^{up}$ and the number of required down states is

$$\text{Max}(x_i^{down}, x_i^{cool}), \tag{12.43}$$

where

$x_i^{cool}$ = Time required for a unit to cool down completely, so that the start-up cost is independent of down-time for down-times greater than $x_i^{cool}$.

TABLE 12.1 Difference equations that relate $x_i(t)$ to $u_i(t)$.
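A minimal dynamic programming sketch of the single-unit relaxed problem is given below. It rests on several assumptions that are not taken from the source: the difference equations are assumed to be the usual ones (the counter grows by one while the unit stays up, drops by one while it stays down, and resets to +1 or -1 on a change of state), the start-up cost is a constant rather than a function of down-time, only the power balance multiplier is used, and the unit cost is a hypothetical quadratic. The counter is capped at the minimum up- and down-times so that the state space stays finite.

```python
from functools import lru_cache


def solve_single_unit(lams, a, b, c, p_min, p_max, start_cost, min_up, min_down, x0):
    """Return (relaxed cost, commitment plan) for one unit over len(lams) hours.

    x0 is the signed initial counter: hours up if positive, hours down if negative.
    """
    horizon = len(lams)

    def on_cost(lam):
        # Best dispatch of a committed unit: minimise C(P) - lam * P within its limits.
        p = min(max((lam - b) / (2.0 * c), p_min), p_max)
        return a + b * p + c * p * p - lam * p

    @lru_cache(maxsize=None)
    def best(t, x):
        if t == horizon:
            return 0.0, ()
        options = []
        if x > 0 or x <= -min_down:              # unit may run during hour t
            x_next = min(x + 1, min_up) if x > 0 else 1
            startup = start_cost if x < 0 else 0.0
            cost, plan = best(t + 1, x_next)
            options.append((on_cost(lams[t]) + startup + cost, (1,) + plan))
        if x < 0 or x >= min_up:                 # unit may be off during hour t
            x_next = max(x - 1, -min_down) if x < 0 else -1
            cost, plan = best(t + 1, x_next)
            options.append((cost, (0,) + plan))
        return min(options)

    return best(0, x0)


cost, plan = solve_single_unit([8.0, 15.0, 15.0, 8.0], 100.0, 10.0, 0.01, 50.0, 200.0,
                               start_cost=200.0, min_up=2, min_down=1, x0=-5)
print(cost, plan)
```

With these hypothetical prices the unit starts only for the two high-price hours, because the start-up cost is recovered there and the minimum up-time of two hours is satisfied before it shuts down again.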
The operating limits and the minimum running and shut-down time constraints are treated implicitly by this method. The dual problem is then decoupled into small subproblems, which are solved separately with the remaining constraints. Meanwhile, the dual function is maximized with respect to the Lagrangian multipliers, usually by a series of iterations based on the subgradient method.

C. Lagrangian Relaxation Procedure

Implementing the Lagrangian relaxation method involves the following key steps.

1. Find the multipliers $\lambda_n(t)$ to obtain the solution to the relaxed problem near the optimum.
2. Estimate how close the solution obtained is to the optimum.
3. Obtain the actual solution to the relaxed problem.

To start, Everett [7] shows that if the relaxed problem is solved with any set of multipliers $\lambda_n(t)$, the resulting value of the right-hand side of equation 12.8 is $R_n^*(t)$. That is,

(12.44)

where $P_{g_i}^*(t)$, $u_i^*(t)$ is the optimal solution. This implies that $P_{g_i}^*(t)$ and $u_i^*(t)$ yield the optimum solution of the original problem with $P_{R_n}(t)$ replaced by $R_n^*(t)$.

The optimization requirement is met if $\lambda_n(t)$ can be found such that the resulting $R_n^*(t)$ are equal to $P_{R_n}(t)$. Unfortunately, this cannot always be done. As a result there will be a difference between the cost obtained by solving the relaxed problem or dual and the optimum cost for the original problem. This difference is referred to as the duality gap, and can be explained graphically as shown in Figure 12.2. The lower curve is a plot of $L_{dual}(\lambda_1, \lambda_2)$, which has been defined by equation 12.15 as

(12.45)
The minimization is with respect to $P_{g_i}$ and $u_i$, and the plot corresponds to various values of the multipliers $\lambda_1$ and $\lambda_2$. The following points are identified in the graph. Point A is a known solution to the dual problem (obtained through the iterations). Point B is the unknown optimal solution to the dual problem.

FIGURE 12.2 Duality gap of a relaxed problem.
The difference $d_1$, between the value of $L_{dual}(\lambda_1, \lambda_2)$ at Point A and the dual optimum at Point B, is a defect that can be improved upon by optimizing the dual problem. The upper curve corresponds to the objective of the primal problem defined by equation 12.6 as

(12.46)

The points identified on the curve are as follows. Point C is the unknown optimal solution to the primal problem, and corresponds to the minimum cost for a feasible solution. Point D corresponds to the value of the primal cost corresponding to the dual solution of Point A. The difference $d_3$, between C and D, is a defect that can be improved by further optimizing the dual or primal problems. The difference $d_2$, between the unknown optimum value of the primal problem (Point C) and the unknown optimum value of the dual problem (Point B), is the duality gap.

Duality theory shows that for nonconvex problems there will typically be a duality gap. Since the commitment decision variables $x_i(t)$ are discrete, the unit commitment problem is nonconvex. The duality gap, however, has been shown [2] to go to zero as the problem size gets bigger. For large problems, the duality gap is less than 0.5 percent [2]. Duality theory also generates guidelines on how to update the multipliers $\lambda_n(t)$ so that the solution to the relaxed problem is near the optimum solution. Let $L_{dual}(\lambda_n(t))$ be the value of the Lagrangian at the solution to the relaxed problem; then good values of the multipliers $\lambda_n(t)$ can be obtained by maximizing $L_{dual}(\lambda_n(t))$ for all positive $\lambda_n(t)$.

A number of approaches have been used to maximize the dual function $L_{dual}(\lambda_n(t))$. The first, and most popular, involves using the subgradient method. This is a generalization of the gradient or steepest descent method for nondifferentiable functions. In general, the Lagrange function $L_{dual}(\lambda_n(t))$ is nondifferentiable. The subgradient of $L_{dual}(\lambda_n(t))$ with respect to one of the multipliers $\lambda_n(t)$ is given as

(12.47)

where $P_{g_i}^*(t)$ and $u_i^*(t)$ are the solution to the relaxed problem with multipliers $\lambda_n(t)$. That is, the derivative of the Lagrangian corresponding to a change in $\lambda_n(t)$ is equal to the difference between the requirement and the value of the left-hand side of the constraint evaluated at the solution of the relaxed problem. The subgradient method to update $\lambda_n(t)$ is:
(12.48)

where

$\lambda_n^k(t)$ = kth update of $\lambda_n(t)$
$t_k$ = Scalar step length.

A number of forms of $t_k$ could be used as long as

$$t_k \to 0 \text{ as } k \to \infty \quad \text{and} \quad \sum_{k=1}^{\infty} t_k \to \infty. \tag{12.49}$$

Many authors have used the form $t_k = 1/(c + dk)$, where c and d are constants. Different constants could be given for the different requirements as long as the conditions on $t_k$ are met. Fisher [3] recommends the following form.

(12.50)
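One subgradient step with the diminishing step length $t_k = 1/(c + dk)$ can be sketched as below. The sketch assumes that each multiplier moves by the step length times the difference between the requirement and the value achieved by the relaxed solution, and that it is projected back to zero when it would become negative, in the spirit of the Max(0, ...) update that appears later in this chapter. The default constants and the sample numbers are hypothetical.

```python
def step_length(k, c=1.0, d=1.0):
    """Diminishing step t_k = 1 / (c + d*k); it tends to zero while its sum diverges."""
    return 1.0 / (c + d * k)


def update_multipliers(lams, required, achieved, k, c=1.0, d=1.0):
    """One subgradient step: lam <- max(0, lam + t_k * (requirement - achieved))."""
    t_k = step_length(k, c, d)
    return [max(0.0, lam + t_k * (req - got))
            for lam, req, got in zip(lams, required, achieved)]


# Hypothetical hour: generation falls 10 MW short, reserve exceeds its requirement by 30 MW.
print(update_multipliers([10.0, 0.1], required=[100.0, 50.0], achieved=[90.0, 80.0], k=1))
```

The first multiplier rises because its requirement is violated, while the second is pushed down and clipped at zero because its requirement is oversatisfied.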
Lauer et al. [15] use a different approach to minimizing the dual functions that utilizes second derivative information. Different authors use variations of the above Lagrangian methods to ensure that the generation and reserve requirements are met and that the algorithms converge to near-optimal solutions. These variations include replacing the spinning and supplemental reserve requirements by constraints that just involve the units' upper limits, and using modifications to the subgradient formulae.

D. Searching for a Feasible Solution

The problem considered is very sensitive to changes of multipliers. Therefore it is important to start the search for a feasible solution at a point that is fairly close to the dual optimum. The decoupled subproblems of the dual problem interact through the two Lagrangian multipliers. These multipliers are interpreted as the prices per unit power generation and spinning reserve, respectively, that the system is willing to pay to preserve the power balance and fulfill the spinning reserve requirement during each hour. Increasing $\lambda_1(t)$ and $\lambda_2(t)$ may lead to the commitment of more generating units and an increase in the total generation and spinning reserve contribution during hour t. The reverse effect is obtained by decreasing the two multipliers.

The feasible search is based on the above relationship between the unit commitment and the Lagrangian multipliers. The values of $\lambda_1$ and $\lambda_2$ are adjusted repeatedly, based on the amount of violation of the relaxed constraints (the power balance and spinning reserve constraints). The subproblems are solved after each adjustment and the iterations continue until a feasible suboptimal solution is located.

The commitment schedule is very sensitive to the variation of the multipliers. For example, if a system contains several units whose cost characteristics are nearly identical, a minor modification of the Lagrangian multipliers during a particular hour may turn all of these units on or off, provided that the original values of the multipliers during that hour are close to the incremental costs of generation and spinning reserve contribution of these units. Thus the modification of the multipliers should be determined in an appropriate manner; otherwise, the number of committed units during some periods may be more than required. Note that the commitment of a generating unit depends on the values of the multipliers and the commitment states during preceding hours due to the minimum running and shut-down time constraints. This dependence complicates determining the appropriate values of the multipliers.

A simple algorithm is presented in [5] to find a feasible solution with an additional set of restrictions designed to limit unnecessary commitment of the generating units. An economic dispatch algorithm is then applied to this commitment schedule to find the exact power generation of each generating unit and improve the generating schedule.
The flowchart of the searching algorithm, which is implemented to find the feasible solution, is shown in Figure 12.3. Considering the MW spinning reserve constraints, the inequality of equation 12.12 does not provide any upper-bound restrictions. Common sense for an economic schedule requires that there should not be too much excess MW reserve. As a result, in the searching algorithms, the following constraints are included implicitly to test the validity of the commitment schedule.

$$\sum_{i=1}^{N} u_i(t)P_i^{max} \le P_D(t) + P_R(t) + P_E(t) \tag{12.51}$$

Selecting $P_E(t)$ is guided by the following considerations.
FIGURE 12.3 Flowchart to find the feasible solution.
1. Upper bounds limit the solution space to be close to the dual optimal point. This may, however, lead to missing the optimal solution. Furthermore, the value of $P_E(t)$ may affect the convergence rate. Thus an appropriate choice of $P_E(t)$ is necessary. Unfortunately, there is no rigorous basis for selecting $P_E(t)$. In the algorithm of [5], a heuristic procedure is introduced.
2. The criterion that the excess in MW reserve should be reduced to a minimum is not implemented in [5]. This criterion indicates that the shut-down of any committed unit will violate the reserve constraints. The weak points of this criterion are as follows.
   1. The minimum reserve margin does not always correspond to the best dispatch.
   2. The computation time for the search process will be increased significantly since the feasible region is tightly restricted.
   3. It is possible that none of the solutions can satisfy these tight bounds and the minimum running/shut-down time constraints simultaneously.

The procedure for assigning the values to $P_E(t)$ suggested in [5] is as follows.

Step 1. For every hour t, $P_E(t)$ is first set to the sum of the maximum powers of the two most inefficient committed units. This value is selected because the commitment states of these two units are usually subject to the modification of the multipliers during the searching process. It is not a constant and depends on the units that have been committed according to the schedule being examined. In the search process, this bound actually discards those schedules that allow more than two units to be shut down during hour t without violating the spinning reserve constraints.

Step 2. If a feasible solution cannot be found within a reasonable number of iterations, this may be due to the bounds during some hours being too tight, so that no schedule can satisfy the minimum running/shut-down time constraints and the bounds simultaneously. As a result, for those hours during which no combination of committed units has been found to satisfy equations 12.12 and 12.32, the values of $P_E(t)$ are increased successively until a feasible solution is found.

Step 3. The procedure proposed in [5] is claimed to usually give a satisfactory feasible solution before proceeding to the final refinement of the schedule. Even though it cannot guarantee the true optimal solution, the extent of suboptimality of the solution can be estimated. Since the dual optimal solution is a lower bound of the original commitment problem, the relative difference between the cost of the suboptimal schedule and this bound may determine the quality of the solution.
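Step 1 and the reserve window it feeds can be sketched as follows. The choice of full load average cost as the measure of inefficiency is an assumption made for illustration, as are the function names and the unit data; the window itself is the two-sided reserve test, with committed capacity at least the load plus reserve and at most the load plus reserve plus $P_E(t)$.

```python
def initial_excess_bound(committed, p_max, avg_cost):
    """P_E(t): sum of the maximum powers of the two most inefficient committed units."""
    worst_two = sorted(committed, key=lambda name: avg_cost[name], reverse=True)[:2]
    return sum(p_max[name] for name in worst_two)


def reserve_within_bounds(committed, p_max, demand, reserve, p_excess):
    """True if committed capacity covers load plus reserve without exceeding the window."""
    capacity = sum(p_max[name] for name in committed)
    return demand + reserve <= capacity <= demand + reserve + p_excess


# Hypothetical data for one hour.
P_MAX = {"A": 600.0, "B": 400.0, "C": 200.0}
AVG_COST = {"A": 9.9, "B": 11.9, "C": 9.4}       # assumed efficiency measure in $/MWh
committed = ["A", "B", "C"]

p_e = initial_excess_bound(committed, P_MAX, AVG_COST)
print(p_e)
print(reserve_within_bounds(committed, P_MAX, demand=900.0, reserve=100.0, p_excess=p_e))
```

Because the bound is recomputed from whichever units are committed in the schedule under examination, it changes from hour to hour, as Step 1 notes.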
V. UPDATING $\lambda_n(t)$ IN THE UNIT COMMITMENT PROBLEM

There exist many techniques for updating lambda in the search algorithms of the unit commitment problem, some of which are credited to the work of Merlin and Sandrin [17] and Tong and Shahidehpour [22]. We now turn our attention to two commonly used methods.

Case A. Updating $\lambda_n(t)$

In the updating process of the searching algorithm, values of $\partial\lambda_{1,k}(t)$ and $\partial\lambda_{2,k}(t)$ should be determined. These two values are set to zero if the power balance and reserve constraints are satisfied, respectively, during hour t. However, if either of these two constraints is violated during hour t, the following two methods are applied to determine the unknowns.

1. Linear Interpolation

Define

(12.52)

Then

(12.53)

Define

(12.54)

Then

(12.55)

2. Bisection Method

(12.56)

where

(12.57)

with $P_D(t) < G^{max}(t)$, and

(12.58)
(12.59)

with $G^{min}(t) < P_D(t)$.

(12.60)
The bounds are adjusted after the calculation of the total generation with the updated multipliers, using the following rules.

IF $G_{k+1}(t) > P_D(t)$, THEN $(\lambda_1^{max}(t), G^{max}(t))$ is replaced by $(\lambda_{1,k+1}(t), G_{k+1}(t))$; OTHERWISE $(\lambda_1^{min}(t), G^{min}(t))$ is replaced by $(\lambda_{1,k+1}(t), G_{k+1}(t))$.

The determination of $\partial\lambda_{2,k}(t)$ is based on a similar approach. These methods do not work satisfactorily if they are implemented independently. The first method usually branches back and forth around a feasible solution because the relationship between the generation and the multipliers is stepwise. The second method is difficult to apply since the generation during each hour is not merely determined by its corresponding multipliers; the generation during a particular hour is a function of all the multipliers, although the multipliers of that specific hour ($\partial\lambda_1(t)$ and $\partial\lambda_2(t)$) may present the dominant effect. In [5], these two methods are used together. The linear interpolation provides the first few guesses; then the bisection method is applied. Sometimes the linear interpolation may be recalled if it is found that the feasible solution at a particular hour does not fall within the bounds, based on the change of multipliers during other time intervals. Using this approach, a feasible solution will be obtained within a reasonable computation time.

Case B. Updating $\lambda_n(t)$

Two variations are considered.
1. If the reserve constraints are not met,

$$\sum_{i=1}^{N} u_i^*(t)P_{g_i}^*(t) = P^{(k)}(t) \ne P_D(t) \tag{12.61}$$

with

$$\lambda_n^{(k+1)}(t) = \text{Max}\left(0,\ \lambda_n^{(k)}(t) + t_k\left(P_D(t) - P^{(k)}(t)\right)\right), \tag{12.62}$$

where $\lambda_n^{(k)}(t)$ is the kth update of $\lambda_n(t)$ and $t_k$ is a scalar step length defined by $t_k = 1/(c + dk)$ with constants c and d.

2. If the reserve constraint is met, units designated as online have been loaded to their maximum capacity in ascending order of incremental cost (priority order). In this case the incremental cost of the last loaded unit is denoted $\beta_k(t)$, and is used in the following updating formula.

$$\lambda_{1,k+1}(t) = \lambda_{1,k}(t) + (1 - \alpha_m)\beta_k(t). \tag{12.63}$$

Here $\alpha_m$ is a relaxation constant taken as 0.6.

Calculating the Updates for $\lambda_1(t)$ and $\lambda_2(t)$
Using equation 12.32, we obtain:

$$\sum_{i=1}^{N} u_i(t)P_i^{max} \le P_D(t) + P_R(t) + P_E^{(k)}(t). \tag{12.64}$$

Initially, one sets $P_E^{(0)}(t) = 0$, and subsequent selection of $P_E^{(k)}(t)$ is done as follows.

1. If the reserve constraints are met within the tolerance of $P_E^{(k)}(t)$,

$$P_D(t) + P_R(t) \le \sum_{i=1}^{N} u_i(t)P_i^{max} \le P_D(t) + P_R(t) + P_E^{(k)}(t), \tag{12.65}$$

the multiplier and tolerance are left unchanged,

$$\lambda_{2,k+1}(t) = \lambda_{2,k}(t). \tag{12.66}$$

2. If the reserve constraint is not met within the tolerance of $P_E^{(k)}(t)$, the modification to the multiplier is given by Equation 12.29 repeated as

(12.67)

where $t_k$ is a scalar step length, with

$$t_k = \frac{1}{c' + d'k}, \tag{12.68}$$

where c' and d' are constants.

The updating of the tolerance term is done under the following considerations.

1. If

(12.69)

the tolerance is left unchanged and $P_E^{(k+1)}(t) = P_E^{(k)}(t)$.

2. If

(12.70)

the tolerance is increased and $P_E^{(k+1)}(t) = P_E^{(k)}(t) + \varepsilon$. Here $\varepsilon$ is a constant whose value is equal to the maximum power of the smallest unit of the system.

VI. UNIT COMMITMENT OF THERMAL UNITS USING DYNAMIC PROGRAMMING
For small and medium size systems, dynamic programming has been proposed as a solution technique as it has many advantages, the chief one being a reduction in the dimensionality of the problem. Suppose we have four units on a system and any combination of them could serve the single load. There would be a maximum of $2^4 - 1 = 15$ combinations to test. However, if a strict priority were imposed, there would be only the four following combinations to try.

Priority 1 unit
Priority 1 unit + priority 2 unit
Priority 1 unit + priority 2 unit + priority 3 unit
Priority 1 unit + priority 2 unit + priority 3 unit + priority 4 unit.
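The reduction from 15 combinations to 4 can be checked with a short enumeration. The unit labels 1 to 4 are hypothetical and simply stand for the priority ranks.

```python
from itertools import combinations

units = [1, 2, 3, 4]                      # units labelled by priority rank

# Every non-empty subset of the four units: 2**4 - 1 candidates.
all_combinations = [combo
                    for size in range(1, len(units) + 1)
                    for combo in combinations(units, size)]

# Under a strict priority order only the nested prefixes remain.
priority_combinations = [tuple(units[:n]) for n in range(1, len(units) + 1)]

print(len(all_combinations))
print(priority_combinations)
```

Each priority combination is one of the entries of the complete enumeration, so the strict priority order searches a small subset of the full state space.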
The imposition of a priority list arranged in order of the full load average cost rate would result in a theoretically correct dispatch and commitment only if:

1. No load costs are zero.
2. Unit input-output characteristics are linear between zero and full load.
3. There are no other restrictions.
4. Start-up costs are a fixed amount.

Assumptions

The main assumptions for applying dynamic programming to the unit commitment problem are as follows.

1. A state consists of an array of units with specified units operating and the rest offline.
2. There are no costs for shutting down a unit.
3. There is a strict priority order, and in each interval a specified minimum amount of capacity must be operating.
4. The start-up cost of a unit is independent of the time it has been offline.

A. Dynamic Programming Approaches to Unit Commitment Problem

1. Backward Dynamic Programming Approach
The first dynamic programming approach uses a backward (in time) approach in which the solution starts at the last interval and continues back to the initial point. There are $I_{max}$ intervals in the period to be considered. The dynamic programming equations for the computation of the minimum total fuel cost during a time (or load) period I are given by the recursive equation:

$$F_{min}(I, k) = \min_{\{\Phi\}}\left[C_{min}(I, k) + S(I, k : I+1, \Phi) + F_{min}(I+1, \Phi)\right], \tag{12.71}$$

where

$F_{min}(I, k)$ = Minimum total fuel cost from state k in interval I to the last interval $I_{max}$
$C_{min}(I, k)$ = Minimum generation cost in supplying the load in interval I given state k
$S(I, k : I+1, \Phi)$ = Incremental start-up cost going from state k in the Ith interval to state $\Phi$ in the (I + 1)th interval
$\{\Phi\}$ = Set of feasible states in the interval (I + 1).

The production cost $C_{min}(I, k)$ is obtained by economically dispatching the units online in state k. A path is a schedule starting from a state in an interval I to the final interval $I_{max}$. An optimal path is one for which the total fuel cost is minimum.

2. Forward Dynamic Programming Approach
The backward dynamic programming method does not cover many practical situations. For example, if the start-up cost of a unit is a function of the time it has been offline, then the forward dynamic programming approach is more suitable since the previous history of the unit can be computed at each stage. There are other practical reasons for going forward. The initial conditions are easily specified and the computations can go forward in time as long as required and as long as computer storage is available. The solution to forward dynamic programming is done through the recursive equation given by Pang et al. [20], and improved by Snyder and Powell [26]. The recursive formula to compute the minimum cost during interval I with combination k is given by

$$F_{min}(I, k) = \min_{\{L\}}\left[C_{min}(I, k) + S(I-1, L : I, k) + F_{min}(I-1, L)\right], \tag{12.72}$$

where

$F_{min}(I, k)$ = Minimum total fuel cost to arrive at state (I, k)
$C_{min}(I, k)$ = Minimum production cost for state (I, k)
$S(I-1, L : I, k)$ = Transition cost from state (I - 1, L) to state (I, k)
$(I, k)$ = kth combination in interval I.

For the forward dynamic programming approach a strategy is defined as the transition, or path, from one state at a given hour to a state at the next hour. Now let X equal the number of states to search each period and let N equal the number of strategies, or paths, to save at each step. Figure 12.4 clarifies this definition. Figure 12.5 shows the application of the recursive formula for the forward dynamic programming approach, which can be summarized in the following steps.
Step 1. Start with the first interval Z = 1; enumerate all feasible combinations that satisfy: a. expected load; b. the specified amount of spinning reserve, usually 25% of the load at that time interval.
Chapter 12
420
interval
(0 FIGURE 12.4 States and strategies (N, = 3 and X
+ 5).
In this regard, the economic dispatch problem will be performed to calculate the value of Cmin(Z,k) for each feasible kth combination at stage 1. Step 2. For stage ( I l), enumerate all the feasible combinations and perform the economic dispatch solution for the new load level at stage ( I 1) to calculate Cmin(Z 1, k).
+
+
l o
+
0
0 0 interval ((+I)
A: set of combinationsto be searched at stage(1-1)
FIGURE 12.5 Application of forward dynamic programming technique.
Unit
429
+
Step 3. Check whether the transition from state k at stage I to state j at stage (I + 1) satisfies the minimum up- and down-time constraints if a unit is to be started up or shut down. If the unit satisfies the minimum down-time constraint, calculate S(I, k) as its start-up cost. If more than one unit is to be started, then

$$S(I, k) = \sum_{i=1}^{n} S_i,$$

where $S_i$ is the start-up cost of the ith unit to be started and n is the number of units to be started.
Step 4. The total cost of arriving at state k at stage I through a transition from state L at stage (I - 1) is given by

$$F(I, k) = C_{\min}(I, k) + S(I-1, L : I, k) + F_{\min}(I-1, L). \qquad (12.73)$$

Step 5. Calculate F(I, k) for all feasible transitions from stage (I - 1).
Step 6. Find the minimum, $F_{\min}(I, k)$, and save it; also save the path that leads to this optimum.
Step 7. Proceed in time to the next stage and repeat Steps 2 to 6.
Step 8. When the last stage is reached, calculate the minimum total cost and trace back to find the optimal solution.
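The steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the book's implementation: the function name `forward_dp`, its arguments, and the toy cost data are all hypothetical, minimum up- and down-time checks are folded into the transition-cost function (which returns infinity for an infeasible transition), and the production costs are assumed to come from a separate economic dispatch.

```python
import math


def forward_dp(stages, c_min, start_cost, initial_state, n_save):
    """Forward DP over stages 1..T, keeping the n_save cheapest strategies.

    stages[i]        -> feasible states at stage i + 1
    c_min[(I, k)]    -> production cost C_min(I, k) from economic dispatch
    start_cost(L, k) -> transition cost S(I-1, L : I, k); math.inf if infeasible
    """
    # saved maps a state L to (F_min(I-1, L), path that leads to L)
    saved = {initial_state: (0.0, [initial_state])}
    for stage, states in enumerate(stages, start=1):
        candidates = {}
        for k in states:
            # Recursion (12.72): minimise over the saved predecessor states L
            cost, path = min(
                ((f_prev + start_cost(prev, k), prev_path)
                 for prev, (f_prev, prev_path) in saved.items()),
                key=lambda item: item[0],
            )
            if math.isfinite(cost):
                candidates[k] = (c_min[(stage, k)] + cost, path + [k])
        # Keep only the N lowest-cost strategies for the next stage
        kept = sorted(candidates.items(), key=lambda kv: kv[1][0])[:n_save]
        saved = dict(kept)
    # The stored path of the cheapest final state is the traced-back schedule
    return min(saved.values(), key=lambda item: item[0])


# Hypothetical two-state, two-stage data (not taken from the case study)
toy_c_min = {(1, 1): 10, (1, 2): 12, (2, 1): 20, (2, 2): 15}


def toy_start_cost(prev, k):
    """Assumed transition cost: free to stay, 3 to change state."""
    return 0 if prev == k else 3


cost, path = forward_dp([[1, 2], [1, 2]], toy_c_min, toy_start_cost,
                        initial_state=1, n_save=2)
print(cost, path)
```

Storing the path alongside each saved cost trades a little memory for a trivial trace-back; a pointer table per stage would serve equally well.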
The flowchart is shown in Figure 12.6. Figure 12.7 shows a case using forward dynamic programming for three combinations at each stage, for three time intervals.
B. Case Study

For this study, two cases are considered: a priority list schedule and the same case with complete enumeration. Both cases ignore hot-start costs and minimum up- and down-times. In order to make the required computations more efficient, a simplified model of the unit characteristics is used. Four units are to be committed to serve an 8-h load pattern. Tables 12.2 through 12.4 show the unit characteristics, load pattern, and initial status for the case study.
Case 1. In this case the units are scheduled according to a strict priority order (see Table 12.5); that is, units are committed in order until the load is satisfied. The total cost for the interval is the sum of the eight dispatch costs plus the transitional costs for starting any units. A maximum of 24 dispatches must be considered. Table 12.6 shows the makeup of the only states examined each hour in Case 1. Note that the table displays the priority order; that is,

State 5 = unit 3,
State 12 = unit 3 + unit 2,
State 14 = unit 3 + unit 2 + unit 1, and
State 15 = unit 3 + unit 2 + unit 1 + unit 4.

FIGURE 12.6 Flowchart for forward dynamic programming. (Flowchart boxes: perform economic dispatch for all possible combinations at each interval; evaluate the recursive cost formula for all feasible states at the stage; save the N lowest-cost strategies; trace the optimal schedule.)
FIGURE 12.7 Forward dynamic programming for three combinations (intervals I = 1, 2, 3).
TABLE 12.2 Unit characteristics, load pattern, and initial status.

Unit  Max   Min   Incremental  No-Load  Full-Load  Minimum Times (h)
      (MW)  (MW)  Heat Rate    Cost     Ave. Cost  Up     Down
                  (Btu/kWh)    ($/h)    ($/MWh)
1     80    25    10440        213.00   23.54      4      2
2     250   60    9000         585.62   20.34      5      3
3     300   75    8730         684.74   19.74      5      4
4     60    20    11900        252.00   28.00      1      1

TABLE 12.3 Unit characteristics, load pattern, and initial status.

Unit  Initial Conditions(a)  Start-up Costs ($)  Cold Start
      (h)                    Hot      Cold       (h)
1     -5                     150      350        4
2     8                      170      400        5
3     8                      500      1100       5
4     -6                     0        0.02       0

(a) - = offline; + = online.
TABLE 12.4 Load pattern.

Hour (h)   Load (MW)
1          450
2          530
3          600
4          540
5          400
6          280
7          290
8          500
TABLE 12.5 Capacity ordering of the units.

State  Unit Combination  Maximum Capacity
       (Units 1 2 3 4)   (MW)
15     1111              690
14     1110              630
13     0111              610
12     0110              550
11     1011              440
10     1101              390
9      1010              380
8      0011              360
7      1100              330
6      0101              310
5      0010              300
4      0100              250
3      1001              140
2      1000              80
1      0001              60
0      0000              0
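The capacity ordering in Table 12.5 can be reproduced by enumerating all on/off combinations of the four units. The sketch below is only an illustration: the function name `capacity_ordering` is hypothetical, and the unit maximum capacities (80, 250, 300, and 60 MW for units 1 to 4) are taken from the case-study unit data.

```python
from itertools import product

# Maximum capacities (MW) of units 1..4, taken from the case-study unit data
UNIT_MAX_MW = [80, 250, 300, 60]


def capacity_ordering(unit_max):
    """Return (state, combination, capacity) rows, largest capacity first."""
    combos = []
    for bits in product((0, 1), repeat=len(unit_max)):
        capacity = sum(bit * p_max for bit, p_max in zip(bits, unit_max))
        combos.append(("".join(str(bit) for bit in bits), capacity))
    combos.sort(key=lambda row: row[1], reverse=True)
    top = len(combos) - 1
    # State numbers count down from 2**n - 1 as the capacity decreases
    return [(top - rank, code, cap) for rank, (code, cap) in enumerate(combos)]


table = capacity_ordering(UNIT_MAX_MW)
for state, code, capacity in table:
    print(f"{state:>2}  {code}  {capacity:>3}")
```

Because all sixteen capacities are distinct for these units, the ordering is unambiguous; with tied capacities a tie-breaking rule would have to be chosen.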
TABLE 12.6 Capacity ordering of the units.

State  Unit Combination  Maximum Capacity
       (Units 1 2 3 4)   (MW)
5      0010              300
12     0110              550
14     1110              630
15     1111              690
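A strict priority-list commitment for the 8-h load pattern can be sketched as follows. This is a hedged illustration rather than the book's procedure: the function name `priority_commit` is hypothetical, the priority order (unit 3, unit 2, unit 1, unit 4) is the one listed for Case 1, and the sketch checks capacity against load only, ignoring spinning reserve, start-up costs, and minimum up- and down-times.

```python
# Priority order for Case 1: unit 3, then unit 2, unit 1, and unit 4
PRIORITY = [3, 2, 1, 4]
# Maximum capacities (MW) from the case-study unit data
MAX_MW = {1: 80, 2: 250, 3: 300, 4: 60}
# 8-h load pattern (MW) from the case study
LOADS = [450, 530, 600, 540, 400, 280, 290, 500]


def priority_commit(load):
    """Commit units in priority order until their capacity covers the load."""
    committed, capacity = [], 0
    for unit in PRIORITY:
        if capacity >= load:
            break
        committed.append(unit)
        capacity += MAX_MW[unit]
    if capacity < load:
        raise ValueError(f"load of {load} MW exceeds total capacity")
    return committed, capacity


schedule = [priority_commit(load) for load in LOADS]
for hour, (units, capacity) in enumerate(schedule, start=1):
    print(hour, units, capacity)
```

Under these assumptions every hour maps to one of the four priority states (capacities 300, 550, 630, or 690 MW); the dynamic programming search is still needed to account for transition costs between hours.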
For the first four hours, only the last three states are of interest. The sample calculations illustrate the technique. All possible commitments start from state 12, since this was given as the initial condition. For hour 1, the minimum cost is state 12, and so on. The results for the priority-ordered case are shown in Table 12.7. Note that state 13 is not reachable in this strict priority ordering.

TABLE 12.7 Capacity ordering of the units.

Hour  State with Min Total Cost  Pointer for Previous Hour
1     12 (9208)                  12
2     12 (19857)                 12
3     14 (32472)                 12
4     12 (43300)                 14

Case 2.
The allowable states are {0010, 0110, 1110, 1111} = {5, 12, 14, 15}. In hour 0, {L} = {12}, the initial condition.

I = 1 (1st hour). For k = 15, 14, and 12:

F(1, 15) = Cmin(1, 15) + S(0, 12 : 1, 15) = 9861 + 350 = 10211
F(1, 14) = 9493 + 350 = 9843
F(1, 12) = 9208 + 0 = 9208.
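The first-hour costs can be checked with a short script. This is only a sketch of the arithmetic: the production costs and start-up costs are the values quoted in the hour-1 calculation, while the dictionary names are hypothetical.

```python
INITIAL_STATE = 12

# Production costs for hour 1 and start-up costs from state 12, as quoted above
C_MIN_HOUR_1 = {15: 9861, 14: 9493, 12: 9208}
START_UP_FROM_12 = {15: 350, 14: 350, 12: 0}

# F(1, k) = C_min(1, k) + S(0, 12 : 1, k)
F_HOUR_1 = {k: C_MIN_HOUR_1[k] + START_UP_FROM_12[k] for k in C_MIN_HOUR_1}
best_state = min(F_HOUR_1, key=F_HOUR_1.get)

for k, total in F_HOUR_1.items():
    print(f"F(1, {k}) = {total}")
print("cheapest state in hour 1:", best_state)
```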
I = 2 (2nd hour). The feasible states are {12, 14, 15}; therefore, X = 3. Suppose two strategies are saved at each stage; then N = 2 and {L} = {12, 14}. For state 15,

$$F(2, 15) = \min_{\{L\}} \left[ C_{\min}(2, 15) + S(1, L : 2, 15) + F(1, L) \right] = 11301 + \min \{ 350 + 9208,\ 0 + 9843 \} = 20859,$$

and the process is repeated until the 4th hour, when the final cost at I = 4 will be the minimum commitment cost. The complete schedule is obtained by retracing the steps over the specified time period of 4 hours.

VII. ILLUSTRATIVE PROBLEMS
A system has four units; the system data are given in Table 12.8. Solve the unit commitment using

1. Priority order method
2. Dynamic programming
3. Lagrangian relaxation method.

TABLE 12.8a System data for the illustrative example problem.

Unit  Max   Min   Incremental  No-Load  Full-Load  Minimum Times (h)
      (MW)  (MW)  Heat Rate    Cost     Ave. Cost  Up     Down
                  (Btu/kWh)    ($/h)    ($/MWh)
1     80    25    10,440       213.00   23.54      4      2
2     250   60    9,000        585.62   20.34      5      3
3     300   75    8,730        684.74   19.74      5      4
4     60    20    11,900       252.00   28.00      1      1

Unit  Initial Conditions:      Start-up Costs ($)  Cold Start
      Hours Offline (-)        Hot      Cold       (h)
      or Online (+)
1     -5                       150      350        4
2     8                        170      400        5
3     8                        500      1,100      5
4     -6                       0        0.02       0

TABLE 12.8b Load profile for the illustrative example problem.

Interval T   Time (hours)   Load PD (MW)
             1-6
             7-12
             13-18
             19-24

VIII. CONCLUSIONS

This chapter presented unit commitment as an operation scheduling function for the management of generation resources over a short time horizon of one day or at most one week. The different operational constraints of unit commitment were addressed and discussed. Several approaches for solving the unit commitment problem were presented, starting from the oldest and most primitive method, the priority list, for which an illustrative example was given. A practical approach suitable for large-scale power systems, employing the Lagrangian relaxation technique, was discussed in full. The major procedures were shown: problem formulation, the search for a feasible solution through minimization of the duality gap, updating of the multipliers, and formation of the single-unit relaxed problems. Several algorithms employing the same approach were also discussed. An approach suitable for small and medium power systems using dynamic programming was discussed, as were the different assumptions for applying dynamic programming to unit commitment. A comparison between the forward and backward dynamic programming approaches was made, and the computational procedures involved in applying both approaches to unit commitment were shown. The forward dynamic programming approach was used to solve the problem. An illustrative example was presented showing the different computational procedures in solving the unit commitment problem for a sample power system.
IX. PROBLEM SET

Problem 12.1
A system has four units, with the system data as given in Tables 12.9 through 12.11. Solve the unit commitment using
1. Priority order method
2. Dynamic programming
3. Lagrangian relaxation method
4. Genetic algorithm.

Problem 12.2
Tables 12.12 and 12.13 present the unit characteristics and load pattern for a five-unit, four-time-period problem. Each time period is two hours long. The input-output characteristics are approximated by a straight line from minimum to maximum generation, so that the incremental heat rate is constant. Unit no-load and start-up costs are given in terms of heat energy requirements.
1. Develop the priority list for these units and solve for the optimum unit commitment. Use a strict priority list with a search range of three (X = 3) and save no more than three strategies (N = 3). Ignore minimum up- and down-times for units.
2. Solve the same commitment problem using the strict priority list with X = 3 and N = 3 as in Part 1, but obey the minimum up- and down-time rules.
3. (Optional) Find the optimum unit commitment without use of the strict priority list (i.e., all 32 unit on/off combinations are valid). Restrict the search range to decrease your effort. Obey the minimum up- and down-time rules.
TABLE 12.9 System data for Problem 12.1.

Unit  Max   Min   Incremental  No-Load  Full-Load  Minimum Times (h)
      (MW)  (MW)  Heat Rate    Cost     Ave. Cost  Up     Down
                               ($/h)    ($/MWh)
1     100   25    9,800        200      30         5      1
2     200   50    8,000        600      25         5      3
3     250   60    8,500        550      20         6      4
4     80    30    10,000       250      24         3      2
TABLE 12.10 Initial and start-up conditions for the system in Problem 12.1.

Unit  Initial Conditions:      Start-up Costs ($)  Cold Start
      Hours Offline (-)        Hot      Cold       (h)
      or Online (+)
1     -5                       200      400        4
2     8                        220      450        5
3     8                        350      800        5
4     -6                       0        0.0        0
TABLE 12.11 Load profile data for Problem 12.1.

Time Interval T (hours)   Load PD (MW)
1                         500
2                         550
3                         600
4                         560
5                         450
6                         300
7                         350
8                         500
TABLE 12.12 Unit characteristics for unit commitment. Minimum Net FullIncremental Min No-Load Start-up Times (h) Max Load Heat Heat Rate Unit (MW) cost Cost Rate (MW) ($/h) (M Btu) Up/Down (BtukWh) ($IMWh) 200 40
1 2 3 4 5
60 50
25
11,000 11,433 12,000 12,900 13,500
9,900 80 10,100 10,800 11,900 12,140
50 15 15 5 5
220 4 4
60 40 34
30 25 20 24
-
8 8
4
TABLE 12.13 Load curve and conditions for unit commitment.

Load Pattern
Hours   Load (MW)
1-2     250
3-4     320
5-6     110
7-8     75

Conditions
1. Initially (prior to hour 1), only unit 1 is on and has been on for 4 hrs.
2. Ignore losses, spinning reserve, etc.
3. The only requirement is that the generation be able to supply the load.
4. Fuel cost for all units may be taken as 1.40 R/MBtu.
Problem 12.3
Given the unit data in Tables 12.14 through 12.16, use forward dynamic programming to find the optimum unit commitment schedules covering the eight-hour period. Table 12.17 gives the characteristics of all the combinations you need, as well as the operating cost for each of the loads in the load data. An asterisk (*) indicates that a combination cannot supply the load. The starting conditions are (1) at the beginning of the 1st period, units 1 and 2 are up, and (2) units 3 and 4 are down and have been down for eight hours.

Problem 12.4
Given the unit data in Tables 12.18 through 12.21, use forward dynamic programming to find the optimum unit commitment schedules covering the eight-hour period.
TABLE 12.14 Unit characteristics for unit commitment.(a)

Unit  Max   Min   Incremental  No-Load   Start-Up
      (MW)  (MW)  Heat Rate    Energy    Energy
                  (Btu/kWh)    (MBtu/h)  (MBtu)
1     500   70    9950         300       800
2     250   40    10200        210       380
3     150   30    11000        120       110
4     150   30    11000        120       110

(a) Load data (all time periods = 2 h).
TABLE 12.15 Load duration profile.

Time Period t   Load PD(t) (MW)
1               600
2               800
3               700
4               950
TABLE 12.16 Start-up and shut-down rules.

Unit  Minimum Up Time (h)  Minimum Down Time (h)
1     2                    2
2     2                    2
3     2                    4
4     2                    4
TABLE 12.17 Unit characteristic and operating cost at different loads.

             Units          Operating Cost for Various Load Levels PD (MW)
Combination  1  2  3  4     600     700     800     950
A            1  1  0  0     6505    7525    *       *
B            1  1  1  0     6649    7669    8705    *
C            1  1  1  1     6793    7813    8833    10475
TABLE 12.18 Unit cost data.(a)

Unit  Max   Min   Incremental  No-Load   Start-up
      (MW)  (MW)  Heat Rate    Energy    Energy
                  (Btu/kWh)    Input     (MBtu)
                               (MBtu/h)
1     500   70    9,950        300       800
2     250   40    10,200       210       380
3     150   30    11,000       120       110
4     150   30    11,000       120       110

(a) Load data (all time periods = 2 hrs).
TABLE 12.19 Load profile.

Time Period   Load (MW)
1             600
2             800
3             700
4             950
TABLE 12.20 Start-up and shut-down rules.(a)

Units   Minimum Up-Time (h)   Minimum Down-Time (h)

(a) Fuel cost = 1.00 R/MBtu.
TABLE 12.21 Unit combination and operating cost for Problem 3.

             Units(a)       Operating Cost for Various Load Levels PD (MW)
Combination  1  2  3  4     600     700     800     950
A            1  1  0  0     6505    7525    -       -
B            1  1  1  0     6649    7669    8705    -
C            1  1  1  1     6793    7813    8833    10475

(a) 1 = up; 0 = down.
REFERENCES

1. Ayoub, A. K. and Patton, A. D. Optimal Thermal Generating Unit Commitment, IEEE Transactions on Power Apparatus and Systems, Vol. PAS-90, 1971, pp. 1752-1756.
2. Baldwin, C. J., Dale, K. M., and Dittrich, R. F. A Study of Economic Shut Down of Generating Units in a Daily Dispatch, IEEE Transactions on Power Apparatus and Systems, Vol. PAS-78, 1960, pp. 1272-1284.
3. Chowdhury, N. and Billinton, R. Unit Commitment in Interconnected Generating Systems Using a Probabilistic Technique, Paper 89 SM 715-4 PWRS, IEEE Summer Power Meeting, Long Beach, CA, July 1989.
4. Cohen, A. I. and Wan, S. H. A Method for Solving the Fuel Constrained Unit Commitment Problem, IEEE Transactions on Power Systems, Vol. PWRS-2 (August 1987), pp. 608-614.
5. Cohen, A. I. and Yoshimura, M. A Branch-and-Bound Algorithm for Unit Commitment, IEEE Transactions on Power Apparatus and Systems, Vol. PAS-102 (Feb. 1983), pp. 444-451.
6. Dillon, T. S., Edwin, K., Kochs, H. D., and Taud, R. J. Integer Programming Approach to the Problem of Optimal Unit Commitment with Probabilistic Reserve Determination, IEEE Transactions on Power Apparatus and Systems, Vol. PAS-97, no. 6 (Nov.-Dec. 1978), pp. 2154-2166.
7. Everett, H. E. Generalized Lagrange Multiplier Method for Solving Problems of Optimum Allocation of Resources, Operations Research, Vol. 11 (May-June 1963), pp. 399-417.
8. Graham, W. D. and McPherson, G. Hydro-Thermal Unit Commitment Using a Dynamic Programming Approach, Paper C73452-0, IEEE PES Summer Meeting, Vancouver, July 1973.
9. Gruhl, J., Schweppe, F., and Ruane, M. Unit Commitment Scheduling of Electric Power Systems, in L. H. Fink and K. Carlson, Eds., Systems Engineering for Power Status and Prospects, Henniker, NH, 1975.
10. Happ, H. H., Johnson, R. C., and Wright, W. J. Large Scale Hydro Thermal Unit Commitment Method and Results, IEEE Transactions on Power Apparatus and Systems, Vol. PAS-90, 1971, pp. 1373-1383.
11. Hobbs, W. J., Hermon, G., Shebid, S., and Warner, G. An Enhanced Dynamic Programming Approach for Unit Commitment, IEEE Transactions on Power Systems, Vol. PWRS-3, 1988, pp. 1201-1205.
12. Jain, A. V. and Billinton, R. Unit Commitment Reliability in a Hydrothermal System, Paper C 73096-5, IEEE PES Winter Power Meeting, New York, 1973.
13. Kerr, R. H., Scheidt, J. L., Fontana, A. J., and Wiley, J. K. Unit Commitment, IEEE Transactions on Power Apparatus and Systems, Vol. PAS-85 (May 1966), pp. 417-421.
14. Khodaverdian, E., Brammellar, A., and Dunnett, R. M. Semi Rigorous Thermal Unit Commitment for Large Scale Electrical Power Systems, IEE Proceedings, Part C, Vol. 133, no. 4, 1986, pp. 157-164.
15. Lauer, G. S., Bertsekas, D. P., Sandell, N. R., Jr., and Posbergh, T. A. Solution of Large-Scale Optimal Unit Commitment Problems, IEEE Transactions on Power Apparatus and Systems, Vol. PAS-101 (Jan. 1982), pp. 79-86.
16. Lowery, P. G. Generating Unit Commitment by Dynamic Programming, IEEE Transactions on Power Apparatus and Systems, Vol. PAS-85 (May 1966), pp. 422-426.
17. Merlin, A. and Sandrin, P. A New Method for Unit Commitment at Electricité de France, IEEE Transactions on Power Apparatus and Systems, Vol. PAS-102 (May 1983), pp. 1218-1225.
18. Momoh, J. A. and Zhu, J. Z. Application of AHP/ANP to Unit Commitment in the Deregulated Power Industry, Proceedings of IEEE SMC'98, CA (1998), pp. 817-822.
19. Pang, C. K. and Chen, H. C. Optimal Short-Term Thermal Unit Commitment, IEEE Transactions on Power Apparatus and Systems, Vol. PAS-95 (July-August 1976), pp. 1336-1346.
20. Pang, C. K., Sheble, G. B., and Albuyeh, F. Evaluation of Dynamic Programming Methods and Multiple Area Representation for Thermal Unit Commitments, IEEE Transactions on Power Apparatus and Systems, Vol. PAS-100 (March 1981), pp. 1212-1218.
21. Schulte, R. P. Problems Associated with Unit Commitment in Uncertainty, IEEE Transactions on Power Apparatus and Systems, Vol. PAS-104, no. 8 (August 1985), pp. 2072-2078.
22. Tong, S. K. and Shahidehpour, S. M. A Combination of Lagrangian Relaxation and Linear Programming Approaches for Fuel Constrained Unit Commitment Problem, IEEE Proceedings, Vol. 136, Part C (May 1989), pp. 162-174.
23. Van den Bosch, P. P. J. and Honderd, G. A Solution of the Unit Commitment Problem via Decomposition and Dynamic Programming, IEEE Transactions on Power Apparatus and Systems, Vol. PAS-104 (July 1985), pp. 1684-1690.
24. Van Meeteren, M. P., Berry, B. M., Farah, J. L., Kamienny, M. G., Enjamio, J. E., and Wynne, W. T. Extensive Unit Commitment Function Including Fuel Allocation and Security Area Constraints, IEEE Transactions on Power Systems, Vol. PWRS-1, no. 4, 1986, pp. 228-232.
25. Zhuang, F. and Galiana, F. D. Unit Commitment by Simulated Annealing, Paper 89 SM 659-4 PWRS, IEEE Summer Power Meeting, Long Beach, CA, July 1989.
Chapter 13 Genetic Algorithms
I. INTRODUCTION
In many engineering disciplines, a large spectrum of optimization problems has grown in size and complexity. In some instances, solving complex multidimensional problems with classical optimization techniques is difficult and/or expensive. This realization has led to an increased interest in a special class of search algorithms, namely, evolutionary algorithms. In general, these are referred to as “stochastic” optimization techniques, and their foundations lie in the evolutionary patterns observed in living things. In this area of operational research, there exist several primary branches:
1. Genetic algorithms (GAs),
2. Evolutionary programming (EP), and
3. Evolutionary strategies (ES).

To date, the genetic algorithm is the most widely known of these techniques. It has been applied to many complex problems in the fields of industrial and operational engineering. In power systems, well-known applications include unit commitment, economic dispatch, load forecasting, reliability studies, and various resource allocation problems.

A. General Structure of Genetic Algorithms

The typical structure of genetic algorithms was described by David Goldberg [4]. Essentially, genetic algorithms are stochastic search techniques based on the Darwinian principles of natural selection and natural genetics. In general, genetic algorithms start with an initial set of random solutions that lie in the feasible solution space. This random cluster of solution points is called a population. Each individual in the population represents a possible solution to the optimization problem and is therefore called a “chromosome.” The chromosome is a string of symbols; because digital machines are two-state devices, these strings are commonly binary bit strings.
II. DEFINITION AND CONCEPTS USED IN GENETIC COMPUTATION
Genetic algorithms have their foundation in both natural biological genetics and modern computer science (see Table 13.1). As such, the nomenclature used in this field is inherently a mix of terms from both natural and artificial intelligence. To understand the roots of genetic algorithms, we briefly look at the biological analogy. In biological organisms, a chromosome carries a unique set of information that encodes the data on how the organism is constructed. A collection or complete set of chromosomes is called a genotype. Within each chromosome are various individual structures called genes, which are specific coded features of the organism. With this basic understanding, the following terminologies and concepts are summarized.

A. Evolutionary Algorithms
Evolutionary algorithms (EA) represent a broad class of computer-based problem-solving systems. Their key feature is the evolutionary mechanisms that are at the root of their formulation and implementation. Evolutionary algorithms by themselves represent a special class of “new” intelligent systems used in many global optimization algorithms. Figure 13.1 shows the various categories of intelligent systems and the position of the genetic algorithm as one of the more commonly known evolutionary programming techniques.

TABLE 13.1 Terminology in genetic algorithms.

GA Terms        Corresponding Optimization Description
1. Chromosome   Solution set
2. Gene         Part of solution
3. Alleles      Value of gene
4. Phenotype    Decoded solution
5. Genotype     Encoded solution
6. Locus        Position of gene

FIGURE 13.1 Common classifications of intelligent systems. (Boxes shown: intelligent systems (IS), neural systems, fuzzy logic (FL), expert systems, evolutionary computing (EC), genetic algorithms (GAs), and evolutionary strategies (ES).)

Overall, evolutionary algorithms share the common structure of evolution of individuals in a competitive environment by the processes of selection, mutation, and reproduction. These processes are functions of the simulated performance of each individual as defined by the environment. In evolutionary algorithms, a population of structures is maintained and evolved by the search operators. Search operations use probabilistic rules of selection in the evolution process while seeking to ensure that the fitness of each new generation improves at each stage of the optimization process. Therefore, the reproduction mechanism is primarily focused on the fitness of the individuals in the population, while exploiting the available information. Furthermore, these robust and powerful adaptive search mechanisms use recombination and mutation to perturb individuals (parents and offspring), yielding new generations to be evaluated. Over the past few decades, global optimization algorithms that imitate natural evolutionary principles have proved their importance in many applications. These include annealing processes, evolutionary computation (EC), artificial neural networks (ANN), and expert systems (ES) (see Figure 13.1).

B. Genetic Programming

Genetic programming is a useful extension of the genetic model of learning or adaptation into the space of programs. In this special type of programming, the objects that constitute the population are not fixed-length character strings that encode feasible solutions to the optimization problem. Rather, they are programs that yield candidate solutions to the optimization problem when executed. In genetic programming, these are expressed as parse trees rather than lines of code. For example, a simple program to perform the operation X * Y - (A + B) * C would be represented as shown in Figure 13.2. The programs in the population are composed of elements from the function and terminal sets. In genetic programming, the crossover operation is implemented by taking randomly selected subtrees in the individuals. The selection is done according to the fitness of the individuals, and the exchange is done by the crossover operator. Notably, in general genetic programming, mutation is not used as a genetic operator. Genetic programming applications are used by physicists, biologists, engineers, economists, and mathematicians.
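The tree representation of the program X * Y - (A + B) * C can be sketched in Python as nested tuples. This is an illustration only: the tuple layout, the `OPS` table, and the function name `evaluate` are hypothetical choices, with the operators playing the role of the function set and the variable names the role of the terminal set.

```python
import operator

# Function set: binary operators available at the internal nodes
OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}

# Parse tree for X * Y - (A + B) * C; leaves are terminal names
TREE = ("-", ("*", "X", "Y"), ("*", ("+", "A", "B"), "C"))


def evaluate(node, env):
    """Evaluate a parse tree recursively against terminal values in env."""
    if isinstance(node, str):
        return env[node]
    op, left, right = node
    return OPS[op](evaluate(left, env), evaluate(right, env))


# Hypothetical terminal values, chosen only to exercise the tree
value = evaluate(TREE, {"X": 2, "Y": 3, "A": 1, "B": 1, "C": 4})
print(value)
```

With this layout, a crossover between two programs amounts to swapping one nested tuple (a subtree) of each parent.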
III. GENETIC ALGORITHM APPROACH
Genetic algorithms are general-purpose search techniques based on principles inspired by the genetic evolutionary mechanism observed in the populations of natural systems and living organisms. Typically, there are several stages in the optimization process:

Stage 1. Creating an initial population,
Stage 2. Evaluating the fitness function, and
Stage 3. Creating new populations.
FIGURE 13.2 Simple structure demonstrating the operation X * Y - (A + B) * C.
A. GA Operators
Various operators are used to perform the tasks of the stages in a genetic algorithm: the production (or elitism) operator, the crossover operator, and the mutation operator. The production operator is responsible for generating copies of any individuals that satisfy the goal function; that is, individuals either pass the fitness test of the goal function or are eliminated from the solution space. The crossover operator is used for recombination of individuals within the generation. The operator selects two individuals within the current generation and swaps their substrings at a random or fixed site in the individual strings (Figure 13.3). The objective of the crossover process is to synthesize bits of knowledge from the parent chromosomes that will exhibit improved performance in the offspring. The potential to produce better-performing offspring via the crossover process is one important advantage of genetic algorithms. Finally, the mutation operator is used as an exploratory mechanism that aids in finding a global extremum of the optimization problem. Basically, it is used to randomly explore the solution space by "flipping" bits of selected chromosomes or candidates from the population. There is an obvious trade-off in the probability assigned to the mutation operator. If the frequency were high, the genetic algorithm would degenerate into a completely random search with a high loss of data integrity. On the other hand, too low an activation probability for this operator may result in an incomplete scan of the solution space.
FIGURE 13.3 The crossover operation on a pair of strings (before and after crossover at the crossover site).
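The crossover and mutation operators described above can be sketched for bit-string chromosomes as follows. This is a minimal illustration under assumptions: the function names and the example strings are hypothetical, chromosomes are represented as Python strings of "0" and "1", and the mutation probability used in the example is arbitrary.

```python
import random


def single_point_crossover(parent_a, parent_b, site):
    """Swap the segments to the right of the crossover site."""
    child_a = parent_a[:site] + parent_b[site:]
    child_b = parent_b[:site] + parent_a[site:]
    return child_a, child_b


def mutate(chromosome, p_mutation, rng):
    """Flip each bit independently with probability p_mutation."""
    flipped = {"0": "1", "1": "0"}
    return "".join(
        flipped[bit] if rng.random() < p_mutation else bit
        for bit in chromosome
    )


rng = random.Random(0)  # seeded only so that the sketch is repeatable
site = rng.randrange(1, 5)  # random crossover site inside a 5-bit string
children = single_point_crossover("11111", "00000", site)
mutated = [mutate(child, 0.05, rng) for child in children]
print(site, children, mutated)
```

Raising `p_mutation` toward 1 turns the search into random bit flipping, while setting it to 0 leaves exploration entirely to crossover, which is the trade-off noted in the text.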
B. Major Advantages

Genetic algorithms have received considerable attention regarding their potential as a novel optimization technique. There are several major advantages when applying genetic algorithms to optimization problems.

1. Genetic algorithms do not have many mathematical requirements for optimization problems. Due to their evolutionary nature, genetic algorithms will search for solutions without regard to the specific inner workings of the problem. They can handle any kind of objective function and any kind of constraint (i.e., linear or nonlinear) defined on discrete, continuous, or mixed search spaces.
2. The ergodicity of evolution operators makes genetic algorithms very effective at performing global search (in probability). The traditional approaches perform local search by a convergent stepwise procedure, which compares the values of nearby points and moves to the relative optimal points. Global optima can then be found only if the problem possesses certain convexity properties that essentially guarantee that any local optimum is a global optimum.
3. Genetic algorithms provide great flexibility to hybridize with domain-dependent heuristics to make an efficient implementation for a specific problem.
C. Advantages of Genetic Algorithms over Traditional Methods

The main advantages that GAs present in comparison with conventional methods are as follows. Since GAs perform a search in a population of points and are based on probabilistic transition rules, they are less likely to converge to local minima (or maxima). GAs do not require "well-behaved" objective functions, and easily tolerate discontinuities. GAs are well adapted to distributed or parallel implementations. GAs work with a bit-string coding of the parameters and not with the values of the parameters themselves; the meaning of the bits is completely transparent to the GA. GAs search from a population of points and not from a single point. GAs use probabilistic transition rules (represented by the selection, crossover, and mutation operators) instead of deterministic rules. Nevertheless, the power of conventional methods is recognized. The GA should only be used when it is impossible (or very difficult) to obtain efficient solutions by these traditional approaches.
IV. THEORY OF GENETIC ALGORITHMS
A. Continuous and Discrete Variables

Real values can be approximated to the necessary degree by using a fixed-point binary representation. However, when the relative precision of the parameters is more important than the absolute precision, the logarithm of the parameters should be used instead. Discrete decision variables can be handled directly through binary (or n-ary) encoding. When functions can be expected to be locally monotone, the use of Gray coding is known to better exploit that monotonicity.
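Fixed-point encoding and Gray coding can be sketched as follows. This is an illustration under assumptions: the function names, the interval [0, 10], and the 8-bit word length are hypothetical, and the Gray code shown is the standard reflected binary code, in which consecutive integers differ in exactly one bit.

```python
def encode_fixed_point(x, lo, hi, n_bits):
    """Map a real x in [lo, hi] to the nearest of 2**n_bits levels."""
    levels = (1 << n_bits) - 1
    return round((x - lo) / (hi - lo) * levels)


def decode_fixed_point(code, lo, hi, n_bits):
    """Map an integer code back to its real value in [lo, hi]."""
    levels = (1 << n_bits) - 1
    return lo + (hi - lo) * code / levels


def to_gray(n):
    """Reflected binary Gray code of a non-negative integer."""
    return n ^ (n >> 1)


def from_gray(g):
    """Invert to_gray by folding successive right shifts of g into n."""
    n = 0
    while g:
        n ^= g
        g >>= 1
    return n


# Hypothetical parameter: x = 2.5 on [0, 10] with an 8-bit chromosome
code = encode_fixed_point(2.5, 0.0, 10.0, 8)
chromosome = format(to_gray(code), "08b")
recovered = decode_fixed_point(from_gray(int(chromosome, 2)), 0.0, 10.0, 8)
print(code, chromosome, recovered)
```

The resolution of this encoding is (hi - lo) / (2**n_bits - 1), so the word length sets the absolute precision of the decoded parameter.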
B. Constraints

Most optimization problems are constrained in some way. GAs can handle constraints in two ways, the most efficient of which is by embedding them in the coding of the chromosomes. When this is not possible, the performance of invalid individuals should be calculated according to a penalty function, which ensures that these individuals are, indeed, poor performers. Appropriate penalty functions for a particular problem are not necessarily easy to design, since they may considerably affect the efficiency of the genetic search.

C. Multiobjective Decision Problems

Optimization problems very seldom require the optimization of a single objective function. Instead, there are often competing objectives, which should be optimized simultaneously. In contrast to single-objective optimization problems, the solution of a multiobjective optimization problem is not a single solution but a set of nondominated solutions. The task of finding this set of solutions is not always an easy one. GAs have the potential to become a powerful method for multiobjective optimization, since they keep a population of solutions and are able to search for nondominated solutions in parallel.

D. Other GA Variants

The simple genetic algorithm has been improved in several ways. Different selection methods have been proposed [4] that reduce the stochastic errors associated with roulette wheel selection. Ranking has been introduced as an alternative to proportional fitness assignment, and has been shown to help avoid premature convergence and to speed up the search when the population approaches convergence. Other recombination operators have been proposed, such as the multiple-point and reduced-surrogate crossover. The mutation operator has remained more or less unaltered, but the use of real-coded chromosomes requires alternative mutation operators, such as intermediate crossover. Also, several models of parallel GAs have been proposed, improving the performance and allowing the implementation of concepts such as that of genetic isolation. This method works well with bit-string representation.

The performance of genetic algorithms depends on the performance of the crossover operator used. The crossover rate p_c is defined as the ratio of the number of offspring produced in each generation to the population size (denoted pop size). A higher crossover rate allows exploration of more of the solution space and reduces the chances of settling for a false optimum; but if this rate is too high, a lot of computational time will be wasted.

Mutation is a background operator that produces spontaneous random changes in various chromosomes. A simple way to achieve mutation would be to change one or more genes. In the genetic algorithm, mutation serves the crucial role of either:

1. Replacing the genes lost from the population during the selection process so that they can be tried in a new context, or
2. Providing the genes that were not present in the initial population.

The mutation rate p_m is defined as the percentage of the total number of genes in the population, and it controls the rate at which new genes are introduced into the population for trial. If it is too low, many genes that would have been useful are never tried. But if it is too high, there will be much random perturbation, the offspring will start losing their resemblance to the parents, and the algorithm will lose the ability to learn from the history of the search.

Genetic algorithms differ from conventional optimization and search procedures in several fundamental ways. Goldberg has summarized these as follows.

1. GAs work with a coding of solution sets, not the solutions themselves.
2. GAs search from a population of solutions, not a single solution.
3. GAs use payoff information (fitness function), not derivatives or other auxiliary knowledge.
4. GAs use probabilistic transition rules, not deterministic rules.

E. Coding

Each chromosome represents a potential solution for the problem and must be expressed in binary form in the integer interval I = [0, 21]. We could simply code X in binary base, using four bits (such as 1001 or 0101). If we have a set of binary variables, a bit will represent each variable. For a multivariable problem, each variable has to be coded in the chromosome.

F. Fitness

Each solution must be evaluated by a fitness function to produce a specific value. This objective function is used to model and characterize the problem to be solved. In many instances, the fitness function can be taken as the objective function used in classical optimization problems. In such cases, these optimization problems may be unconstrained or constrained. For the latter case, a Lagrangian or penalty approach can be used in formulating a suitable fitness function. Notably, the fitness function does not necessarily have to be in closed mathematical form. It can also be expressed in quantitative form and, in power systems applications, with fuzzy models.

G. Selection
The selection operator creates new populations or generations by selecting individuals from the old population. The selection is probabilistic but biased towards the best, as special deterministic rules are used. In the new generations created by the selection operator, there will be more copies of the best individuals and fewer copies of the worst. Two common techniques for implementing the selection operator are the stochastic tournament and roulette wheel approaches [4].

Stochastic Tournament. This implementation is suited to distributed implementations and is very simple: every time we want to select an individual for reproduction, we choose two at random, and the best wins with some fixed probability, typically 0.8. This scheme can be enhanced by using more individuals in the competition or even by letting the winning probability evolve.

Roulette Wheel. In this process, the individuals of each generation are selected for survival into the next generation according to a probability value proportional to the ratio of individual fitness over total population fitness; this means that on average the next generation will receive copies of an individual in proportion to the importance of its fitness value.
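The two selection schemes described above can be sketched in a few lines of Python. This is only an illustrative sketch: the function names, the string representation of individuals, and the sample population and fitness values are assumptions made for the example, not part of the original text.

```python
import random

def tournament_select(population, fitness, p_win=0.8, rng=random):
    """Stochastic tournament: draw two individuals at random; the fitter
    one is returned with probability p_win, the weaker one otherwise."""
    a, b = rng.sample(range(len(population)), 2)
    best, worst = (a, b) if fitness[a] >= fitness[b] else (b, a)
    chosen = best if rng.random() < p_win else worst
    return population[chosen]

def roulette_select(population, fitness, rng=random):
    """Roulette wheel: individual i is drawn with probability
    fitness[i] / sum(fitness). Assumes non-negative fitness values
    with a positive total."""
    total = sum(fitness)
    spin = rng.random() * total
    running = 0.0
    for individual, f in zip(population, fitness):
        running += f
        if spin < running:
            return individual
    return population[-1]  # guard against floating-point round-off

if __name__ == "__main__":
    rng = random.Random(0)                  # seeded for repeatability
    pop = ["1001", "0101", "1111", "0000"]  # hypothetical 4-bit chromosomes
    fit = [9.0, 5.0, 15.0, 1.0]             # hypothetical fitness values
    next_gen = [roulette_select(pop, fit, rng) for _ in range(len(pop))]
    print(next_gen)
```

The tournament needs only pairwise comparisons, which is one reason it suits distributed implementations, whereas the roulette wheel needs the total population fitness before any individual can be drawn.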
H. Crossover
The recombination in the canonical genetic algorithm is called single point crossover. Individuals are paired at random with a high probability that
crossover will take place. In the affirmative case, a crossover point is selected at random and, say, the rightmost segments of each individual are exchanged to produce two offspring.

Mutation in the canonical genetic algorithm consists of simply flipping each individual bit with a very low probability (a typical value would be $P_m = 0.001$). This background operator is used to ensure that the probability of searching a particular subspace of the problem space is never zero, thereby tending to inhibit the possibility of ending the search at a local, rather than a global, optimum.

I. Parameters
Like other optimization methods, GAs have certain parameters such as
1. Population size,
2. Genetic operations probabilities, and
3. Number of individuals involved in the selection procedure, and so on.

These parameters must be selected with maximum care, for the performance of the GA depends largely on the values used. Normally, a relatively small population, a high crossover probability, and a low mutation probability are recommended. Goldberg [4] analyzes the effect of these parameters in the algorithms.
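To show where these parameters enter, the following minimal Python sketch combines single-point crossover, bit-flip mutation, and roulette selection into one generation step. It is a sketch under stated assumptions, not the author's implementation: the function names, the default crossover probability of 0.8, the bit-string representation, and the toy fitness function are all hypothetical choices made for illustration.

```python
import random

def single_point_crossover(parent_a, parent_b, rng=random):
    """Exchange the rightmost segments of two equal-length bit strings
    at a randomly chosen cut point (1 .. L-1)."""
    cut = rng.randint(1, len(parent_a) - 1)
    return (parent_a[:cut] + parent_b[cut:],
            parent_b[:cut] + parent_a[cut:])

def mutate(bits, p_m=0.001, rng=random):
    """Flip each bit independently with probability p_m."""
    return "".join(
        ("1" if b == "0" else "0") if rng.random() < p_m else b
        for b in bits
    )

def next_generation(population, fitness_fn, p_c=0.8, p_m=0.001, rng=random):
    """One generation: roulette selection, pairing, crossover with
    probability p_c, then bit-flip mutation.
    Assumes an even population size and a positive total fitness."""
    weights = [fitness_fn(ind) for ind in population]
    parents = rng.choices(population, weights=weights, k=len(population))
    offspring = []
    for a, b in zip(parents[0::2], parents[1::2]):
        if rng.random() < p_c:
            a, b = single_point_crossover(a, b, rng)
        offspring += [mutate(a, p_m, rng), mutate(b, p_m, rng)]
    return offspring

if __name__ == "__main__":
    rng = random.Random(42)
    # Hypothetical setup: six 8-bit chromosomes, fitness = decoded integer + 1.
    pop = ["".join(rng.choice("01") for _ in range(8)) for _ in range(6)]
    for _ in range(20):
        pop = next_generation(pop, lambda s: int(s, 2) + 1, rng=rng)
    print(pop)
```

Here the crossover probability is applied per pair of parents, following the description in Section H; other conventions for the crossover rate exist.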
V. THE SCHEMATA THEOREM
Genetic algorithms work based on the concept and theory of schema. A schema is a similarity template describing a subset of strings with similarities at certain string positions. If, without loss of generality, we consider only chromosomes represented with binary genes in {0, 1}, a schema could be H = *001*1, where the character * is a "wild card," meaning that the value of 0 or 1 at such a position is undefined. The strings A = 100101 and B = 000111 both include the schema H because the string alleles match the schema positions 2, 3, 4, and 6.

For binary strings or chromosomes of length L (number of bits or alleles), the number of schemata is $3^L$. But the schemata have distinct relevance; a schema 0***** is more vague than 011*1* in representing similarities between chromosomes; and a schema 1****0 spans a larger portion of the string than the schema 1**0**.

A schema H may be characterized by its order o(H), which is the number of its fixed positions, and by its defining length d(H), which is the
distance between its first and its last fixed position. For the schema G = 1**0**, we have o(G) = 2 and d(G) = 3.

We now reason about the effect of reproduction on the expected number of different schemata in a population. Suppose that at a given time step t (a given generation) there are m examples of a particular schema H in a population P(t); we have m = m(H, t). Reproduction generates a copy of string $A_i$ with probability $P_i = f_i / \sum_j f_j$ (assuming a sampling process known as roulette). After the process of retaining from the population A(t) a nonoverlapping population of size n with replacement, there is an expectation of having in the population A(t + 1), at time t + 1, a number m(H, t + 1) of representatives of the schema H given by

$$m(H, t+1) = m(H, t)\,\frac{n\, f(H)}{\sum_j f_j} \tag{13.1}$$
where f(H) is the average fitness of the chromosomes including the schema H at time t. If we introduce the average fitness of the entire population as

$$f_{av} = \frac{1}{n}\sum_j f_j$$

we can write

$$m(H, t+1) = m(H, t)\,\frac{f(H)}{f_{av}} \tag{13.2}$$
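The schema definitions and equation 13.2 can be checked on a toy population with a short Python sketch. The helper names and the four-string population with its fitness values are assumptions invented for this illustration; only the schemata *001*1 and 1**0** and the strings 100101 and 000111 come from the text.

```python
def matches(schema, string):
    """True if the string is an instance of the schema ('*' = wild card)."""
    return len(schema) == len(string) and all(
        s == "*" or s == c for s, c in zip(schema, string))

def order(schema):
    """o(H): number of fixed (non-wild-card) positions."""
    return sum(ch != "*" for ch in schema)

def defining_length(schema):
    """d(H): distance between the first and last fixed positions."""
    fixed = [i for i, ch in enumerate(schema) if ch != "*"]
    return fixed[-1] - fixed[0] if fixed else 0

def expected_count(schema, population, fitness):
    """Equation 13.2: m(H, t+1) = m(H, t) * f(H) / f_av."""
    members = [f for s, f in zip(population, fitness) if matches(schema, s)]
    if not members:
        return 0.0
    f_H = sum(members) / len(members)    # average fitness of schema instances
    f_av = sum(fitness) / len(fitness)   # average fitness of the population
    return len(members) * f_H / f_av

if __name__ == "__main__":
    # Hypothetical population: the first two strings are A and B from the text.
    pop = ["100101", "000111", "111000", "010010"]
    fit = [6.0, 4.0, 1.0, 1.0]           # hypothetical fitness values
    print(order("1**0**"), defining_length("1**0**"))
    # m(H, t) = 2, f(H) = 5, f_av = 3, so the expected count is 10/3.
    print(expected_count("*001*1", pop, fit))
```

With these assumed numbers the schema has above-average fitness, so its expected number of instances rises from 2 to about 3.33.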
This means that a particular schema replicates in the population in proportion to the ratio of the average fitness of the schema to the average fitness of the population. So, schemata whose average fitness is above the population average will have more copies in the following generation, while those with average fitness below the population average will have fewer copies. Suppose now that a given schema remains with average fitness above the population average by an amount $c f_{av}$, with c constant; we could then rewrite equation 13.2 as equation 13.3,

$$m(H, t+1) = m(H, t)\,\frac{f_{av} + c f_{av}}{f_{av}} = (1 + c)\, m(H, t). \tag{13.3}$$
Assuming a stationary value of c, we obtain

$$m(H, t) = m(H, 0)\,(1 + c)^t. \tag{13.4}$$
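As a quick numerical illustration of equation 13.4 (the values c = 0.1 and m(H, 0) = 2 are assumed for this example and do not come from the text), a schema that stays 10% above the population average grows roughly as follows:

```latex
m(H, 10) = 2\,(1 + 0.1)^{10} \approx 2 \times 2.594 \approx 5.19,
\qquad
m(H, 50) = 2\,(1 + 0.1)^{50} \approx 2 \times 117.4 \approx 235.
```

In a finite population this growth cannot continue indefinitely, since the count is bounded by the population size n; the figures only indicate the initial trend.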
The effect is clear: an exponential replication in a population of above-average schemata.

A schema may be disrupted by crossover if the crossover point falls within the defining length spanned by the schema (we reason with single-point crossover to keep it simple). The survival probability of a schema under a crossover operation performed with probability $P_c$ is at least

$$1 - P_c\,\frac{d(H)}{L - 1} \tag{13.5}$$

Combining reproduction and crossover, we can write the following estimation as shown in equation 13.6,

$$m(H, t+1) \ge m(H, t)\,\frac{f(H)}{f_{av}}\left[1 - P_c\,\frac{d(H)}{L - 1}\right] \tag{13.6}$$
We see now that the survival of a schema under reproduction and crossover depends on whether it is above or below the population average and whether it has a short or long defining length. To add the effect of mutation, randomly affecting a single position with probability $P_m$, we must notice that a schema survives if each of its o(H) fixed positions remains unaffected by mutation. Therefore, the probability of surviving mutation is $(1 - P_m)^{o(H)}$, which can be approximated by equation 13.7, $1 - o(H) P_m$ for $P_m$