Computer System Design: System-on-Chip

  • Wiley

English

The next generation of computer system designers will be less concerned with the details of processors and memories, and more concerned with the elements of a system tailored to particular applications. These designers will have a fundamental knowledge of processors and other elements in the system, but the success of their designs will depend on their skill in making system-level trade-offs that optimize cost, performance, and other attributes to meet application requirements. This book provides a new treatment of computer system design, particularly for System-on-Chip (SOC), that addresses these issues. It begins with a global introduction, from the high-level view to the lowest common denominator (the chip itself), then moves on to the three main building blocks of an SOC (processor, memory, and interconnect). Next is an overview of what makes SOC unique (its customization ability and the applications that drive it). The final chapter presents future challenges for system design and SOC possibilities.

Michael J. Flynn, Emeritus Professor of Electrical Engineering at Stanford University, is Chairman of the Board and Senior Advisor to Maxeler Technologies. Previously, he worked at IBM in the areas of computer organization and design. His best-known technical work includes the SIMD/MIMD classification of computer organization, and the first detailed discussion of superscalar design. Professor Flynn is a Fellow of the IEEE and a Fellow of the ACM.

Wayne Luk is Professor of Computer Engineering in the Department of Computing at Imperial College London, where he teaches computer architecture and custom computing. He leads the Computer Systems Section as well as the Custom Computing Research Group, which currently focuses on the theory and practice of reconfigurable systems and their design automation. He has worked with many companies, including Altera, J.P. Morgan, Nokia, Sharp, Sony, and Xilinx. Professor Luk is a Fellow of the IEEE and a Fellow of the BCS.

Preface

List of Abbreviations and Acronyms

1 Introduction to the Systems Approach

1.1 System Architecture: An Overview

1.2 Components of the System: Processors, Memories, and Interconnects

1.3 Hardware and Software: Programmability Versus Performance

1.4 Processor Architectures

1.4.1 Processor: A Functional View

1.4.2 Processor: An Architectural View

1.5 Memory and Addressing

1.5.1 SOC Memory Examples

1.5.2 Addressing: The Architecture of Memory

1.5.3 Memory for SOC Operating System

1.6 System-Level Interconnection

1.6.1 Bus-Based Approach

1.6.2 Network-on-Chip Approach

1.7 An Approach for SOC Design

1.7.1 Requirements and Specifications

1.7.2 Design Iteration

1.8 System Architecture and Complexity

1.9 Product Economics and Implications for SOC

1.9.1 Factors Affecting Product Costs

1.9.2 Modeling Product Economics and Technology Complexity: The Lesson for SOC

1.10 Dealing with Design Complexity

1.10.1 Buying IP

1.10.2 Reconfiguration

1.11 Conclusions

1.12 Problem Set

2 Chip Basics: Time, Area, Power, Reliability, and Configurability

2.1 Introduction

2.1.1 Design Trade-Offs

2.1.2 Requirements and Specifications

2.2 Cycle Time

2.2.1 Defining a Cycle

2.2.2 Optimum Pipeline

2.2.3 Performance

2.3 Die Area and Cost

2.3.1 Processor Area

2.3.2 Processor Subunits

2.4 Ideal and Practical Scaling

2.5 Power

2.6 Area–Time–Power Trade-Offs in Processor Design

2.6.1 Workstation Processor

2.6.2 Embedded Processor

2.7 Reliability

2.7.1 Dealing with Physical Faults

2.7.2 Error Detection and Correction

2.7.3 Dealing with Manufacturing Faults

2.7.4 Memory and Function Scrubbing

2.8 Configurability

2.8.1 Why Reconfigurable Design?

2.8.2 Area Estimate of Reconfigurable Devices

2.9 Conclusion

2.10 Problem Set

3 Processors

3.1 Introduction

3.2 Processor Selection for SOC

3.2.1 Overview

3.2.2 Example: Soft Processors

3.2.3 Examples: Processor Core Selection

3.3 Basic Concepts in Processor Architecture

3.3.1 Instruction Set

3.3.2 Some Instruction Set Conventions

3.3.3 Branches

3.3.4 Interrupts and Exceptions

3.4 Basic Concepts in Processor Microarchitecture

3.5 Basic Elements in Instruction Handling

3.5.1 The Instruction Decoder and Interlocks

3.5.2 Bypassing

3.5.3 Execution Unit

3.6 Buffers: Minimizing Pipeline Delays

3.6.1 Mean Request Rate Buffers

3.6.2 Buffers Designed for a Fixed or Maximum Request Rate

3.7 Branches: Reducing the Cost of Branches

3.7.1 Branch Target Capture: Branch Target Buffers (BTBs)

3.7.2 Branch Prediction

3.8 More Robust Processors: Vector, Very Long Instruction Word (VLIW), and Superscalar

3.9 Vector Processors and Vector Instruction Extensions

3.9.1 Vector Functional Units

3.10 VLIW Processors

3.11 Superscalar Processors

3.11.1 Data Dependencies

3.11.2 Detecting Instruction Concurrency

3.11.3 A Simple Implementation

3.11.4 Preserving State with Out-of-Order Execution

3.12 Processor Evolution and Two Examples

3.12.1 Soft and Firm Processor Designs: The Processor as IP

3.12.2 High-Performance, Custom-Designed Processors

3.13 Conclusions

3.14 Problem Set

4 Memory Design: System-on-Chip and Board-Based Systems

4.1 Introduction

4.2 Overview

4.2.1 SOC External Memory: Flash

4.2.2 SOC Internal Memory: Placement

4.2.3 The Size of Memory

4.3 Scratchpads and Cache Memory

4.4 Basic Notions

4.5 Cache Organization

4.6 Cache Data

4.7 Write Policies

4.8 Strategies for Line Replacement at Miss Time

4.8.1 Fetching a Line

4.8.2 Line Replacement

4.8.3 Cache Environment: Effects of System, Transactions, and Multiprogramming

4.9 Other Types of Cache

4.10 Split I- and D-Caches and the Effect of Code Density

4.11 Multilevel Caches

4.11.1 Limits on Cache Array Size

4.11.2 Evaluating Multilevel Caches

4.11.3 Logical Inclusion

4.12 Virtual-to-Real Translation

4.13 SOC (On-Die) Memory Systems

4.14 Board-based (Off-Die) Memory Systems

4.15 Simple DRAM and the Memory Array

4.15.1 SDRAM and DDR SDRAM

4.15.2 Memory Buffers

4.16 Models of Simple Processor–Memory Interaction

4.16.1 Models of Multiple Simple Processors and Memory

4.16.2 The Strecker-Ravi Model

4.16.3 Interleaved Caches

4.17 Conclusions

4.18 Problem Set

5 Interconnect

5.1 Introduction

5.2 Overview: Interconnect Architectures

5.3 Bus: Basic Architecture

5.3.1 Arbitration and Protocols

5.3.2 Bus Bridge

5.3.3 Physical Bus Structure

5.3.4 Bus Varieties

5.4 SOC Standard Buses

5.4.1 AMBA

5.4.2 CoreConnect

5.4.3 Bus Interface Units: Bus Sockets and Bus Wrappers

5.5 Analytic Bus Models

5.5.1 Contention and Shared Bus

5.5.2 Simple Bus Model: Without Resubmission

5.5.3 Bus Model with Request Resubmission

5.5.4 Using the Bus Model: Computing the Offered Occupancy

5.5.5 Effect of Bus Transactions and Contention Time

5.6 Beyond the Bus: NOC with Switch Interconnects

5.6.1 Static Networks

5.6.2 Dynamic Networks

5.7 Some NOC Switch Examples

5.7.1 A 2-D Grid Example of Direct Networks

5.7.2 Asynchronous Crossbar Interconnect for Synchronous SOC (Dynamic Network)

5.7.3 Blocking versus Nonblocking

5.8 Layered Architecture and Network Interface Unit

5.8.1 NOC Layered Architecture

5.8.2 NOC and NIU Example

5.8.3 Bus versus NOC

5.9 Evaluating Interconnect Networks

5.9.1 Static versus Dynamic Networks

5.9.2 Comparing Networks: Example

5.10 Conclusions

5.11 Problem Set

6 Customization and Configurability

6.1 Introduction

6.2 Estimating Effectiveness of Customization

6.3 SOC Customization: An Overview

6.4 Customizing Instruction Processors

6.4.1 Processor Customization Approaches

6.4.2 Architecture Description

6.4.3 Identifying Custom Instructions Automatically

6.5 Reconfigurable Technologies

6.5.1 Reconfigurable Functional Units (FUs)

6.5.2 Reconfigurable Interconnects

6.5.3 Software Configurable Processors

6.6 Mapping Designs Onto Reconfigurable Devices

6.7 Instance-Specific Design

6.8 Customizable Soft Processor: An Example

6.9 Reconfiguration

6.9.1 Reconfiguration Overhead Analysis

6.9.2 Trade-Off Analysis: Reconfigurable Parallelism

6.10 Conclusions

6.11 Problem Set

7 Application Studies

7.1 Introduction

7.2 SOC Design Approach

7.3 Application Study: AES

7.3.1 AES: Algorithm and Requirements

7.3.2 AES: Design and Evaluation

7.4 Application Study: 3-D Graphics Processors

7.4.1 Analysis: Processing

7.4.2 Analysis: Interconnection

7.4.3 Prototyping

7.5 Application Study: Image Compression

7.5.1 JPEG Compression

7.5.2 Example JPEG System for Digital Still Camera

7.6 Application Study: Video Compression

7.6.1 MPEG and H.26X Video Compression: Requirements

7.6.2 H.264 Acceleration: Designs

7.7 Further Application Studies

7.7.1 MP3 Audio Decoding

7.7.2 Software-Defined Radio with 802.16

7.8 Conclusions

7.9 Problem Set

8 What's Next: Challenges Ahead

8.1 Introduction

8.2 Overview

8.3 Technology

8.4 Powering the ASOC

8.5 The Shape of the ASOC

8.6 Computer Module and Memory

8.7 RF or Light Communications

8.7.1 Lasers

8.7.2 RF

8.7.3 Potential for Laser/RF Communications

8.7.4 Networked ASOC

8.8 Sensing

8.8.1 Visual

8.8.2 Audio

8.9 Motion, Flight, and the Fruit Fly

8.10 Motivation

8.11 Overview

8.12 Pre-Deployment

8.13 Post-Deployment

8.13.1 Situation-Specific Optimization

8.13.2 Autonomous Optimization Control

8.14 Roadmap and Challenges

8.15 Summary

Appendix: Tools for Processor Evaluation

References

Index
