- Wiley
Google BigQuery Analytics
- English
Google BigQuery Analytics is the perfect guide for business and data analysts who want the latest tips on running complex queries and writing code to communicate with the BigQuery API. The book uses real-world examples to demonstrate current best practices and techniques. It also explains streaming ingestion, transformation via Hadoop in Google Compute Engine, AppEngine Datastore integration, and using GViz with Tableau to generate charts of query results. In addition to the mechanics of BigQuery, the book covers the architecture of the underlying Dremel query engine, providing a thorough understanding that leads to better query results.
- Features a companion website that includes all code and data sets from the book
- Uses real-world examples to explain everything analysts need to know to effectively use BigQuery
- Includes web application examples coded in Python
The authors are founding members of the BigQuery team and have helped build and run the service. Jordan Tigani is an active participant in the BigQuery community on Stack Overflow. Siddartha Naidu has extensive experience helping customers integrate with BigQuery.
Part I BigQuery Fundamentals
Chapter 1 The Story of Big Data at Google
Big Data Stack 1.0
Big Data Stack 2.0 (and Beyond)
Open Source Stack
Google Cloud Platform
Cloud Processing
Cloud Storage
Cloud Analytics
Problem Statement
What Is Big Data?
Why Big Data?
Why Do You Need New Ways to Process Big Data?
How Can You Read a Terabyte in a Second?
What about MapReduce?
How Can You Ask Questions of Your Big Data and Quickly Get Answers?
Summary
Chapter 2 BigQuery Fundamentals
What Is BigQuery?
SQL Queries over Big Data
Cloud Storage System
Distributed Cloud Computing
Analytics as a Service (AaaS?)
What BigQuery Isn’t
BigQuery Technology Stack
Google Cloud Platform
BigQuery Service History
BigQuery Sensors Application
Sensor Client Android App
BigQuery Sensors AppEngine App
Running Ad-Hoc Queries
Summary
Chapter 3 Getting Started with BigQuery
Creating a Project
Google APIs Console
Free Tier Limitations and Billing
Running Your First Query
Loading Data
Using the Command-Line Client
Install and Setup
Using the Client
Service Account Access
Setting Up Google Cloud Storage
Development Environment
Python Libraries
Java Libraries
Additional Tools
Summary
Chapter 4 Understanding the BigQuery Object Model
Projects
Project Names
Project Billing
Project Access Control
Projects and AppEngine
BigQuery Data
Naming in BigQuery
Schemas
Tables
Datasets
Jobs
Job Components
BigQuery Billing and Quotas
Storage Costs
Processing Costs
Query RPCs
TableData.insertAll() RPCs
Data Model for End-to-End Application
Project
Datasets
Tables
Summary
Part II Basic BigQuery
Chapter 5 Talking to the BigQuery API
Introduction to Google APIs
Authenticating API Access
RESTful Web Services for the SOAP-Less Masses
Discovering Google APIs
Common Operations
BigQuery REST Collections
Projects
Datasets
Tables
TableData
Jobs
BigQuery API Tour
Error Handling in BigQuery
Summary
Chapter 6 Loading Data
Bulk Loads
Moving Bytes
Destination Table
Data Formats
Errors
Limits and Quotas
Streaming Inserts
Summary
Chapter 7 Running Queries
BigQuery Query API
Query API Methods
Query API Features
Query Billing and Quotas
BigQuery Query Language
BigQuery SQL in Five Queries
Differences from Standard SQL
Summary
Chapter 8 Putting It Together
A Quick Tour
Mobile Client
Monitoring Service
Log Collection Service
Log Trampoline
Dashboard
Data Caching
Data Transformation
Web Client
Summary
Part III Advanced BigQuery
Chapter 9 Understanding Query Execution
Background
Storage Architecture
Colossus File System (CFS)
ColumnIO
Durability and Availability
Query Processing
Dremel Serving Trees
Architecture Comparisons
Relational Databases
MapReduce
Summary
Chapter 10 Advanced Queries
Advanced SQL
Subqueries
Combining Tables: Implicit UNION and JOIN
Analytic and Windowing Functions
BigQuery SQL Extensions
The EACH Keyword
Data Sampling
Repeated Fields
Query Errors
Result Too Large
Resources Exceeded
Recipes
Pivot
Cohort Analysis
Parallel Lists
Exact Count Distinct
Trailing Averages
Finding Concurrency
Summary
Chapter 11 Managing Data Stored in BigQuery
Query Caching
Result Caching
Table Snapshots
AppEngine Datastore Integration
Simple Kind
Mixing Types
Final Thoughts
Metatables and Table Sharding
Time Travel
Selecting Tables
Summary
Part IV BigQuery Applications
Chapter 12 External Data Processing
Getting Data Out of BigQuery
Extract Jobs
TableData.list()
AppEngine MapReduce
Sequential Solution
Basic AppEngine MapReduce
BigQuery Integration
Using BigQuery with Hadoop
Querying BigQuery from a Spreadsheet
BigQuery Queries in Google Spreadsheets (Apps Script)
BigQuery Queries in Microsoft Excel
Summary
Chapter 13 Using BigQuery from Third-Party Tools
BigQuery Adapters
Simba ODBC Connector
JDBC Connection Options
Client-Side Encryption with Encrypted BigQuery
Scientific Data Processing Tools in BigQuery
BigQuery from R
Python Pandas and BigQuery
Visualizing Data in BigQuery
Visualizing Your BigQuery Data with Tableau
Visualizing Your BigQuery Data with BIME
Other Data Visualization Options
Summary
Chapter 14 Querying Google Data Sources
Google Analytics
Setting Up BigQuery Access
Table Schema
Querying the Tables
Google AdSense
Table Structure
Leveraging BigQuery
Google Cloud Storage
Summary
Index