Google Cloud Platform

Google Cloud Platform (GCP) is a portfolio of cloud computing services and solutions, originally built around Google App Engine, a framework launched in 2008 for hosting web applications from Google’s data centers. GCP is now widely regarded as one of the top three cloud computing platforms, although it still trails Amazon Web Services (AWS) and Microsoft Azure in market share. GCP’s pricing models also differ considerably from those of AWS and Azure.

Following the introduction of Google App Engine, Google released a variety of complementary tools, including a data storage layer and Google Compute Engine, an Infrastructure as a Service (IaaS) offering that supports the use of virtual machines. Once it had established itself as an IaaS provider, Google added further products, including:

a load balancer,
DNS, monitoring tools, and
data analysis services

This brought GCP closer to functional parity with AWS and Azure, making it much more competitive in the cloud market.

Even though it has drawn closer to the functionality offered by AWS, GCP is no ‘cookie-cutter’ version of AWS. It seeks to differentiate itself through a hybrid cloud and multi-cloud strategy.

Google Cloud

Specific Tools & Applications

We will now look at specific services, tools, and applications by functional area. The insights and details provided are often rather different from, and more specific than, the ‘architectural framework’ perspective delivered at the GCP-1 page.

Options groups assigned to each of these solutions scenarios may be found next;

The graphic below summarizes Google Cloud Platform’s services, tools, and applications by functional area. The following technical/functional areas are referenced:

Compute
Management
Networking
Storage & Databases
Big Data
Identity and Security &
Machine Learning

**GCP Services, Tools, & Applications by functional area**

The table below lists every service, tool, and application by functional area, matching the content in the graphic immediately above. Clicking any of the links associated with the headers and content items below takes the reader to a detailed review of that area:

Compute

Compute Engine
Kubernetes Engine
App Engine
Cloud Functions

Management

Cloud Console
Stackdriver
Trace
Logging
Debugging
Monitoring

Networking

Cloud Load Balancing
Cloud CDN
Cloud DNS
Cloud Firewall Rules
Cloud Interconnect
Cloud VPN

Storage & Database

Cloud Bigtable
Cloud Datastore
Cloud Spanner
Cloud SQL
Cloud Storage

Big Data

BigQuery
Cloud Dataflow
Cloud Dataprep
Cloud Dataproc
Cloud IoT Core
Cloud Pub/Sub

Identity & Security

Cloud IAM
Cloud Endpoints
VPC
Identity Aware Proxy
KMS
Data Loss Prevention

Machine Learning

Cloud ML
Natural Language API
Cloud Speech API
Cloud Vision API
Cloud Translate API

Google Cloud Compute Services

Google Cloud Compute Services consists of four components;

Cloud Functions
App Engine
Kubernetes Engine
Compute Engine

Each of these abstracts a different part of the solutions architecture, as follows;

Cloud Functions abstracts the application layer, and provides a control surface for service invocations

App Engine abstracts the infrastructure, and provides a control surface at the application layer

Kubernetes Engine abstracts the virtual machines (VMs), and provides a control surface for managing Kubernetes clusters and their hosted containers

Compute Engine abstracts the underlying hardware and provides a control surface for infrastructure components

Before we expand on each of the four families of architectural frameworks, and the twenty one options they include, we want to provide a preview of the graphics we’ll be using, next;

We’ll now perform a detailed review of these four Architectural Frameworks, the twenty-one options they include, and the one-hundred-fourteen specific solutions they cover, next;

Compute Engine

GCP’s Compute Engine is Google’s Infrastructure-as-a-Service (IaaS) offering. It facilitates:

the creation of virtual machines (VMs)
the allocation and assignment of CPU and memory

By default, each Compute Engine instance has a single boot persistent disk (PD) that contains the operating system. Persistent disks (PDs) are durable network storage devices that instances can access like physical disks in a desktop or a server. The data on each persistent disk is distributed across several physical disks. Google Compute Engine (GCE) manages the physical disks and the data distribution automatically to ensure redundancy and optimal performance.

Applications on GCE commonly require additional storage space. Administrators can add one or more additional storage solutions to their instance. The eight available options include;

Zonal standard PD
Regional standard PD
Zonal balanced PD
Regional balanced PD
Zonal SSD PD
Regional SSD PD
Local SSDs
Cloud Storage bucket

The first seven options listed can satisfy block storage requirements. The final option, a Cloud Storage bucket, provides object storage only, not block storage.

Compute Engine supports effectively all of the architectural solutions referenced at GCP-1, so they are not listed again here.

Kubernetes Engine

GCP’s Kubernetes Engine (GKE) provides a sophisticated managed environment for deploying, controlling, and scaling containerized applications using GCP’s infrastructure. The GKE environment is constructed of multiple machines, always Compute Engine virtual machine instances, that are grouped together to form a cluster.

GKE clusters are powered by the Kubernetes open source cluster management system, which was originally developed by Google. Kubernetes is now under the control of the Cloud Native Computing Foundation, a project of the Linux Foundation. Kubernetes can also run workloads on Microsoft Windows Server: Kubernetes is installed on a Windows Server worker node, which is linked to a Linux control plane node.

Kubernetes is a container orchestrator. To run, a Kubernetes cluster needs a container runtime. Kubernetes is commonly used with the Docker container runtime, but it can be used with any compatible container runtime, including but not limited to runc, CRI-O, and containerd.

Google makes heavy use of Kubernetes and related design principles, to deliver commonly accessed Google services. Kubernetes benefits include the following;

automatic management,
monitoring and availability probes for application containers,
automatic scaling,
rolling updates,
etc.

Execution of a GKE cluster on the Google Cloud Compute Engine delivers a host of advanced cluster management features, including:

Compute Engine load-balancing
Availability of Node pools
Automatic scaling
Automatic upgrades to node cluster software
Cluster node auto-repair
Logging and monitoring via the Google Cloud’s operations suite

The table below reviews the specific use cases and deployment configurations referenced in the Google Cloud ‘Architectural Framework & Solution Scenarios’ discussed on this website at GCP-1:

Architectural Framework

Architectural Use Case References (GCP-1 webpage)

Security & Compliance Options / Binary K8S Auth: Ensures only trusted container images are deployed on Google Kubernetes Engine.

AppDev / Microservices with GKE: Containerized microservices. Auto-scaling, auto-upgrade, and auto-repair, via Google SREs.

App Engine

App Engine is a fully managed, completely serverless platform for developing and hosting web applications delivered at scale. Multiple languages, libraries, and frameworks are available to develop desired applications. App Engine automatically handles provisioning servers and scaling assigned app instances based on user/system demand.

In line with this automatic provisioning and scaling, App Engine abstracts the infrastructure and provides the control surface at the application layer.

Languages/frameworks/runtimes supported include;

Go
PHP
Java
Python
Node.js
.NET
Ruby

Architectural Framework

Architectural Use Case References (GCP-1 webpage)

AppDev /

Microservices with App Engine

AppDev /

Mobile Site Hosting

*App Engine Standard(PaaS). Python, Java, Go, NodeJS, PHP runtimes

*Firebase linked to App Engine as a backend app.

*Firebase syncs across iOS, Android, Web. Processes data via App Engine.

Cloud Functions

GCP Cloud Functions delivers a lightweight compute solution for developers to create single-purpose, stand-alone functions that respond to Cloud events without any requirements to manage a server or runtime environment.

Cloud Functions can be written in Node.js, Python, Go, and Java, and are executed in language-specific runtimes.

The specific runtimes supported include;

Node.js 8 Runtime
Node.js 10 Runtime
Python Runtime
Go Runtime
Java Runtime

There are two distinct types of Cloud Functions: HTTP functions and background functions.

HTTP functions are invoked by standard HTTP requests. The caller waits for the response, and the function can handle common HTTP request methods such as GET, PUT, POST, DELETE, and OPTIONS. Deploying a Cloud Function triggers the automatic provisioning of a TLS certificate, so all HTTP functions can be invoked over a secure connection.

The other type, background functions, handles events from the Cloud infrastructure, such as messages on a Pub/Sub topic or changes in a Cloud Storage bucket.
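The two function types can be sketched as follows. This is a minimal illustration that assumes the Python runtime, where an HTTP function receives a Flask-style request object and a background function receives an `(event, context)` pair with a base64-encoded Pub/Sub payload in `event["data"]`. The function names `hello_http` and `handle_pubsub`, and the stand-in request object, are hypothetical and exist only so the sketch runs locally.

```python
import base64
from types import SimpleNamespace


def hello_http(request):
    """HTTP function: takes a Flask-style request, returns (body, status)."""
    if request.method == "OPTIONS":
        # Answer preflight requests with an empty body.
        return ("", 204)
    if request.method not in ("GET", "POST"):
        return ("Method not allowed", 405)
    name = request.args.get("name", "world")
    return (f"Hello, {name}!", 200)


def handle_pubsub(event, context):
    """Background function: decodes the payload of a Pub/Sub message.

    The string is returned only so the sketch can be checked locally;
    a deployed background function would act on the payload instead.
    """
    payload = base64.b64decode(event["data"]).decode("utf-8")
    return f"Received: {payload}"


# Local stand-ins for the objects the platform would pass in.
fake_request = SimpleNamespace(method="GET", args={"name": "GCP"})
print(hello_http(fake_request))

fake_event = {"data": base64.b64encode(b"bucket changed").decode("ascii")}
print(handle_pubsub(fake_event, context=None))
```

The HTTP function answers its caller directly, while the background function only reacts to the event it is handed.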

Architectural Framework

Architectural Use Case References (GCP-1 webpage)

AppDev &

Serverless /

*Serverless Web Scraping w/ Cloud Functions. Event-driven web scraping w/ Cloud Functions, Firestore & Scheduler. Built-in support for Headless Chrome, providing sophisticated UI testing & web scraping.

*Serverless, scalable, event-driven web scraping w/ Cloud Functions, Firestore & Scheduler

Cloud Console

GCP’s Cloud Console is a sophisticated web administration user-interface.
It provides the ability to access, view, and manage all facets of GCP’s cloud applications— including but not limited to;

web applications
data analysis
virtual machines
databases
datastore
networking &
developer services.

GCP’s Cloud Console allows an administrator to deploy, scale, and diagnose production issues via the web-based interface.

An administrator may search to quickly find resources and connect to instances via SSH in the browser.

DevOps workflows can be administered when away from the office with built-in native iOS and Android applications.

Development tasks are accomplished via Cloud Shell.

Cloud Console Options

Architectural Use Case References (GCP-1 webpage)

Cloud Console is universally used across GCP applications

Stackdriver is now Operations

GCP Stackdriver has been fully merged into the Cloud Console and replaced by the Operations suite of products, which includes:

Cloud Logging
Cloud Monitoring
Cloud Trace
Cloud Debugger
Cloud Profiler

Stackdriver is now GCP Operations

Retail & eCommerce / PCI = Payment Card Industry

*Cloud Monitoring w/ StackDriver, Big Query & Cloud Logging.

Trace

GCP Trace, now called Cloud Trace, is now part of GCP’s Operations Suite.

Cloud Trace is a distributed tracing system that collects latency data from cloud applications and transfers it for display in the Google Cloud Console.

Using Cloud Trace, it is possible to review how requests propagate through an application, providing detailed quasi real-time performance insights.

Cloud Trace systematically reviews all of an application’s traces to generate in-depth latency reports.

Cloud Trace provides the capability to capture traces from all of a project’s VMs, containers, or App Engine projects.

Cloud Trace provides a variety of tools and filters so that you can quickly find the source and root cause of bottlenecks. It also continuously retrieves and analyzes trace data from a project to highlight recent changes in performance, presented as latency distributions. Latency distributions, obtained via Analysis Reports, can be compared over time or between versions. Significant changes in an application’s latency profile are automatically signaled via alerts.
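The idea of comparing latency distributions between two versions can be sketched in a few lines. This is only an illustration of the concept, not Cloud Trace’s own analysis: the function names, the chosen percentiles, and the 1.2 slowdown threshold are all assumptions made for the example.

```python
from statistics import quantiles


def latency_percentiles(samples_ms):
    """Return the 50th, 95th, and 99th percentiles of a list of latencies."""
    # quantiles(n=100) yields 99 cut points; index k-1 is the k-th percentile.
    cuts = quantiles(samples_ms, n=100, method="inclusive")
    return {"p50": cuts[49], "p95": cuts[94], "p99": cuts[98]}


def regressed_percentiles(baseline_ms, candidate_ms, threshold=1.2):
    """List the percentiles where the candidate version is more than
    `threshold` times slower than the baseline version."""
    base = latency_percentiles(baseline_ms)
    cand = latency_percentiles(candidate_ms)
    return [name for name in base if cand[name] > base[name] * threshold]


baseline = list(range(1, 101))           # hypothetical latencies in ms
candidate = [2 * ms for ms in baseline]  # a version that is twice as slow
print(regressed_percentiles(baseline, candidate))
```

Comparing percentiles rather than averages keeps a slow tail visible even when typical requests are unaffected.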

Cloud Trace provides language-specific SDKs, and is currently available for;

Java
Node.js
Ruby
Go

Cloud Trace can analyze projects running on VMs, regardless of whether those VMs are managed by GCP. The Cloud Trace API can be used to submit and retrieve trace data from any source. All projects running on App Engine are captured automatically.

Operations Cloud Trace

Cloud Trace is part of GCP Operations Suite

Use across GCP applications written in;

*Java
Node.js
Ruby
Go

Cloud Logging

GCP Logging, now called Cloud Logging, is now part of GCP Operations Suite.

Cloud Logging allows an administrator to store, search, analyze, monitor, and alert on log data and events from Google Cloud. The API also allows ingestion of any custom log data from any source. Cloud Logging is a fully managed service that delivers at scale and can ingest application and system log data from any number of VMs.

Cloud Logging works seamlessly with;

Cloud Monitoring
Cloud Trace
Error Reporting, and
Cloud Debugger

This linkage allows an administrator to navigate between incidents, charts, traces, errors, and logs. This facilitates determining the root cause of problems in the system/applications.

Cloud Logging is designed with cluster deployment and management built in. As a fully managed solution, it allows the administrator to focus on project construction, and not the details of administration.

Cloud Logging maintains data in one location, regardless of whether your solution is multi-cloud or your projects require migration to another cloud.

Advanced log analysis, including the development of real-time metrics, is delivered via BigQuery. Generated metrics are transferable to Cloud Monitoring, where they can be used to create dashboards.

Audit logs, which capture all admin and data access events within Google Cloud, are retained for 400 days without additional charges. Logs can be stored for longer than 400 days by exporting them to Cloud Storage.

Cloud Pub/Sub is used for integration with external systems, and the export of logs from GCP to them.

Operations Cloud Logging

Cloud Logging is part of GCP Operations Suite

Cloud Logging is used across GCP

Cloud Debugger

GCP Cloud Debugger is a built-in feature of Google Cloud that allows administrators to inspect the state of a running application in real time, without stopping or slowing it down. Cloud Debugger allows the capture of the call stack and variables at any location in the source code, without impacting response time for users.

Cloud Debugger can be used with production applications. A snapshot captures the call stack and variables at a specific code location the first time any instance executes that code. A logpoint injects a log statement at a chosen code location; it functions as if it were part of the deployed code, sending its log messages to the same log stream.

Cloud Debugger works with version control systems, such as;

Cloud Source Repositories
GitHub
Bitbucket, or
GitLab.

Cloud Debugger knows how to display the correct version of the source code when any of these version control systems is used. When other source repositories are used, the source files can be uploaded as part of the build-and-deploy process.

Cloud Debugger allows seamless collaboration with other members of your team, by sharing the debug session via the Console URL.
Cloud Debugger is integrated into existing development workflows. Debugging snapshots can be taken directly from;

Cloud Logging
error reporting
dashboards
integrated development environments (IDEs), and the
gcloud command-line interface.

Cloud Debugger is automatically enabled for all App Engine applications. It must be manually enabled for Google Kubernetes Engine (GKE) or Compute Engine.

Operations Cloud Debugger

Cloud Debugger is part of GCP Operations Suite

Cloud Monitoring

GCP Cloud Monitoring provides insights into the performance, uptime, and general health of all cloud-associated applications. It collects

metrics
events
metadata

from;

Google Cloud
hosted uptime probes
application instrumentation

and other common application components including

Cassandra
Nginx
Apache Web Server &
Elasticsearch

Cloud Monitoring processes the related data and generates insights via;

dashboards
charts
alerts.

Cloud Monitoring alerting assists with oversight by integrating with, and sending notifications through:

SMS
Slack
PagerDuty

Cloud Monitoring provides default dashboards for most Google Cloud services out of the box and incorporates not only metrics but also critical metadata, which provides insights into the relationships between components. Cloud Monitoring also supports monitoring for non-GCP environments through partnerships, agents, and APIs. Service Level Objectives (SLOs) may be defined for applications, with SLO violations triggering alerts.

Lastly, Cloud Monitoring can provide information on the availability and uptime of internet-accessible:

URLs
VMs
APIs, and
load-balancers

Operations Cloud Monitoring

Cloud Monitoring is part of GCP Operations Suite

Cloud Load Balancing

GCP Cloud Load Balancing allows applications on Compute Engine to scale from cold to active/hot almost instantaneously. It supports the management of resources in one region, or multiple regions, to maximize availability and reduce latency. Cloud Load Balancing also supports the use of a single anycast IP address, which front-ends all backend instances worldwide, as well as intelligent, preconfigured autoscaling.

It supports cross-region load balancing, including automatic multi-region failover, which shifts traffic to other backends, in precisely measured amounts, if any backend signals performance issues.
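The general idea of failover with overflow can be sketched as follows. This is a simplified illustration, not Google’s actual traffic-distribution algorithm: the region names, capacities, and the nearest-first ordering are assumptions made for the example.

```python
def route_traffic(demand_rps, regions):
    """Spread demand across regions in preference order.

    `regions` is a list of (name, capacity_rps, healthy) tuples, ordered
    from the region closest to the users to the farthest.
    Returns (assignment, unserved_rps).
    """
    assignment = {}
    remaining = demand_rps
    for name, capacity_rps, healthy in regions:
        if remaining <= 0:
            break
        if not healthy:
            continue  # failover: skip backends that report problems
        share = min(remaining, capacity_rps)  # overflow spills to the next region
        assignment[name] = share
        remaining -= share
    return assignment, remaining


regions = [("us-east1", 100, True), ("europe-west1", 80, True)]
print(route_traffic(150, regions))

# The same demand when the nearest region reports problems.
degraded = [("us-east1", 100, False), ("europe-west1", 80, True)]
print(route_traffic(150, degraded))
```

Only the demand that exceeds a region’s capacity, or that cannot be served by an unhealthy region, moves to the next region, which is what keeps the shift measured rather than all-or-nothing.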

Unlike DNS-based global load balancing solutions, Cloud Load Balancing reacts quickly and automatically to changes in users, traffic, network, backend performance, and related conditions.

Cloud Load Balancing is a fully distributed, software-defined, managed service for all traffic worldwide. It is not an instance-based, hardware-based, or device-based solution, so there is no physical load balancing infrastructure to manage, and none of the high availability and scaling challenges that come with one.

Cloud Load Balancing can be applied to all traffic, including:

HTTP(S)
TCP/SSL &
UDP

In addition to these standard protocols, Cloud Load Balancing provides support for the latest application delivery protocols, including:

HTTP/2 with gRPC &
QUIC support for GCP’s HTTPS load balancers

Cloud Load Balancing also supports the construction of internal load balancing solutions for internal client instances, without any exposure to the internet. This is accomplished via Andromeda, which is GCP’s software-defined network virtualization platform. Internal load balancing also provides support for clients across VPNs.

Central management of SSL certificates and decryption is delivered via SSL offload. Encryption can also be enabled between the load balancing layer and the backend.

Cloud Load Balancing Reference

Use Case / Deployment Configuration

Networking /

Latency optimized Travel Sample Architecture

DevOps/ Jenkins on k8s

Serve users from closest region to location, via Google’s Global Cloud Load Balancing

*Jenkins Namespace, Container Registry, Google Load Balancer

Cloud CDN

GCP Cloud CDN (Content Delivery Network) uses GCP’s globally distributed edge points of presence to cache external HTTP(S) load balanced content closer to your recipients. Caching content at the edges of GCP’s network provides faster delivery of content to users while reducing delivery costs.

GCP’s anycast architecture assigns a single global IP address to any given site, facilitating consistent performance worldwide with simpler management. In addition, GCP’s edge caches are peered with nearly every major end-user ISP globally, providing enhanced connectivity to users around the globe.

Consistent with its links to Cloud Load Balancing, Cloud CDN supports enhanced & recently introduced protocols, such as;

HTTP/2 &
QUIC

These two new protocols are targeted at delivering improved site performance for mobile users and/or users in emerging markets.

Cloud CDN is tightly integrated with Cloud Monitoring and Cloud Logging, delivering detailed latency metrics without customization, along with baseline HTTP request logs for deeper insight. These logs can be exported into Cloud Storage and/or BigQuery for additional analysis with minimal effort.

Cloud CDN Reference

Use Case / Deployment Configuration

Cloud CDN

*Cloud CDN is commonly used with Cloud Load Balancing

Cloud DNS

GCP Cloud DNS is a high-performance, resilient, global Domain Name System (DNS) service that publishes domain names to the global DNS database in an efficient manner.

DNS is a hierarchical distributed database that facilitates the storing of IP addresses and other data, and their retrieval by name. Cloud DNS facilitates the publication of zones and records in the DNS, without the burden of managing your own DNS servers and software.

Cloud DNS offers both public zones and private managed DNS zones. A public zone is visible to the public internet, while a private zone is visible only from one or more local VPC networks specified.

Cloud DNS Reference

Use Case / Deployment Configuration

Cloud Load Balancing

*Cloud CDN is commonly used with Cloud Load Balancing

Firewall Rules

GCP Firewall Rules are defined at a network level, and only apply to the network where created. Any name chosen for them must be unique to the GCP project. Firewall rules can be applied across an organization, or a specific Virtual Private Cloud (VPC). Our discussion here will focus on application to the VPC.

VPC firewall rules determine whether to allow or deny connections to or from virtual machine (VM) instances, based on the configuration specification. Enabled VPC firewall rules are always enforced, protecting instances regardless of their configuration/operating system, whether or not they have started up.

Every VPC network functions effectively as a distributed firewall. While firewall rules are always defined at the network level, connections are allowed or denied on an instance-by-instance basis. VPC firewall rules govern connections not just between specific instances and other networks, but also between individual instances on the same network.

A VPC firewall rule specifies an individual VPC network and a set of components that define what that rule does. The set of components target certain types of traffic, based on the traffic’s:

protocol
ports
sources &
destinations

You create or modify VPC firewall rules by using:

Cloud Console
gcloud command-line tool &
REST API

The target component of a firewall rule defines the instances to which it is intended to apply. In addition to created firewall rules, GCP provides other rules that can affect incoming (ingress) or outgoing (egress) connections (not discussed further here). Each firewall rule applies to either incoming (ingress) or outgoing (egress) connections, never both.

Firewall rules only support IPv4 connections, and not IPv6 connections. When defining a source for an ingress rule or a destination for an egress rule by address, you can only use an IPv4 address or IPv4 block in CIDR notation.

Each firewall rule’s action is either allow or deny. The rule applies to connections during any time period it is enforced. It is possible to disable a rule for troubleshooting purposes.
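The components described above (direction, action, protocol, ports, and an IPv4 source or destination range) can be sketched as a small data structure. This is an illustrative model, not the GCP implementation: the class and function names are hypothetical, rules are evaluated in list order, and the fallback `default` action is an assumption made for the example.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network


@dataclass(frozen=True)
class FirewallRule:
    name: str
    direction: str      # "ingress" or "egress", never both
    action: str         # "allow" or "deny"
    protocol: str       # for example "tcp" or "udp"
    ports: frozenset    # an empty set means all ports
    cidr: str           # source range for ingress, destination range for egress
    enabled: bool = True  # a rule can be disabled for troubleshooting

    def matches(self, direction, protocol, port, peer_ip):
        return (
            self.enabled
            and self.direction == direction
            and self.protocol == protocol
            and (not self.ports or port in self.ports)
            and ip_address(peer_ip) in ip_network(self.cidr)
        )


def evaluate(rules, direction, protocol, port, peer_ip, default="deny"):
    """Return the action of the first matching rule, else the default."""
    for rule in rules:
        if rule.matches(direction, protocol, port, peer_ip):
            return rule.action
    return default


rules = [FirewallRule("allow-ssh", "ingress", "allow", "tcp", frozenset({22}), "10.0.0.0/8")]
print(evaluate(rules, "ingress", "tcp", 22, "10.1.2.3"))
print(evaluate(rules, "ingress", "tcp", 22, "192.168.1.1"))
```

Because each rule carries exactly one direction, allowing both ingress and egress for the same traffic takes two rules.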

Once created, a firewall rule must be assigned to a VPC network. While the rule is enforced at the instance level, its configuration is always associated with a VPC network, and specific to that VPC network. Thus, firewall rules cannot be shared among VPC networks, including networks connected by VPC Network Peering or by using Cloud VPN tunnels.

GCP VPC firewall rules are stateful. A stateful firewall is a firewall that monitors the full state of active network connections. Thus, stateful firewalls constantly analyze the complete context of the traffic and data packets that are seeking entry to a network, rather than discrete traffic and data packets in isolation.

Once a certain kind of traffic has been approved by a stateful firewall, it is added to a state table and can travel more freely into the protected network. Traffic and data packets that don’t successfully complete this required handshake will be blocked. By analyzing multiple factors before adding a type of connection to an approved list, such as TCP stages, stateful firewalls are able to observe traffic streams in their entirety. They are effectively more ‘intelligent’ than stateless firewalls, but are also more susceptible to DDoS attacks than stateless firewalls.
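The state table idea can be sketched as follows. This is a deliberately simplified model of connection tracking, not GCP’s implementation: the class name is hypothetical, connections are identified by a plain four-part tuple, and the outbound check models only replies to tracked connections.

```python
class StatefulFirewall:
    """Tracks approved inbound connections so that replies are allowed."""

    def __init__(self, allowed_ingress_ports):
        self.allowed_ingress_ports = set(allowed_ingress_ports)
        self.state_table = set()  # approved (src, src_port, dst, dst_port) tuples

    def inbound(self, src, src_port, dst, dst_port):
        key = (src, src_port, dst, dst_port)
        if key in self.state_table:
            return True  # already approved: no need to re-evaluate the rules
        if dst_port in self.allowed_ingress_ports:
            self.state_table.add(key)  # remember the approved connection
            return True
        return False

    def outbound_reply(self, src, src_port, dst, dst_port):
        # A reply is allowed only if it reverses a tracked inbound connection.
        return (dst, dst_port, src, src_port) in self.state_table


fw = StatefulFirewall(allowed_ingress_ports={443})
print(fw.inbound("203.0.113.5", 50000, "10.0.0.2", 443))
print(fw.outbound_reply("10.0.0.2", 443, "203.0.113.5", 50000))
print(fw.outbound_reply("10.0.0.2", 443, "198.51.100.9", 40000))
```

The state table is also what an attacker targets in a flood: every tracked connection consumes memory, which a stateless packet filter does not need.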

Firewall Rules Reference

Use Case / Deployment Configuration

Universal application

*Firewalls are an integrated and universal component of security for GCP

Cloud Interconnect

GCP Cloud Interconnect provides low latency, highly available connections that enable the reliable transfer of data between an on-premises facility and GCP Virtual Private Cloud (VPC) networks. In addition, Cloud Interconnect connections provide internal IP address linkage, which ensures that internal IP addresses are directly accessible from both networks. It is not necessary to use a NAT device or VPN tunnel to reach internal IP addresses.

Cloud Interconnect offers two options for extending your on-premises network.

Dedicated Interconnect &
Partner Interconnect

Dedicated Interconnect provides a direct physical connection between an on-premises network and GCP’s VPC networks.

Partner Interconnect provides connectivity between an on-premises network and GCP VPC networks through a supported service provider.

For both Dedicated Interconnect and Partner Interconnect, traffic between the on-premises network and the GCP VPC network never traverses the public internet. Only direct dedicated connections are used. One advantage of bypassing the public internet is that traffic takes fewer hops, leaving fewer points of failure where the traffic might get dropped or disrupted.

Dedicated Interconnect, Partner Interconnect, and two other GCP options, Direct Peering and Carrier Peering, can optimize egress (exit) traffic from a GCP VPC network. In addition, each of these can reduce egress costs. Cloud VPN by itself does not reduce egress costs.

Cloud Interconnect along with Private Google Access for on-premises hosts enables on-premises hosts to use internal IP addresses rather than external IP addresses to reach Google APIs and services.

Cloud Interconnect provides very low latency and high availability, but with higher overhead and cost. A lower cost alternative is to use Cloud VPN to set up IPsec VPN tunnels between an on-premises network and a GCP VPC network. IPsec VPN tunnels use the public internet, but encrypt the traversing data by using industry-standard IPsec protocols.

Cloud Interconnect Reference

Use Case / Deployment Configuration

Universal application

Cloud VPN

GCP Cloud VPN securely connects an on-premises peer network to a Virtual Private Cloud (VPC) network through an IPsec VPN connection. Traffic traveling between the two networks is encrypted by one VPN gateway, and then decrypted by the other VPN gateway. This protects the data as it travels over the public internet. Another option is to connect two instances of Cloud VPN to each other.

GCP offers two types of Cloud VPN gateways:

Classic VPN &
HA VPN

Classic VPN gateways have:

a single interface
a single external IP address, &
support for tunnels using dynamic (BGP) or static routing (route-based or policy-based)

Classic VPN gateways provide an SLA of 99.9% service availability.

When referenced in GCP API documentation or in gcloud commands, Classic VPNs are referred to as target VPN gateways.

Within GCP, major functionality for Classic VPN is being deprecated on October 31, 2021.

The more recent offering from GCP in this area is HA VPN.

HA VPN is a high availability (HA) Cloud VPN solution that securely connects, in a single region, an on-premises network to a Virtual Private Cloud (VPC) network through an IPsec VPN connection. HA VPN provides an SLA of 99.99% service availability, higher than the 99.9% offered by Classic VPN.

Each HA VPN gateway interface supports multiple tunnels. The administrator can also create multiple HA VPN gateways. It is possible to configure an HA VPN gateway with only one active interface and one public IP address, but this configuration does not provide a 99.99% service availability SLA.
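To put an availability SLA in concrete terms, the percentage can be converted into the downtime it permits. The sketch below assumes a 30-day month for simplicity; the function name is hypothetical, and the 99.9% figure is included only as a comparison point one decimal place lower.

```python
def allowed_downtime_minutes(sla_percent, days=30):
    """Minutes of downtime permitted by an availability SLA over `days` days."""
    total_minutes = days * 24 * 60
    return total_minutes * (100 - sla_percent) / 100


for sla in (99.9, 99.99):
    minutes = allowed_downtime_minutes(sla)
    print(f"{sla}% availability allows about {minutes:.1f} minutes of downtime in 30 days")
```

Each additional nine cuts the permitted downtime by a factor of ten, from roughly 43 minutes to roughly 4 minutes per 30 days.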

In the GCP API documentation and in gcloud commands, HA VPN gateways are always referred to as VPN gateways rather than target VPN gateways (a term reserved for Classic VPN). Forwarding rules do not need to be created manually for HA VPN gateways, but they do for Classic VPN gateways.

Cloud VPN Reference

Use Case / Deployment Configuration

Universal application

Cloud Bigtable

GCP Cloud Bigtable is a fully managed, scalable NoSQL database service designed for large analytical and operational workloads.

The most appropriate data use cases for Cloud Bigtable are:

Time-series data, such as CPU and memory usage over time for multiple servers.
Marketing data, such as purchase histories and customer preferences.
Financial data, such as transaction histories, stock prices, and currency exchange rates.
Internet of Things data, such as usage reports from energy meters and home appliances.
Graph data, such as information about how users are connected to one another.

Cloud Bigtable stores data in massively scalable but sparsely populated tables. Each table is a sorted key/value map. These tables can scale to billions of rows and thousands of columns, enabling the storing of terabytes or even petabytes of data. A single value in each row is indexed – this value is known as the row key. Cloud Bigtable is ideal for storing very large amounts of single-keyed data with very low latency. It supports high read and write throughput at low latency, and it is an ideal data source for:

MapReduce operations
stream processing/analytics
machine learning applications

Each row/column intersection can contain multiple cells, or versions, at different timestamps, providing a record of how the stored data has been altered over time. Cloud Bigtable tables are sparse – if a cell does not contain any data, it does not take up any space.
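The data model described above (a sorted key/value map, a single indexed row key, sparse columns, and timestamped cell versions) can be sketched in memory. This is a toy model for illustration only, not the Cloud Bigtable client API: the class name, method names, and the `server#metric` row key format are assumptions.

```python
from bisect import insort


class SparseTable:
    """Toy model: rows sorted by row key, with versioned, sparse cells."""

    def __init__(self):
        self._rows = {}         # row key -> {column -> [(timestamp, value), ...]}
        self._sorted_keys = []  # row keys kept in sorted order

    def write(self, row_key, column, value, timestamp):
        if row_key not in self._rows:
            self._rows[row_key] = {}
            insort(self._sorted_keys, row_key)
        cells = self._rows[row_key].setdefault(column, [])
        cells.append((timestamp, value))
        cells.sort(key=lambda cell: cell[0], reverse=True)  # newest first

    def read_latest(self, row_key, column):
        # A cell that was never written takes no space and reads as None.
        cells = self._rows.get(row_key, {}).get(column)
        return cells[0][1] if cells else None

    def versions(self, row_key, column):
        return list(self._rows.get(row_key, {}).get(column, []))

    def scan_prefix(self, prefix):
        # Row keys come back in sorted order, as in a sorted key/value map.
        return [key for key in self._sorted_keys if key.startswith(prefix)]


table = SparseTable()
table.write("server1#cpu", "usage", 0.5, timestamp=100)
table.write("server1#cpu", "usage", 0.9, timestamp=200)
table.write("server0#cpu", "usage", 0.1, timestamp=150)
print(table.read_latest("server1#cpu", "usage"))
print(table.versions("server1#cpu", "usage"))
print(table.scan_prefix("server"))
```

Because only the row key is indexed, the choice of row key decides which lookups and range scans are cheap.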

Cloud Bigtable is accessed by applications via multiple client libraries, including a supported extension to the Apache HBase library for Java. Consequently, it integrates well with the existing Apache ecosystem of open-source Big Data software.

Cloud Bigtable has a long history with GCP, and is built on proven infrastructure that powers a number of Google products, including Search & Maps.

The closest Microsoft equivalent of GCP’s Cloud Bigtable is Azure Cosmos DB.

Cloud Bigtable Applications

Use Case / Deployment Configuration

Energy / Oil and Gas: SCADA-based, deploying Cloud IoT, Pub/Sub, Dataflow, Bigtable, and Data Studio. The business side uses Cloud SQL, Storage, ML Engine, and Datalab.

Healthcare: Patient data travels via mobile device to Cloud Pub/Sub, and then to Bigtable. Advanced analytics are run on the stored data via the Prediction API or TensorFlow. Notifications.

Retail & eCommerce / Beacons and Targeted Marketing: A beacon is a proximity notification. Uses Dataflow, Pub/Sub, and Bigtable.

Database / Gaming Backend Database: Uses Cloud Spanner for match history and Cloud Bigtable to log events.

Big Data / Time-series Analysis: OpenTSDB time series database engine on GKE, writing to the NoSQL database Cloud Bigtable.

Cloud Datastore


GCP Cloud Datastore is a highly scalable NoSQL database. Cloud Datastore automatically handles sharding and replication, delivering a highly available and durable database that scales automatically to handle an application’s expanding load. Despite being a NoSQL database, Datastore provides a number of familiar capabilities, including:

ACID transactions,
SQL-like queries,
indexes

Cloud Datastore utilizes a RESTful interface. Cloud Datastore can be used as the integration point for solutions that span across App Engine and Compute Engine.

Given that Cloud Datastore is a schemaless database, less emphasis needs to be placed on managing the underlying data structure as an application evolves. In addition, the query language is very simple.

Datastore supports a variety of data types, including:

integers
floating-point numbers
strings
dates &
binary data

The supported programming languages include:

.NET
Go
Java
JavaScript (Node.js)
PHP
Python
Ruby

Cloud Datastore users are being encouraged to migrate to Cloud Firestore, which became available on GCP in 2017.

Cloud Datastore Applications

| Use Case | Deployment Configuration |
| --- | --- |
| Serverless / Platform Services on App Engine | Gaming on GCP, via RESTful HTTP endpoints. Cloud Datastore with a Memcache front end provides the NoSQL db. GKE, Agones and Open Match autoscale server resources. |

Cloud Firestore Applications

| Use Case | Deployment Configuration |
| --- | --- |
| AppDev & Serverless / Serverless Web Scraping w/ Cloud Functions | Event-driven web scraping w/ Cloud Functions, Firestore & Scheduler. Built-in support for Headless Chrome, providing sophisticated UI testing & web scraping. |

Cloud Spanner


Cloud Spanner is a fully managed, mission-critical, relational database service that offers transactional consistency at world-wide scale, and:

schemas
ACID transactions
SQL compatibility (ANSI 2011 with extensions), &
automatic, synchronous replication for high availability

Traditionally, when building cloud applications, DBAs and developers have had to choose either ‘A’ or ‘B’:

traditional relational databases that guarantee transactional (ACID) consistency, or
NoSQL databases that offer easy horizontal scaling and data distribution.

Cloud Spanner is unusual in that it offers both of these critical capabilities in a single, integrated, fully managed service. In addition, it functions as a ‘Database as a Service (DBaaS)’ offering.

Cloud Spanner keeps application development familiar to traditional relational DBAs by supporting standard tools and languages common to the traditional relational database environment. It is an excellent solution for operational workloads typically supported by traditional relational databases, including:

inventory management
financial transactions &
e-commerce

Cloud Spanner supports:

distributed transactions
schemas and DDL statements
SQL queries &
JDBC drivers
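The combination of a schema, DDL, SQL queries and ACID transactions can be illustrated with a short sketch. Python's built-in `sqlite3` module is used here purely as a local stand-in; it is a single-node database and does not reproduce Cloud Spanner's distributed, replicated behavior. The `accounts` table and the `transfer` and `balances` helpers are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# DDL: a schema with a constraint that balances may never go negative.
conn.execute(
    "CREATE TABLE accounts ("
    "id INTEGER PRIMARY KEY, "
    "balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.execute("INSERT INTO accounts (id, balance) VALUES (1, 100), (2, 50)")
conn.commit()


def transfer(conn, src, dst, amount):
    """Move funds atomically: both updates commit, or neither does."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE id = ?", (amount, dst)
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE id = ?", (amount, src)
            )
        return True
    except sqlite3.IntegrityError:
        # The debit violated the CHECK constraint, so the credit was rolled back too.
        return False


def balances(conn):
    return dict(conn.execute("SELECT id, balance FROM accounts ORDER BY id"))


ok = transfer(conn, 1, 2, 30)       # succeeds
failed = transfer(conn, 1, 2, 500)  # would overdraw account 1, so nothing changes
final = balances(conn)
```

The second transfer credits the destination first and then fails on the debit; the rollback leaves both rows as they were, which is the atomicity guarantee that financial and inventory workloads rely on.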

Cloud Spanner provides client libraries for the most common languages, including:

Java
Go
Python &
Node.js.

Cloud Spanner provides key benefits to DBAs already on the cloud, or seeking to transition to the cloud, as follows:

Allows DBAs to focus on application logic, by effectively abstracting the underlying hardware and software
Allows the scaling out of RDBMS solutions without complex sharding or clustering
Delivers horizontal scaling without migration from relational to NoSQL databases
Delivers high availability disaster recovery without requiring a complex replication and failover infrastructure

The rough Microsoft equivalent to Cloud Spanner is Azure Cosmos DB.

The rough AWS equivalent to Cloud Spanner is Amazon DynamoDB.

Cloud Spanner Applications

| Use Case | Deployment Configuration |
| --- | --- |
| Migrations / Oracle to Cloud Spanner | Oracle db to CSV files to GCP’s Cloud Dataflow ETL & GCP’s Cloud Spanner |
| Migrations / AWS DynamoDB to Spanner | AWS DynamoDB migrated to GCP’s Cloud Spanner |
| Database / Oracle to Cloud Spanner | Oracle db via CSV files to GCP’s Cloud Dataflow ETL & GCP’s Cloud Spanner |
| Database / Gaming Backend Database | Use Google Cloud Spanner for match history, Cloud Bigtable to log events |

Cloud SQL


GCP Cloud SQL is a fully managed relational database service for MySQL, PostgreSQL, and SQL Server. Specifically, these traditional relational databases can be seamlessly deployed on Cloud SQL.

Like Cloud Spanner, it is a Database as a Service (DBaaS) offering.

For each of the above traditional relational databases, replication and backups are easily configured to protect data. Automatic failover, to create a High Availability (HA) solution, is also easily set up. Cloud SQL data is automatically encrypted, and Cloud SQL is compliant with:

SSAE 16
ISO 27001
PCI DSS &
HIPAA

Connections to the relational databases running on Cloud SQL are possible from:

App Engine
Compute Engine
Google Kubernetes Engine, &
client workstations.

Lastly, sophisticated analytics can be obtained, by using BigQuery to directly query Cloud SQL based databases.

The table below provides a summary of the most important use case/deployment configuration for each of these options.

Cloud SQL Applications

Use Case / Deployment Configuration

To Cloud SQL solution, for :

Energy / Oil and Gas

Big Data / Real-Time Inventory

Retail & eCommerce / Real-Time Inventory

MySQL, PostgreSQL, SQL Server

*SCADA based, deploying Cloud IoT, Pub/Sub, Dataflow, Big Table, Data Studio

Business side uses Cloud SQL, Storage, ML Engine, Datalab

*Back Office Biz Apps to App Engine & Cloud SQL

*Back Office Biz Apps to App Engine & Cloud SQL via Cloud Pub/Sub

Cloud Storage


GCP Cloud Storage facilitates world-wide storage and retrieval of unlimited amounts of data, at virtually any time. Cloud Storage can be deployed for a wide range of scenarios, including but not limited to:

serving web content
storing data for disaster recovery or archival
distributing to users large data objects

GCP ‘cloud storage application niches’ are commonly divided into eight categories, as follows:

Object or Blob Storage : Cloud Storage
Block Storage : Persistent disk
Block Storage : Local SSD
Archival Storage : Cloud Storage
File Storage : Filestore
Mobile Application : Cloud Storage for Firebase
Data Transfer : Data Transfer Services
Collaboration : Google Workspace Essentials

Each of these, along with its ‘best use’ applications, is discussed below:

1.Object or Blob Storage : Cloud Storage {global edge-caching and instant data access}

Stream videos
Image and web asset libraries
Data lakes

2.Block Storage : Persistent disk {virtual machines & containers}

Disks for virtual machines
Sharing read-only data across multiple virtual machines
Rapid, durable backups of running virtual machines
Storage for databases

3.Block Storage : Local SSD

{ Ephemeral locally-attached block storage for virtual machines & containers }

Flash-optimized databases
Hot caching layer for analytics
Application scratch disk

4.Archival Storage : Cloud Storage

{ archival storage with high online access speeds }

Backups
Media archives
Long-tail content
Data with compliance requirements

5.File Storage : Filestore

{ scalable file storage w/ defined performance parameters }

Data analytics
Rendering and media processing
Application migrations
Web content management

6.Mobile Application : Cloud Storage for Firebase

{Scalable storage for user-generated content from Firebase }

User-generated content
Uploads over mobile networks

7.Data Transfer : Data Transfer Services

{offline, online, or cloud-to-cloud data transfer}

Move ML/AI training datasets
Migrate from S3 to Google Cloud

8.Collaboration : Google Workspace Essentials

{ Cloud-based content collaboration and storage }

Access files from any location via web, apps, & sync clients
Create & work on docs with colleagues
Connect a team w/ secure video conferencing

Cloud Storage Applications

Use Case / Deployment Configuration

Energy / Oil and Gas

Serverless / Event Driven

Big Data / Data Lake

Data Warehouse / Data Lake

*SCADA based, deploying Cloud IoT, Pub/Sub, Dataflow, Big Table, Data Studio

Business side uses Cloud SQL, Storage, ML Engine, Datalab

*Event Source, to Cloud Pub/Sub, to Archiver Cloud Functions, to Cloud Storage Data Archive

*Cloud Storage

BigQuery

GCP BigQuery is a modern enterprise data warehousing solution which supports massive datasets. Querying such massive datasets cost effectively and quickly would ordinarily require specific hardware and infrastructure to be deployed. BigQuery resolves this issue by being ‘serverless’, and by executing SQL queries very quickly.

BigQuery is fully-managed by GCP. Activation requires no actions to deploy resources, such as disks and virtual machines.

BigQuery access includes the following options:

Cloud Console
bq command-line tool
calls to the BigQuery REST API

Client libraries for the REST API are available in languages including:

Java
.NET &
Python

Other third-party tools (not listed here) can also be used to interact with BigQuery.

Four application ‘flavors’ are commonly utilized with BigQuery:

BigQuery ML
BigQuery GIS
BigQuery BI Engine
Connected Sheets

BigQuery ML (Machine Learning)

BigQuery ML facilitates the construction and operation of ML models on structured or semi-structured data, by data scientists and data analysts. This is accomplished using basic SQL. Ten different model types are supported.

BigQuery GIS

BigQuery GIS provides the serverless architecture of BigQuery with native support for geospatial analysis. This merges analytics workflows with geographic location intelligence. This application supports:

arbitrary points
lines
polygons, & multi-polygons

displayed in common geospatial data formats
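As an illustration of what "common geospatial data formats" look like, the sketch below renders a point, a line and a polygon as Well-Known Text (WKT) and as GeoJSON using only the Python standard library. The helper names `to_wkt` and `to_geojson` and the sample coordinates are hypothetical. Strings of this kind are what geography constructors in BigQuery's SQL dialect, such as `ST_GEOGFROMTEXT` and `ST_GEOGFROMGEOJSON`, are designed to accept.

```python
import json


def to_wkt(geometry_type, coordinates):
    """Render a point, line, or polygon as Well-Known Text (longitude latitude order)."""

    def pairs(points):
        return ", ".join(f"{lon} {lat}" for lon, lat in points)

    if geometry_type == "Point":
        lon, lat = coordinates
        return f"POINT({lon} {lat})"
    if geometry_type == "LineString":
        return f"LINESTRING({pairs(coordinates)})"
    if geometry_type == "Polygon":
        # A polygon is a list of rings; each ring repeats its first point at the end.
        rings = ", ".join(f"({pairs(ring)})" for ring in coordinates)
        return f"POLYGON({rings})"
    raise ValueError(f"unsupported geometry type: {geometry_type}")


def to_geojson(geometry_type, coordinates):
    """Render the same geometry as a GeoJSON geometry object."""
    return json.dumps({"type": geometry_type, "coordinates": coordinates})


point = to_wkt("Point", (-122.08, 37.42))
line = to_wkt("LineString", [(-122.08, 37.42), (-122.03, 37.33)])
square = [[(0, 0), (1, 0), (1, 1), (0, 1), (0, 0)]]
polygon = to_wkt("Polygon", square)
```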

BigQuery BI Engine

BigQuery BI Engine is a fast in-memory analysis service that allows users to analyze large and complex datasets interactively with sub-second query response time and high concurrency.

BigQuery Connected Sheets

Connected Sheets allows users to analyze billions of rows of live BigQuery data in Google Sheets without the application of SQL knowledge. Users can apply familiar tools—like pivot tables, charts, and formulas—to easily derive insights from big data.

The table below provides a summary of the most important use case/deployment configuration for each of these options.

BigQuery Applications

Use Case / Deployment Configuration

Healthcare / Genomics, Secondary Analysis

Variant Analysis

Radiological Image Extraction

Big Data / Data Warehouse Modernization

Log Processing

Data Warehouse, Retail & eCommerce / Shopping Cart Analysis

Financial Services / Time Series Analysis

Retail & eCommerce / PCI

Healthcare API Analytics

*Sequencers data to Ingest Server; metadata to Cloud SQL, raw data to GCS

Sequence to BAM files. Accessed via Jupyter notebooks, BigQuery analysis

*Genomics API using Big Data, to FASTQ or BAM. Private or shared datasets.

Batch analysis using Cloud Dataflow, interactive via Big Query & DataLab

*Cloud Healthcare API, Pub/Sub, Storage, to Cloud Dataflow , Dataproc to

BigQuery to Cloud DataLab

*DICOM API, to Imaging Analytics, to BigQuery, Cloud ML, Dataproc, DataLab

*BigQuery DW & ETL/ELT via Cloud Dataflow/Dataproc /Composer

*StackDriver to Dataflow to BigQuery

*Analyze customer behavior (heuristics) via Cloud Dataproc, Dataflow, detail analytics via BigQuery

*BigQuery & DataLab

*Cloud Monitoring w/ StackDriver, Big Query & Cloud Logging. PCI = Payment Card Industry

Cloud Dataflow


GCP Cloud Dataflow is a fully managed, serverless processing service for executing Apache Beam pipelines within the GCP ecosystem.

When a job executes on Cloud Dataflow, it spins up a cluster of virtual machines, distributes the tasks in the job to the VMs, and dynamically scales the cluster based on how the job is performing. Routinely, it changes the order of operations in the processing pipeline to optimize the job.

Apache Beam data processing jobs run on Cloud Dataflow may be either batch or streaming jobs. Tasks commonly performed to output results include:

Write a data processing program in Java using Apache Beam
Use different Beam transforms to map and aggregate data
Use windows, timestamps, and triggers to process streaming data
Deploy a Beam pipeline both locally and on Cloud Dataflow
Output data from Cloud Dataflow to Google BigQuery
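The idea of mapping, windowing and aggregating timestamped events can be sketched without any pipeline framework. The example below is plain Python, not the Apache Beam SDK; the function `fixed_window_sums` and the sample events are hypothetical, and real Beam windowing also involves watermarks and triggers that are not modeled here.

```python
from collections import defaultdict


def fixed_window_sums(events, window_seconds):
    """Group (timestamp, key, value) events into fixed windows and sum per key."""
    sums = defaultdict(int)
    for timestamp, key, value in events:
        # The window start is the event timestamp rounded down to a window boundary.
        window_start = (timestamp // window_seconds) * window_seconds
        sums[(window_start, key)] += value
    return dict(sums)


events = [
    (0, "cart", 1),
    (30, "cart", 2),
    (61, "cart", 5),
    (75, "checkout", 1),
    (59, "checkout", 4),  # arrives out of order, still lands in the first window
]
result = fixed_window_sums(events, window_seconds=60)
```

Because each event is assigned to a window by its own timestamp rather than by arrival order, the late `checkout` event is still counted in the window that starts at 0.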

The table below provides a summary of the most important use case/ deployment configuration for each of these options.

Migrations , Database / Oracle to Cloud Spanner

Healthcare / Variant Analysis

Healthcare API Analytics

Big Data /

Data Warehouse Modernization

Data Warehouse / Shopping Cart Analysis

Retail & eCommerce / Beacons and Targeted Marketing

Shopping Cart Analysis

Log Processing

*Oracle db to CSV files to GCP’s Cloud Dataflow ETL & GCP’s Cloud Spanner

*Genomics API using Big Data, to FASTQ or BAM. Private or shared datasets.

Batch analysis using Cloud Dataflow, interactive via Big Query & DataLab

*Cloud Healthcare API, Pub/Sub, Storage, to Cloud Dataflow , Dataproc to

BigQuery to Cloud DataLab

*BigQuery DW & ETL/ELT via Cloud Dataflow/Dataproc /Composer

*StackDriver to Dataflow to BigQuery

*Analyze customer behavior (heuristics) via Cloud Dataproc, Dataflow, detail analytics via BigQuery

*Beacon is a proximity notification. Uses Dataflow, Pub/Sub & BigTable

*Analyze customer behavior (heuristics) via Cloud Dataproc, Dataflow, detail analytics via BigQuery

Cloud Dataprep

GCP Cloud Dataprep is an intelligent data preparation tool, used to visually explore, clean and prepare both structured and unstructured data for review, reporting, and machine learning. Because Cloud Dataprep is serverless, there is never any infrastructure to deploy or manage, regardless of the scale of the application.

Cloud Dataprep is provided by a Google partner, Trifacta.

Cloud Dataprep is distinctive in that each UI input by the user triggers a data transformation recommendation. This removes the need to write code.

Cloud Dataprep automatically detects:

schemas
data types
possible joins, &
anomalies

such as:

missing values,
outliers, &
duplicates

This real time feedback allows the user to minimize the time-consuming work of assessing data quality, and instead focus on exploration and analysis. Specifically, the feedback generated by Cloud Dataprep is based on a proprietary inference algorithm to interpret the data transformation intent of a user’s data selection.
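The kinds of anomalies listed above can be illustrated with a small profiling function. Cloud Dataprep's own inference algorithm is proprietary, so the sketch below uses simple stand-in heuristics (a median-based outlier test); the function `profile_column`, its `outlier_factor` parameter and the sample data are hypothetical.

```python
from statistics import median


def profile_column(values, outlier_factor=3.0):
    """Report missing values, duplicates, and outliers for one numeric column."""
    missing = [i for i, v in enumerate(values) if v is None]
    present = [v for v in values if v is not None]
    if not present:
        return {"missing_rows": missing, "duplicates": [], "outliers": []}

    seen, duplicates = set(), set()
    for v in present:
        if v in seen:
            duplicates.add(v)
        seen.add(v)

    # Flag values far from the median, measured in median absolute deviations.
    center = median(present)
    mad = median(abs(v - center) for v in present)
    outliers = [v for v in present if mad and abs(v - center) / mad > outlier_factor]

    return {"missing_rows": missing, "duplicates": sorted(duplicates), "outliers": outliers}


report = profile_column([10, 12, None, 11, 12, 250, 9])
```

In this sample the report flags row 2 as missing, the value 12 as duplicated, and 250 as an outlier, which is the sort of feedback a user would otherwise have to find by manual inspection.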

The user retains final control over the nature and sequence of data transformations.

Once the user has defined the nature and sequence of data transformations, Cloud Dataprep calls Cloud Dataflow behind the scenes, triggering the processing of structured or unstructured datasets in Cloud Dataflow.

The table below provides a summary of the most important use case/deployment configuration for each of these options.

Cloud Dataprep Applications

Use Case / Deployment Configuration

Cloud Dataprep is serverless

*Cloud Dataprep calls Cloud Dataflow to trigger the processing of data

Cloud Dataproc

GCP Dataproc is a managed Spark and Hadoop service that provides open source data tools for:

batch processing
querying
streaming, &
machine learning

Dataproc automation facilitates the quick creation of clusters, and management of them. Dataproc saves money by turning clusters off when they are no longer needed.

Dataproc provides built-in integration with other GCP services, including;

BigQuery
Cloud Storage
Cloud Bigtable
Cloud Logging &
Cloud Monitoring

This integration delivers a complete data platform to support your Spark or Hadoop cluster.

Dataproc supports the following open source applications;

Hadoop
Spark
Hive &
Pig

Dataproc can be accessed in these ways, via:

REST API
Cloud SDK
Dataproc UI
Cloud Client Libraries

The table below provides a summary of the most important use case/deployment configurations for each of these options.

Cloud Dataproc Applications

Use Case / Deployment Configuration

Healthcare /

Healthcare API Analytics

Radiological Image Extraction

Big Data / Data Warehouse Modernization

Data Warehouse / Shopping Cart Analysis

AI & ML / Recommendation Engines

Retail & eCommerce / Fraud Detection,

Shopping Cart Analysis

Financial Services / Monte Carlo Simulations,

Fraud Detection /

*Cloud Healthcare API, Pub/Sub, Storage, to Cloud Dataflow , Dataproc to

BigQuery to Cloud DataLab

DICOM API, to Imaging Analytics, to BigQuery, Cloud ML, Dataproc, DataLab

*BigQuery DW & ETL/ELT via Cloud Dataflow/Dataproc /Composer

*Analyze customer behavior (heuristics) via Cloud Dataproc, Dataflow, detail analytics via BigQuery

*GCP Prediction API to train regression/classification models & generate realtime predictions, OR Spark MLlib sourced custom machine learning algorithms, deployed to Cloud Dataproc

*Analyze customer behavior (heuristics) via Cloud Dataproc, Dataflow, detail analytics via BigQuery

*Dataproc & Apache Spark provide infrastructure, capacity to run Monte Carlo simulations written in Java, Python, or Scala.

*GCP Prediction API to train regression/classification models & generate realtime predictions, OR Spark MLlib sourced custom machine learning algorithms, deployed to Cloud Dataproc

Cloud IoT Core

The GCP Cloud IoT Core utilizes the following concepts to function:

Internet of Things (IoT): Any physical object connected to the internet (directly or indirectly) that has the ability to exchange data without user involvement.
Device: Any processing unit that is capable of connecting to the internet and exchanging data with the cloud. These devices are routinely referred to as “smart devices” or “connected devices.” These devices send two types of data:

telemetry &
state

Telemetry: All event data sent from devices to the cloud. Event data commonly provides measurements about the local environment. Telemetry data can be analyzed by GCP Big Data solutions.
Device state: A user-defined aggregation of data that describes the current status of the device. This data can be structured or unstructured, but it flows only from the device to the cloud, never in reverse.
Device configuration: A user-defined aggregation of data, commonly used to change a device’s settings. This data can be structured or unstructured, but it flows only from the cloud to the device, never in reverse.
Device registry: A container of devices with shared properties. A device is “registered” with a service (e.g. Cloud IoT Core) so that it may be managed by it.
Device manager: Service used to monitor device health and activity, update device configurations, & manage credentials & authentication.
MQTT (Message Queue Telemetry Transport): An industry-standard IoT protocol. MQTT is a publish/subscribe (pub/sub) messaging protocol.
Cloud IoT Core Components: Device manager & protocol bridges.
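The direction of each data flow defined above (telemetry and state travel from device to cloud, configuration travels from cloud to device) can be made concrete with a toy model. This is not the Cloud IoT Core API; the `Device` and `Registry` classes, their methods and the sample device are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Device:
    device_id: str
    config: dict = field(default_factory=dict)  # last configuration received from the cloud


@dataclass
class Registry:
    """Toy device registry: a container of devices with shared properties."""

    name: str
    devices: dict = field(default_factory=dict)
    telemetry: list = field(default_factory=list)  # device -> cloud event data
    states: dict = field(default_factory=dict)     # device -> cloud, latest status per device

    def register(self, device):
        self.devices[device.device_id] = device

    def receive_telemetry(self, device_id, event):
        self._require(device_id)
        self.telemetry.append((device_id, event))

    def receive_state(self, device_id, state):
        self._require(device_id)
        self.states[device_id] = state

    def push_config(self, device_id, config):
        # Configuration flows the other way: cloud -> device.
        self._require(device_id).config = dict(config)

    def _require(self, device_id):
        if device_id not in self.devices:
            raise KeyError(f"device {device_id!r} is not registered")
        return self.devices[device_id]


registry = Registry("plant-sensors")
registry.register(Device("pump-1"))
registry.receive_telemetry("pump-1", {"pressure_kpa": 310})
registry.receive_state("pump-1", {"firmware": "1.2", "online": True})
registry.push_config("pump-1", {"sample_interval_s": 30})
```

Only registered devices can exchange data in this sketch, which mirrors the role of the device registry as the unit of management.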

Two protocol bridges are available for devices to connect to GCP:

MQTT
HTTP

Cloud IoT Core is the GCP fully managed service, utilizing all of the above components, to readily and securely connect, manage, and ingest data from globally dispersed devices.

The table below provides a summary of the most important use case/deployment configuration for each of these options.

Cloud IoT Core Applications

Use Case / Deployment Configuration

IoT /

IoT Remote Monitoring

IoT MQTT Bridge

Smart Home Devices

Cloud to Edge ML

AI & ML / Chatbot with Dialogflow

*Cloud IoT Core to connect IoT devices via MQTT or HTTP bridge to GCP.

*Devices of any size may connect thru secured, bidirectional MQTT bridge.

*Smart Home actions controls IoT devices thru Google Assistant. MQTT or HTTP bridge(s) connect IoT devices to GCP using per-device public/private key auth.

*Cloud IoT Edge extends GCP data processing and machine learning to gateways, cameras, and other connected devices.

*Dialogflow is an end-to-end, create-once, and deploy-anywhere development suite for creating conversational interfaces for websites/mobile apps/messaging platforms, & IoT devices

Cloud Pub/Sub

GCP Pub/Sub is an asynchronous messaging service that decouples the services that produce events from the services that process them.

Pub/Sub offers durable, real-time message storage and delivery with high availability and reliable performance at scale. Pub/Sub servers run in all of the GCP regions around the world.

Core concepts of Pub/Sub

Topic: A named resource to which messages are sent by publishers.
Subscription: A named resource representing the stream of messages from a single, specific topic, to be delivered to a subscribing application.
Message: The combination of data and (optional) attributes that a publisher sends to a topic and is eventually delivered to subscribers.
Message attribute: A key-value pair that a publisher can define for a message.

Publisher-subscriber relationships

A publisher application creates and sends messages to a topic. Subscriber applications create a subscription to a topic to receive messages from it. Communication can be:

one-to-many (fan-out)
many-to-one (fan-in) &
many-to-many
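The topic, subscription, message and attribute concepts above can be sketched as a small in-memory broker. This is a toy model for illustration, not the Pub/Sub client library; the `Broker` class, its methods, and the topic and subscription names are hypothetical.

```python
from collections import deque


class Broker:
    """Toy in-memory broker: topics fan messages out to their subscriptions."""

    def __init__(self):
        self._topics = {}  # topic name -> {subscription name: queue of messages}

    def create_topic(self, topic):
        self._topics.setdefault(topic, {})

    def create_subscription(self, topic, subscription):
        self._topics[topic].setdefault(subscription, deque())

    def publish(self, topic, data, **attributes):
        message = {"data": data, "attributes": attributes}
        # Every subscription on the topic receives the message (fan-out).
        for queue in self._topics[topic].values():
            queue.append(message)

    def pull(self, topic, subscription):
        queue = self._topics[topic][subscription]
        return queue.popleft() if queue else None


broker = Broker()
broker.create_topic("orders")
broker.create_subscription("orders", "billing")
broker.create_subscription("orders", "shipping")
broker.publish("orders", "order-42", region="eu")

billing_msg = broker.pull("orders", "billing")
shipping_msg = broker.pull("orders", "shipping")
empty = broker.pull("orders", "billing")
```

The publisher never references `billing` or `shipping` directly; it only knows the topic. That indirection is what lets producers and consumers be deployed and scaled independently.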

Common use cases for Pub/Sub

Balancing workloads in network clusters
Implementing asynchronous workflows
Distributing event notifications
Refreshing distributed caches
Logging to multiple systems
Data streaming from various processes or devices
Reliability improvement

Common GCP services on the ‘send’ side of Pub/Sub include:

Cloud Logs
Cloud API
Cloud Dataflow
Cloud Storage
Compute Engine

Common GCP services on the ‘receive’ side of Pub/Sub include:

Cloud Networking
Compute Engine
Cloud Dataflow
App Engine
Cloud Monitoring

The table below provides a summary of the most important use case for each of these options.

Cloud Pub/Sub Applications

Use Case / Deployment Configuration

Energy / Oil and Gas

Healthcare / Patient Monitoring

Healthcare API Analytics

Healthcare API ML

Serverless / Event Driven

Retail & eCommerce / Real-Time Inventory

Beacons and Targeted Marketing

*SCADA based, deploying Cloud IoT, Pub/Sub, Dataflow, Big Table, Data Studio Business side uses Cloud SQL, Storage, ML Engine, Datalab

*Patient data via mobile device to Cloud Pub/Sub, to BigTable. Adv analytics on stored data via Prediction API or Tensor Flow. Notifications.

*Cloud Healthcare API, Pub/Sub, Storage, to Cloud Dataflow , Dataproc to BigQuery to Cloud DataLab

*Machine Learning, to Cloud Pub/Sub, to ML models, to Enterprise Viewer

*Event Source, to Cloud Pub/Sub, to Archiver Cloud Functions, to Cloud Storage Data Archive

*Back Office Biz Apps to App Engine & Cloud SQL via Cloud Pub/Sub

*Beacon is a proximity notification. Uses Dataflow, Pub/Sub & BigTable

Cloud IAM

GCP IAM allows an administrator to grant granular access to specific GCP resources and simultaneously prevents access to other resources. As a baseline, IAM adopts the security principle of least privilege, in which only the minimum and necessary permissions are granted to access specific resources.

How IAM functions
Via IAM, the administrator manages access control by defining

who (identity) has
what access (role) for
which resource

The organizations, folders, and projects that are used to organize relevant resources are also resources.

Via IAM, permissions to access any given resource are never granted directly to the end user. Instead:

permissions are grouped into roles, &
roles are granted to authenticated members

An IAM policy defines and enforces:

what roles are granted to
which members, &
this policy is attached to a resource

When an authenticated member attempts to access a resource, IAM checks the resource’s policy to determine whether the action is permitted.

IAM access management has three main parts, Member, Role & Policy, defined as follows:

*Member. A member can be a:

Google Account (for end users), a
service account (for apps and virtual machines), a
Google group, or a
Google Workspace or Cloud Identity domain

that can access a resource.

The identity of a member is an email address associated with a

user,
service account, or
Google group; or a
domain name associated with Google Workspace or Cloud Identity domains.

*Role. A role is a collection of permissions. Permissions determine what operations are allowed on a resource. When administrators grant a role to a member, they grant all the permissions that the role contains.

*Policy. An IAM policy binds one or more members to a role. When an administrator seeks to define:

who (which member) has
what type of access (role)

on a resource, the administrator creates a:

policy &
attaches it to the resource.
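The member, role and policy relationship can be sketched as data plus one lookup function. The policy below follows the general shape of an IAM policy (a list of bindings, each pairing one role with a list of prefixed members), but the role names, permission names and the `is_allowed` helper are hypothetical, and details such as group membership expansion and policy inheritance are not modeled.

```python
# Hypothetical role names mapped to the permissions they contain.
ROLES = {
    "roles/example.viewer": {"objects.get", "objects.list"},
    "roles/example.editor": {"objects.get", "objects.list", "objects.create"},
}

# A policy attached to one resource: each binding ties members to a role.
policy = {
    "bindings": [
        {"role": "roles/example.viewer", "members": ["group:analysts@example.com"]},
        {"role": "roles/example.editor", "members": ["user:alice@example.com"]},
    ]
}


def is_allowed(policy, member, permission):
    """Return True if any role bound to the member contains the permission."""
    for binding in policy["bindings"]:
        granted = ROLES.get(binding["role"], set())
        if member in binding["members"] and permission in granted:
            return True
    return False


alice_can_create = is_allowed(policy, "user:alice@example.com", "objects.create")
analysts_can_create = is_allowed(policy, "group:analysts@example.com", "objects.create")
stranger_can_read = is_allowed(policy, "user:bob@example.com", "objects.get")
```

Note that permissions never appear in the policy itself; access is always expressed through a role, which matches the principle that permissions are not granted to members directly.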

The table below provides a summary of the most important use case for each of these options.

Cloud IAM Applications

Use Case / Deployment Configuration

Cloud IAM (Identity & Access Management) is a universal security mechanism across GCP.

*GCP Cloud IAM enforces security access by defining Members, Roles, & Policies.

Cloud Endpoints

GCP Cloud Endpoints provides a mechanism for an administrator to develop, deploy, protect, and monitor APIs. Cloud Endpoints is a distributed API management system, based on OpenAPI v2, defined for REST APIs. It comprises:

services
runtimes, &
tools

Cloud Endpoints provides;

management
monitoring, &
authentication, along with
high performance

The developer effectively ‘offloads’ responsibility for delivering these to this distributed API management system, under the province of GCP.

The components that make up Cloud Endpoints are:

Extensible Service Proxy (ESP) or Extensible Service Proxy V2 Beta (ESPv2 Beta) – for injecting Cloud Endpoints functionality.
Service Control – for applying API management rules
Service Management – for configuring API management rules
Cloud SDK – for deployment and management
Google Cloud Console – for logging, monitoring and sharing.

Cloud Endpoints is an NGINX-based proxy and distributed architecture, which uses an OpenAPI Specification.

Cloud Endpoints links with Cloud Monitoring, Cloud Logging, and Cloud Trace to provide operational insights. Data can, of course, be transferred to Big Query for further analysis.

API access and validation is accomplished via JSON Web Tokens along with Google API Keys. The identity of users of the web or mobile application is obtained via Auth0 and Firebase Authentication.
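A minimal sketch of what an OpenAPI v2 description protected by an API key might contain is shown below, expressed as a Python dictionary so that it can be inspected programmatically. The API title, host name, path and the `operations_requiring` helper are hypothetical, and the exact fields that Cloud Endpoints expects should be checked against its documentation; the structure itself (`swagger`, `paths`, `securityDefinitions`) is standard OpenAPI v2.

```python
import json

openapi_spec = {
    "swagger": "2.0",
    "info": {"title": "Example Inventory API", "version": "1.0.0"},
    # Hypothetical service host name.
    "host": "inventory-api.endpoints.example-project.cloud.goog",
    "schemes": ["https"],
    "paths": {
        "/items": {
            "get": {
                "operationId": "listItems",
                "security": [{"api_key": []}],
                "responses": {"200": {"description": "A list of items"}},
            }
        }
    },
    "securityDefinitions": {
        # One conventional way to declare an API key passed as a query parameter.
        "api_key": {"type": "apiKey", "name": "key", "in": "query"}
    },
}


def operations_requiring(spec, scheme):
    """List (METHOD, path) pairs whose security section names the given scheme."""
    found = []
    for path, methods in spec["paths"].items():
        for method, operation in methods.items():
            requirements = operation.get("security", [])
            if any(scheme in requirement for requirement in requirements):
                found.append((method.upper(), path))
    return found


protected = operations_requiring(openapi_spec, "api_key")
spec_json = json.dumps(openapi_spec, indent=2)
```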

The table below provides a summary of the most important use case/deployment configuration for each of these options.

Application

Use Case / Deployment Configuration

Cloud Endpoints is a distributed API management system, and can be deployed on GCP anytime an OpenAPI v2 solution, defined for REST APIs, is sought.

VPC

**GCP VPC provides virtualized networking functionality**

A VPC network on GCP, can roughly be viewed the same way as a physical network, except that it is a virtualized solution within GCP. A VPC network is a global resource that consists of a:

list of regional virtual subnetworks (subnets) in data centers,
all connected by a global wide area network.

VPC networks are logically isolated from each other in GCP.

All new GCP projects start with a default network (an auto mode VPC network) that has one subnetwork (subnet) specified in each region.

VPCs provide networking functionality to:

Compute Engine virtual machine (VM) instances,
Kubernetes Engine (GKE) clusters &
App Engine flexible environment, &
all other GCP products built on Compute Engine VM’s

In addition, a VPC network provides the following:

Native Internal TCP/UDP Load Balancing and proxy systems for Internal HTTP(S) Load Balancing.
Connections to on-premises networks using Cloud VPN tunnels and Cloud Interconnect attachments.
Traffic distribution from GCP external load balancers to backends.

Every VPC network implements a distributed virtual firewall that is configurable. Firewall rules control which packets are allowed to travel to which destinations. Every VPC network operates with two implied firewall rules:

block all incoming connections, &
allow all outgoing connections
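How the two implied rules interact with user-defined rules can be sketched as a priority-ordered match. The model below is a simplification for illustration: the `Rule` class, the `evaluate` function and the sample addresses are hypothetical, only IPv4 is covered, and tie-breaking between rules of equal priority is not modeled. The implied rules are given the lowest possible priority (65535) so that any user-defined rule overrides them.

```python
import ipaddress
from dataclasses import dataclass
from typing import Optional


@dataclass
class Rule:
    priority: int                # lower number wins
    direction: str               # "INGRESS" or "EGRESS"
    action: str                  # "allow" or "deny"
    cidr: str                    # source range for ingress, destination range for egress
    port: Optional[int] = None   # None matches every port


IMPLIED_RULES = [
    Rule(65535, "INGRESS", "deny", "0.0.0.0/0"),
    Rule(65535, "EGRESS", "allow", "0.0.0.0/0"),
]


def evaluate(rules, direction, address, port):
    """Return the action of the highest-priority rule that matches the packet."""
    candidates = sorted(rules + IMPLIED_RULES, key=lambda rule: rule.priority)
    for rule in candidates:
        in_range = ipaddress.ip_address(address) in ipaddress.ip_network(rule.cidr)
        port_ok = rule.port is None or rule.port == port
        if rule.direction == direction and in_range and port_ok:
            return rule.action
    raise AssertionError("unreachable: the implied rules match all IPv4 traffic")


custom_rules = [Rule(1000, "INGRESS", "allow", "203.0.113.0/24", 443)]

office_https = evaluate(custom_rules, "INGRESS", "203.0.113.7", 443)
other_https = evaluate(custom_rules, "INGRESS", "198.51.100.9", 443)
outbound_dns = evaluate(custom_rules, "EGRESS", "8.8.8.8", 53)
```

With a single allow rule for one source range, HTTPS from that range is admitted, all other inbound traffic falls through to the implied deny, and outbound traffic falls through to the implied allow.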

Defined routes specify how traffic is sent from an instance to a destination, either inside the network or outside of GCP. Each VPC network default configuration specifies some system generated routes to send traffic:

among its subnets &
from eligible instances to the internet

While routes govern traffic departing an instance, forwarding rules direct traffic to a GCP resource located in a VPC network based on:

IP address
protocol, &
port

The destinations for forwarding rules are:

target instances,
load balancer targets (target proxies, pools, & backend svcs), &
Cloud VPN gateways.

The table below provides a summary of the most important use case for each of these options.

Applications

Use Case / Deployment Configuration

GCP VPCs provide virtualized networking functionality

VPCs provide networking functionality to:

Compute Engine virtual machine (VM) instances,
Kubernetes Engine (GKE) clusters &
App Engine flexible environment, &
all other GCP products built on Compute Engine VM’s

All new GCP projects start with a default network (an auto mode VPC network) that has one subnetwork (subnet) specified in each region.

Identity Aware Proxy

GCP IAP facilitates the construction of a central authorization layer for applications accessed by HTTPS. This approach replaces traditional reliance on network-level firewalls, with an application-level access control model.

IAP policies are designed to scale across an organization. IAP policies may be defined centrally and then applied across all applications and resources. By definition, IAP is used to enforce access control policies for all applications and resources.

When an application or resource is protected by IAP, it can only be accessed through the proxy by:

members, also known as users, who must have the
correct Identity and Access Management (IAM) role

A user granted access to an application or resource by IAP is automatically subjected to the fine-grained access controls implemented by the application, without requiring a VPN. When a user tries to access an IAP-secured resource, IAP performs both

authentication &
authorization

checks.

Requests for GCP resources come through

App Engine or
Cloud Load Balancing (HTTPS)

The first step, involving authentication, requires that the serving infrastructure code for these products/applications determines if IAP is enabled for the app/backend service. If so, information about the protected resource is sent to the IAP authentication server. The information that may be included covers:

GCP project number,
request URL, & any
IAP credentials in request headers or cookies

For the next step, IAP checks the user’s browser credentials. If none exist, the user is:

redirected to an OAuth 2.0 Google Account sign-in flow that
stores a token in a browser cookie for future sign-ins.

Assuming valid request credentials, these are then used by the authentication server to get the user’s identity (email address & user ID). The authentication server then uses the identity to check the user’s IAM role and validate that the user is authorized to access the resource.

The next step after authentication is authorization. During this step, IAP applies the relevant IAM policy to confirm that the user is authorized to access the requested resource. This user authorization must take the form of the IAP-secured Web App User role on the Cloud Console project where the resource exists. If the user possesses this role, they are authorized to access the application. Changes to the IAP-secured Web App User role list are made via the IAP panel on the Cloud Console.
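The two-stage check described above (authentication first, then authorization) can be sketched as a single request handler. This is a toy model rather than IAP itself: the session table, the role membership table, the token values and the `handle_request` function are all hypothetical, and real credentials are signed tokens rather than dictionary keys.

```python
# Hypothetical lookup tables standing in for the authentication server and IAM.
SESSIONS = {
    "token-abc": "alice@example.com",
    "token-xyz": "bob@example.com",
}
ROLE_MEMBERS = {"IAP-secured Web App User": {"alice@example.com"}}


def handle_request(cookie_token, iap_enabled=True):
    """Return (status, detail) for a request reaching a protected backend."""
    if not iap_enabled:
        return 200, "IAP disabled, request passed through"

    # Authentication: resolve the credentials to an identity, or start sign-in.
    identity = SESSIONS.get(cookie_token)
    if identity is None:
        return 302, "redirect to Google Account sign-in"

    # Authorization: the identity must hold the required role.
    if identity not in ROLE_MEMBERS["IAP-secured Web App User"]:
        return 403, "identity lacks the IAP-secured Web App User role"

    return 200, f"access granted to {identity}"


granted = handle_request("token-abc")
unauthenticated = handle_request(None)
unauthorized = handle_request("token-xyz")
```

The ordering matters: a request with no valid credentials is redirected to sign-in before any role is consulted, while a signed-in user without the role is refused outright.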

The table below provides a summary of the most important use case for each of these options.

IAP Application

Use Case / Deployment Configuration

Identity Aware Proxy (IAP) works with IAM to enforce access control policies for all apps & resources.

*The IAP-secured Web App User role is central here for authorization success.

KMS

**GCP KMS, ‘Key Management Service’, provides control over how data is encrypted at rest, and how encryption keys are managed.**

KMS stands for ‘Key Management Service’.

Data stored on GCP is encrypted at rest. Use of the Cloud Key Management Service (Cloud KMS) platform provides greater control over how:

data is encrypted at rest, & how the
encryption keys are managed.

The Cloud KMS platform allows GCP customers to manage cryptographic keys in a central cloud service for either:

direct use, or
use by other cloud resources & apps

There are two types of software-based encryption keys:

customer-managed encryption keys (CMEK)
customer-supplied encryption keys (CSEK)

Cloud KMS provides the following options for key generation:

The Cloud KMS software backend provides the flexibility to encrypt data with either a symmetric or an asymmetric key that the customer directly controls (Cloud KMS).
Hardware keys are obtained via validated Hardware Security Modules (Cloud HSM).
Cloud KMS provides for the import of customer generated cryptographic keys.
Another option is to use keys generated by Cloud KMS with other GCP services. These keys are referred to as customer-managed encryption keys (CMEK). This CMEK feature provides for the generation, use, rotation, and destruction of encryption keys deployed to help protect data in other GCP services.
Another facility, the Cloud External Key Manager (Cloud EKM), provides for the creation and management of keys in a key manager located externally to GCP. The Cloud KMS platform may then use these external keys to protect data at rest.
Customer-managed encryption keys may be used with a Cloud EKM key.

GCP allows for the use of customer-supplied encryption keys (CSEK) for both:

Compute Engine &
Cloud Storage

Under this arrangement, data is decrypted and encrypted using a key that’s provided on an API call.
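As a minimal sketch of what "a key provided on an API call" can look like, the snippet below builds the request headers that Cloud Storage expects for a customer-supplied key: the algorithm name, the base64-encoded AES-256 key, and the base64-encoded SHA-256 hash of that key. The helper name `csek_headers` is our own, and Compute Engine passes CSEKs differently, so treat this as an illustration for Cloud Storage only.

```python
import base64
import hashlib
import os


def csek_headers(raw_key: bytes) -> dict:
    """Build Cloud Storage CSEK headers for a 32-byte AES-256 key."""
    if len(raw_key) != 32:
        raise ValueError("a CSEK must be exactly 32 bytes (AES-256)")
    key_hash = hashlib.sha256(raw_key).digest()
    return {
        "x-goog-encryption-algorithm": "AES256",
        "x-goog-encryption-key": base64.b64encode(raw_key).decode("ascii"),
        "x-goog-encryption-key-sha256": base64.b64encode(key_hash).decode("ascii"),
    }


# Generate a random key locally; the customer is responsible for storing it.
key = os.urandom(32)
headers = csek_headers(key)
print(sorted(headers))
```

Because the key travels with each request and is not stored by the service, losing the key means losing access to the data it protects.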

Cloud KMS provides the following functionality:

Customer control
Access control and monitoring
Regionalization limits & assignment
Durability (eleven 9’s)
Security

The table below provides a summary of the most important use case for each of these options.

KMS Application

Use Case / Deployment Configuration

KMS, ‘Key Management Service’, provides control over how data is encrypted at rest, and how encryption keys are managed.

*KMS supports two types of software-based encryption keys:

customer-managed encryption keys (CMEK)
customer-supplied encryption keys (CSEK)

Data Loss Prevention

GCP ‘Data Loss Prevention’ (Cloud DLP) is a fully managed service designed to facilitate the:

discovery
classification, &
protection

of an organization’s most sensitive data.

Cloud DLP provides for:

Inspection of structured or unstructured data, to facilitate transformation
Reduction in data risk through data de-identification via masking and tokenization

Cloud DLP supports over 120 built-in information types, covering both structured and unstructured data.

Cloud DLP provides native support for scanning and classifying sensitive data in:

Cloud Storage
Cloud BigQuery
Cloud Datastore & a
streaming content API

The streaming content API provides support for additional data sources, custom workloads, and applications.

Cloud DLP provides tools to

classify
mask
tokenize, &
transform

all sensitive covered data elements.
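To make the difference between masking and tokenization concrete, here is a small local sketch. It does not call Cloud DLP and does not reproduce its transformations: the function names, the email regular expression, and the `TOKEN_` prefix are all assumptions made for illustration.

```python
import hashlib
import hmac
import re

# Simplified email pattern, used only for this illustration.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def mask(value: str, keep_last: int = 4, mask_char: str = "*") -> str:
    """Masking: hide all but the last `keep_last` characters."""
    hidden = max(len(value) - keep_last, 0)
    return mask_char * hidden + value[hidden:]


def tokenize(value: str, secret: bytes) -> str:
    """Tokenization: replace a value with a keyed, repeatable surrogate."""
    digest = hmac.new(secret, value.encode("utf-8"), hashlib.sha256).hexdigest()
    return "TOKEN_" + digest[:16]


def deidentify_emails(text: str, secret: bytes) -> str:
    """Replace every email address found in `text` with its token."""
    return EMAIL_RE.sub(lambda m: tokenize(m.group(0), secret), text)


print(mask("4111111111111111"))
print(deidentify_emails("Contact alice@example.com today", b"demo-secret"))
```

Masking keeps part of the original value visible, while tokenization replaces it entirely but maps the same input to the same token, so records can still be joined.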

Cloud Data Loss Prevention (DLP) API

The Cloud Data Loss Prevention (DLP) API provides methods for detection of privacy-sensitive fragments in:

text
images, &
GCP storage repositories
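For text, a detection request to the DLP API is a JSON body naming the content and the information types to look for. The sketch below only builds that body and does not send it; we assume the v2 `content:inspect` method and the built-in `EMAIL_ADDRESS` and `PHONE_NUMBER` info types, and `PROJECT_ID` is a placeholder.

```python
import json

# Assumed REST endpoint; PROJECT_ID is a placeholder to be replaced.
ENDPOINT = "https://dlp.googleapis.com/v2/projects/PROJECT_ID/content:inspect"


def build_inspect_request(text, info_types):
    """Build the JSON body for inspecting a text fragment."""
    return {
        "item": {"value": text},
        "inspectConfig": {
            "infoTypes": [{"name": name} for name in info_types],
            "includeQuote": True,  # ask for the matched text in each finding
        },
    }


body = build_inspect_request(
    "Call 555-0100 or write to alice@example.com",
    ["EMAIL_ADDRESS", "PHONE_NUMBER"],
)
print(json.dumps(body, indent=2))
```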

The table below provides a summary of the most important use case for each of these options.

Data Loss Prevention (DLP) Application

Use Case / Deployment Configuration

Cloud Data Loss Prevention (DLP) provides for a reduction in data risk through data de-identification.

*Cloud DLP provides native support for scanning and classifying sensitive data in:

Cloud Storage
Cloud BigQuery
Cloud Datastore & a
streaming content API

**GCP Machine Learning, now part of** **AI Platform**

Cloud ML

GCP Cloud ML has been subsumed by the GCP AI Platform, which provides access to the Cloud ML Engine.

The GCP AI End-to-end platform, targeted at data science and machine learning, facilitates the streamlining of ML workflows constructed by developers, data scientists, and data engineers.
The advanced AutoML option provides point-and-click workflow construction. State of the art applications may be developed via additional GCP tools:

TPUs
TensorFlow

The starting point is to prepare and store datasets with BigQuery, then use the embedded Data Labeling Service to label project training data by applying:

classification
object detection, &
entity extraction

for:

images
videos
audio
text.

Model validation may be accomplished with AI Explanations. This application:

provides insight into the model’s outputs
verifies the model behavior
discovers bias in the model
provides insights into ways to improve the model and related training data

Another, more sophisticated application, AI Platform Vizier, may be deployed as a ‘black box’ optimization service. This service will:

help tune hyperparameters
optimize the model’s output
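'Black box' optimization means the optimizer only sees parameter values going in and a score coming out. The sketch below illustrates that idea with plain random search; it is not Vizier's algorithm or API, and the function names, the toy loss, and the search space are assumptions made for illustration.

```python
import random


def random_search(objective, space, trials=200, seed=0):
    """Minimize `objective` by sampling parameters uniformly from `space`."""
    rng = random.Random(seed)
    best_params, best_score = None, float("inf")
    for _ in range(trials):
        # Sample one candidate; the optimizer never looks inside the model.
        params = {name: rng.uniform(lo, hi) for name, (lo, hi) in space.items()}
        score = objective(params)
        if score < best_score:
            best_params, best_score = params, score
    return best_params, best_score


def toy_loss(p):
    """Stand-in for a training run; lowest at learning_rate=0.1, dropout=0.3."""
    return (p["learning_rate"] - 0.1) ** 2 + (p["dropout"] - 0.3) ** 2


space = {"learning_rate": (0.0, 1.0), "dropout": (0.0, 0.9)}
best, loss = random_search(toy_loss, space)
print(best, loss)
```

A managed service would typically replace the uniform sampling with a smarter search strategy, but the interface (suggest parameters, report a score) stays the same.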

Several tools are available for deployment optimization, including:

AI Platform Prediction
AutoML Vision Edge
TensorFlow Enterprise
MLOps

AI Platform Prediction manages the infrastructure needed to deploy and run your model, and makes it available for both online and batch prediction requests.

AutoML Vision Edge can be used to deploy models and trigger real-time actions based on local data.

TensorFlow Enterprise offers high-end support for a TensorFlow instance.

The MLOps application supports the development of models, experiments, and workflows.

AI Platform Pipelines can support all of the above applications by facilitating the deployment of models, experiments, and workflows.

GCP applications and services, including:

Explainable AI
Notebooks
Vizier
TensorFlow Enterprise, &
Pipelines (Marketplace)

may be used at no charge, but any GCP resources used to run them, including Compute and Storage, will incur a charge. GCP also provides integrated managed service packages, including:

Training
Prediction
Data Labeling Service
AutoML, &
BigQuery

The table below provides a summary of the most important use case/deployment configurations for each of these options.

Cloud ML / AI Application: Use Case / Deployment Configuration

Healthcare includes:

Healthcare API ML: Machine Learning, to Cloud Pub/Sub, to ML models, to Enterprise Viewer
Radiological Image Extraction: DICOM API, to Imaging Analytics, to BigQuery, Cloud ML, Dataproc, DataLab
ML on EHR via Healthcare API: Machine learning and analytics using Cloud Healthcare API on GCP

Cloud IoT Core includes:

Cloud to Edge ML: Cloud IoT Edge extends GCP data processing and machine learning to gateways, cameras, and other connected devices.

AI & ML includes:

Recommendation Engines: GCP Prediction API to train regression/classification models & generate real-time predictions, or Spark MLlib sourced custom machine learning algorithms, deployed to Cloud Dataproc
Chatbot with Dialogflow: Dialogflow is an end-to-end, create-once, deploy-anywhere development suite for creating conversational interfaces for websites, mobile apps, messaging platforms, & IoT devices
TensorFlow on GPU: TensorFlow training application on Graphics Processing Units (GPU) to accelerate the training process for deep learning models.
Low Latency ML Serving: Speeds availability of Machine Learning output
Feature Embeddings: An embedding is a translation of a high-dimensional vector into a low-dimensional space
Semantic Similarity: Explore similar articles via embeddings comparable to SQL queries.

Retail & eCommerce includes:

Fraud Detection: GCP Prediction API to train regression/classification models & generate real-time predictions, or Spark MLlib sourced custom machine learning algorithms, deployed to Cloud Dataproc
Recommendation Engines: Spark MLlib sourced custom machine learning algorithms, deployed to Cloud Dataproc
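The Feature Embeddings and Semantic Similarity entries above rest on one idea: once items are represented as low-dimensional vectors, "similar" can be measured as the angle between vectors. The sketch below shows cosine similarity on hand-made three-dimensional vectors; the vectors and article names are invented for illustration and do not come from any GCP service.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as similar to nothing
    return dot / (norm_a * norm_b)


def most_similar(query, catalog):
    """Return the catalog key whose embedding is closest to `query`."""
    return max(catalog, key=lambda name: cosine_similarity(query, catalog[name]))


# Hypothetical article embeddings.
articles = {
    "cloud-pricing": [0.9, 0.1, 0.0],
    "ml-training": [0.1, 0.9, 0.2],
    "network-setup": [0.0, 0.2, 0.9],
}
print(most_similar([0.2, 0.8, 0.1], articles))
```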

Natural Language API

The GCP Natural Language API has multiple methods for performing analysis and annotation on text. Each level of analysis provides valuable information for language understanding. These five methods are:

Sentiment analysis
Entity analysis
Entity sentiment analysis
Syntactic analysis
Content classification

An overview of each of these is provided below;

Sentiment analysis inspects the given text and identifies the prevailing emotional opinion within the text, specifically to determine a writer’s attitude as positive, negative, or neutral.
Entity analysis inspects the given text for known entities (proper nouns such as public figures and landmarks, as well as common nouns such as house or car). Entity analysis provides information about all of these entities.
Entity sentiment analysis inspects the given text for known entities (proper nouns and common nouns), returns information about those entities, and reveals the prevailing emotional opinion of the entity within the text. This analysis seeks to determine whether the writer’s attitude toward the entity is positive, negative, or neutral.
Syntactic analysis extracts linguistic information, and breaks up the given text into a series of sentences and tokens (generally, word boundaries). It also provides further analysis on those tokens.
Content classification analyzes text content and generates a content category for the content.

Each API call also detects and returns the language, in the event a language is not specified by the caller in the initial request.

The Natural Language API is a REST API, and consists of JSON requests and responses.
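As a hedged illustration of such a JSON request, the sketch below builds the body for a sentiment analysis call without sending it. We assume the v1 `documents:analyzeSentiment` method; the helper name is our own. Leaving out the `language` field corresponds to the automatic language detection described above.

```python
import json

# Assumed REST endpoint for sentiment analysis.
ENDPOINT = "https://language.googleapis.com/v1/documents:analyzeSentiment"


def build_sentiment_request(text, language=None):
    """Build the JSON body for a sentiment analysis request."""
    document = {"type": "PLAIN_TEXT", "content": text}
    if language:
        # Optional: when omitted, the API detects the language itself.
        document["language"] = language
    return {"document": document, "encodingType": "UTF8"}


body = build_sentiment_request("The new dashboard is excellent.")
print(json.dumps(body, indent=2))
```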

The table below provides a summary of the most important use case for each of these options.

Natural Language API Application

Use Case / Deployment Configuration

Natural Language API applies five different methods for performing analysis and annotation on text.

*Five different methods

Sentiment analysis
Entity analysis
Entity sentiment analysis
Syntactic analysis
Content classification

Cloud Speech API

GCP Cloud Speech API deploys 3 main methods to perform speech recognition:

Synchronous Recognition
Asynchronous Recognition
Streaming Recognition

Each of these three methods is described briefly below. A distinction is made with regard to support for different formats of the incoming payload, REST vs gRPC. gRPC is much more precise.

Synchronous Recognition (supports both REST and gRPC)

This method sends received audio data to the Speech-to-Text API, which performs recognition on that data. It then returns results after all audio has been processed. Synchronous Recognition requests cannot exceed 60 seconds of duration.

Asynchronous Recognition (supports both REST and gRPC)

Like the Synchronous Recognition method, this method sends audio data to the Speech-to-Text API. However, instead of waiting for the final result, it initiates a Long Running Operation. This approach supports periodic polling for recognition results. Also, durations up to 28,800 seconds (480 minutes) are supported.

Streaming Recognition (supports gRPC only)

Performs recognition on audio data provided within a gRPC bi-directional stream. This approach is designed primarily for real-time recognition purposes, e.g., capturing live audio from a microphone. Streaming recognition provides near real-time results while audio is being captured. This allows results to appear, for example, while a user is still speaking.
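The choice between the three methods follows from the limits quoted above. The helper below is a sketch of that decision using only those two figures (60 seconds and 28,800 seconds); the function name and return values are our own.

```python
SYNC_LIMIT_SECONDS = 60         # synchronous request limit quoted above
ASYNC_LIMIT_SECONDS = 28_800    # asynchronous request limit quoted above


def choose_recognition_method(duration_seconds: float, live: bool = False) -> str:
    """Pick a recognition method from the audio duration and whether it is live."""
    if live:
        # Live audio (e.g., a microphone) calls for streaming over gRPC.
        return "streaming"
    if duration_seconds <= SYNC_LIMIT_SECONDS:
        return "synchronous"
    if duration_seconds <= ASYNC_LIMIT_SECONDS:
        return "asynchronous"
    raise ValueError("audio is longer than the asynchronous limit")


for seconds in (30, 600):
    print(seconds, choose_recognition_method(seconds))
```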

Recognition requests contain:

configuration parameters, along with
audio data

We will now review the simplest method for performing recognition on speech audio data – Speech-to-Text API recognition.

Speech-to-Text API recognition
Speech-to-Text can process up to 60 seconds of speech audio data sent in a synchronous request. Once Speech-to-Text processes and recognizes all of the audio, it returns a response.

Speech-to-Text usually processes audio faster than real-time. For example, 30 seconds of audio input will usually be processed in less than 15 seconds. However, a recognition request can take much longer if the audio quality is poor.

Machine Learning Models
Speech-to-Text can use one of several machine learning models to transcribe your audio file.

Audio transcription requests sent to Speech-to-Text should include information about the original source of the audio file. Providing this information allows the Speech-to-Text API to process an audio file by selecting a machine learning model trained to recognize speech audio from that designated source type.

Speech-to-Text is designed to use the following types of machine learning models for transcribing audio files.

Video
Phone Call
ASR Command & Search
ASR Default

Note that ASR stands for ‘automatic speech recognition’.
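Putting the pieces together, a synchronous recognition request carries configuration parameters (including the model) and the audio data. The sketch below builds such a JSON body without sending it. We assume the v1 `speech:recognize` REST method and the model identifiers `video`, `phone_call`, `command_and_search`, and `default`; the helper and the mapping table are our own.

```python
import base64

# Assumed REST endpoint for synchronous recognition.
ENDPOINT = "https://speech.googleapis.com/v1/speech:recognize"

# Assumed mapping from the source types listed above to model identifiers.
MODELS = {
    "video": "video",
    "phone call": "phone_call",
    "command & search": "command_and_search",
    "default": "default",
}


def build_recognize_request(audio_bytes, source="default",
                            language_code="en-US", sample_rate_hertz=16000):
    """Build the JSON body: configuration parameters plus audio data."""
    return {
        "config": {
            "encoding": "LINEAR16",
            "sampleRateHertz": sample_rate_hertz,
            "languageCode": language_code,
            "model": MODELS[source],
        },
        # Over REST, inline audio is sent as base64 text.
        "audio": {"content": base64.b64encode(audio_bytes).decode("ascii")},
    }


body = build_recognize_request(b"\x00\x01" * 8, source="phone call")
print(body["config"])
```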

The table below provides a summary of the most important use case for each of these options.

Cloud Vision API

GCP Cloud Vision API provides two computer vision products that use machine learning to provide insight into images, with industry-leading prediction accuracy. These two products are:

Cloud Vision API
AutoML Vision Edge

GCP Cloud Vision API offers powerful pre-trained machine learning models through REST and RPC APIs. Labels are assigned to images in order to quickly classify them into millions of predefined categories. Tasks performed include;

Detection of objects and faces,
reading of printed and handwritten text, &
construction of metadata into an image catalog

This application also supports detection and classification of multiple objects including the location of each object within the image.

Vision API uses OCR to detect text within images in more than 50 languages and various file types. It’s also a subcomponent of Document Understanding AI, which lets you process millions of documents quickly and automate business workflows.

Vision API’s vision product search capability allows retailers to create a sophisticated mobile experience that enables customers to upload a photo of an item and immediately see a list of related items for purchase from the vendor.

Vision API can allow a vendor to review images using Safe Search, which provides real-time insight into the likelihood that any given image includes adult content, violence, or other objectionable content.
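Several of the tasks above (labels, text, Safe Search) can be requested for one image in a single call. The sketch below builds the JSON body for such a call without sending it; we assume the v1 `images:annotate` REST method and the feature type names shown, and the helper name is our own.

```python
import base64

# Assumed REST endpoint for image annotation.
ENDPOINT = "https://vision.googleapis.com/v1/images:annotate"


def build_annotate_request(image_bytes, feature_types, max_results=10):
    """Build the JSON body asking for several features on one image."""
    return {
        "requests": [
            {
                "image": {"content": base64.b64encode(image_bytes).decode("ascii")},
                "features": [
                    {"type": feature, "maxResults": max_results}
                    for feature in feature_types
                ],
            }
        ]
    }


body = build_annotate_request(
    b"fake-image-bytes",  # placeholder; real code would read an image file
    ["LABEL_DETECTION", "TEXT_DETECTION", "SAFE_SEARCH_DETECTION"],
)
print([f["type"] for f in body["requests"][0]["features"]])
```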

AutoML Vision Edge is used to build and deploy fast, high-accuracy models to classify images or detect objects at the edge. It also triggers real-time actions based on local data. AutoML Vision Edge supports a variety of edge devices where resources are constrained and latency is critical.

AutoML Vision vs Vision API: What they both can do

These two applications provide similar capabilities in these areas:

Use REST and RPC APIs.
Deploy low-latency, high accuracy models optimized for edge devices (Vision API (Integrate w/ ML Kit))
Detect objects, where they are, and how many.

The table below provides a summary of the most important use case for each of these options.

Cloud Vision API Application

Use Case / Deployment Configuration

Cloud Translate API

The GCP Cloud Translate API facilitates dynamic translation between languages using GCP’s pre-trained or custom machine learning models. We will investigate three different capability levels associated with the Cloud Translate API;

AutoML Translation
Translation API Basic
Translation API Advanced

AutoML Translation

Users can upload translated language pairs, and the AutoML Translation application will train a custom model that can be adapted to meet domain-specific needs. This built-in training capability allows developers and translators who are unfamiliar with machine learning solutions to construct high-quality, production-ready models.

AutoML Translation outputs custom model results in more than fifty language pairs.

Translation API Basic
This application automatically and immediately translates text into more than one hundred languages for websites and apps.

Translation API Advanced

This application offers the same instantaneous results provided with Basic, but provides additional customization features. Customization is especially relevant for domain- and context-specific terms or phrases.

Both versions of the Translation API pre-trained model support more than one hundred languages, from Armenian to Zulu.

Translation API provides an easy-to-use Google REST API that makes it unnecessary to extract text from the document. The source HTML can be submitted to the application, and the translated text is returned automatically.

In the event the source language is unknown, for instance in user-generated content that doesn’t include a language code, both translation products automatically identify languages with high accuracy.
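A minimal sketch of such a request for Translation API Basic is shown below. It only builds the JSON body and does not send it. We assume the v2 REST endpoint and its `q`, `target`, `source`, and `format` parameters; the helper name is our own. Omitting `source` relies on the automatic language identification described above.

```python
import json

# Assumed REST endpoint for Translation API Basic (v2).
ENDPOINT = "https://translation.googleapis.com/language/translate/v2"


def build_translate_request(html, target, source=None):
    """Build the JSON body for translating an HTML fragment."""
    body = {"q": html, "target": target, "format": "html"}
    if source:
        # Optional: when omitted, the source language is detected.
        body["source"] = source
    return body


body = build_translate_request("<p>Hello, <b>world</b>!</p>", target="es")
print(json.dumps(body))
```

Sending the markup as-is with `format` set to `html` is what allows the tags to be preserved around the translated text.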

The table below provides a summary of the most important use case for each of these options.

Cloud Translate API Application

Use Case / Deployment Configuration

For a continued review of the Google Cloud Platform, relating to ‘Solution Scenarios’, go here:

Google Cloud-1


Google Cloud Compute Services

Google Cloud Compute Services consists of four components. Each of these abstracts a different part of the solutions architecture, as follows:

Compute Engine

Kubernetes Engine

App Engine

Cloud Functions

Stackdriver is now Operations

Cloud Logging

Cloud Debugger

Cloud Monitoring

Cloud Load Balancing

Cloud CDN

Cloud DNS

Firewall Rules

Cloud Interconnect

Cloud VPN

Cloud Bigtable

Cloud Datastore

Cloud SQL

Cloud Storage

BigQuery

Cloud Dataflow

Cloud Data Loss Prevention (DLP) API
