Pivotal Cloud Foundry (PCF)

Pivotal Cloud Foundry (PCF) is a multi-cloud platform for the deployment, management, and continuous delivery of applications, containers, and functions. ... Major cloud platforms such as Amazon Web Services and Google Cloud also provide templates and quickstarts that automate large portions of the PCF deployment process.

Pivotal Cloud Foundry abstracts away the process of setting up and managing an application runtime environment so that developers can focus solely on their applications and associated data. Running a single command—cf push—will create a scalable environment for your application in seconds, which might otherwise take hours to spin up manually. PCF allows developers to deploy and deliver software quickly, without needing to manage the underlying infrastructure.
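
As a minimal sketch of that workflow (the API endpoint, application name, and flags here are illustrative; cf push also picks up defaults from a manifest.yml if one is present):

    # Log in to the platform's API endpoint (URL is an example)
    cf login -a https://api.sys.example.com
    # Push the app with two instances and 512 MB of memory each
    cf push my-app -i 2 -m 512M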

Cloud Foundry

Unlike most other Cloud Computing platform services, which are tied to particular cloud providers, Cloud Foundry is available as a stand-alone software package. You can, of course, deploy it on Amazon’s AWS, but you can also host it yourself on your own OpenStack server, or through HP’s Helion or VMware’s vSphere.

First of all though, to be completely clear, just what is a Cloud Computing platform? There are, broadly speaking, three major categories of Cloud Computing:

Infrastructure as a Service (IaaS), which provides only a base infrastructure, leaving the end-user responsible for platform and environment configuration necessary to deploy applications. Amazon Web Services and Microsoft Azure are prime examples of IaaS.

Software as a Service (SaaS), which delivers complete, ready-to-use applications to end users, such as Gmail or Salesforce.com.

Platform as a Service (PaaS), which helps to reduce the development overhead (environment configuration) by providing a ready-to-use platform. PaaS services can be hosted on top of the infrastructure provided by IaaS.

The three categories of cloud computing platforms: IaaS, PaaS, and SaaS


Since it’s easy to become confused when thinking about cloud platforms, it’s important to be able to visualize exactly which elements of the computing ecosystem are the provider’s responsibility and which are yours. While there is no precise definition, it’s reasonable to say that a platform requires only that you take care of your applications.

With that in mind, the platform layer should take care of everything else: provisioning runtime environments, routing traffic, scaling instances, and managing the backing services your applications need.

What is Cloud Foundry?

Cloud Foundry is an open-source cloud computing platform originally developed in-house at VMware. Its development is now led by Pivotal Software, a joint venture of VMware, EMC, and General Electric.

Cloud Foundry is optimized to deliver…

Not only can Cloud Foundry lighten developer workloads but, since Cloud Foundry handles so much of an application’s resource management, it can also greatly reduce the overhead burden on your operations team.

Cloud Foundry’s architectural structure includes components and a high-enough level of interoperability to permit…

Integration with development tools.

Although Cloud Foundry supports many languages and frameworks, including Java, Node.js, Go, PHP, Python, and Ruby, not every application is a good fit. As with all modern software applications, your project should follow the Twelve-Factor App methodology.


Key benefits of Cloud Foundry

Getting started with Pivotal Cloud Foundry

Pivotal Cloud Foundry creates two kinds of virtual machines that handle different aspects of an elastic runtime environment:

Component VMs create the underlying infrastructure for a deployment

Host VMs provide a generic runtime environment for applications


Component VMs allow PCF to be cloud-agnostic by providing a standardized infrastructure environment for staging and running applications on any supported cloud hosting service—including AWS, Google, Azure, OpenStack, and vSphere. The generic runtime provided by host VMs enables a single PCF deployment to host multiple applications regardless of language or dependencies. Once an application is pushed via cf push, PCF stages and packages it into a binary droplet that can be distributed and executed on a host VM.

In addition to abstraction, PCF provides significant scaling capabilities. Global-scale companies like Boeing, Wells Fargo, and Citi have adopted and used PCF to migrate and build applications in an elastic, cloud-based environment. A deployment can run any number of applications as long as there are sufficient underlying infrastructure resources. Hundreds of thousands of application instances can be spread across over a thousand host VMs. To increase availability and redundancy, PCF provides a single command to scale up the number of application instances and, if your environment is deployed across multiple availability zones, PCF will automatically spread instances across zones. Component VMs can likewise be scaled up and assigned to multiple zones to ensure high availability.
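
As a sketch of what that single command looks like in practice (the application name and quotas are illustrative):

    # Horizontal scaling: run five instances, spread across AZs automatically
    cf scale my-app -i 5
    # Vertical scaling: raise memory and disk quotas (triggers a restart)
    cf scale my-app -m 1G -k 2G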

Monitoring Pivotal Cloud Foundry

Much of the information in this guide is targeted toward PCF operators, which is the term Pivotal applies to people managing and monitoring a PCF deployment (as opposed to developers or end-users, who deploy or access applications running on the deployment). For operators, monitoring your deployment’s performance and capacity indicators is vital to ensuring that applications are running optimally. It’s also necessary to check that the deployment can scale to match demand, whether scaling horizontally in terms of the number of application instances or vertically in terms of the resources available within the VMs. We’ll cover the key metrics operators will want to collect and monitor in PCF.

Additionally, Pivotal Cloud Foundry offers several ways to monitor application performance. In parts three and four we will also look at how developers who are deploying to PCF can collect monitoring data from their applications.

Before diving into monitoring a PCF deployment, it’s important to understand its pieces and how they work. 


Key components of Pivotal Cloud Foundry architecture

PCF is a distributed system comprising many components that run, manage, and monitor the health of the deployment and its application runtime environment, which might be hosting dozens or hundreds of applications. The primary components we will cover in this guide, along with their subsystems, are BOSH and the Ops Manager, the User Account and Authentication (UAA) server, the Gorouter, the Cloud Controller, Diego, and the Loggregator.

BOSH and the Ops Manager

BOSH is a deployment manager that can automatically provision and deploy widely distributed, cloud-based software. Originally developed specifically for Cloud Foundry, BOSH can also be used outside of Cloud Foundry environments, for example, to deploy a ZooKeeper or Kubernetes cluster.

Essentially, BOSH is what allows PCF to be deployed in any cloud by providing an interface to build required infrastructure components on top of a given IaaS platform. BOSH handles the deployment of the underlying PCF infrastructure by launching and managing all required component VMs via the BOSH Director.

Through common plugins such as the Health Monitor, BOSH can track the health of its VMs and also self-heal if it detects that a VM has crashed or has otherwise become inaccessible. It will attempt to recreate the faulty VM automatically to avoid downtime.
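
Outside of the Ops Manager, operators can inspect what BOSH sees using the standalone BOSH CLI; for example (the environment alias my-env is a placeholder):

    # List all VMs that BOSH manages, with CPU, memory, and disk vitals
    bosh -e my-env vms --vitals
    # Show per-instance process health across a deployment
    bosh -e my-env instances --ps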

The BOSH Director reads YAML deployment manifests to determine what VMs, persistent disks, and other resources are required, as well as how many availability zones to use. For a given cloud provider, the BOSH Director relies on an IaaS-specific manifest, or cloud-config, that is applied on top of a baseline deployment config manifest for PCF. The cloud-config maps PCF resources to specific resources for a given cloud provider, for instance, mapping PCF availability zone z1 to AWS availability zone us-east-1a or Google Cloud zone us-central1-f.
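
Viewing or updating that mapping is a single BOSH CLI command. In PCF the Ops Manager normally generates the cloud-config for you, so the following is only a sketch of the underlying mechanics:

    # Show the cloud-config currently applied to the Director
    bosh -e my-env cloud-config
    # Apply an updated IaaS-specific mapping (AZs, VM types, networks)
    bosh -e my-env update-cloud-config cloud-config.yml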

The Director launches VMs built on stemcells, which include a base operating system, a BOSH agent for monitoring, and any required utilities and configuration files. Releases, which are layered on top of the stemcell, contain a versioned set of configurations, source code, binaries, scripts, and anything else that might be needed to run a specific software package on the VM.
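
In a plain BOSH workflow, stemcells and releases are uploaded to the Director before a deploy; the file and deployment names below are placeholders:

    # Upload the base OS image and the software release to the Director
    bosh -e my-env upload-stemcell bosh-stemcell-ubuntu-xenial.tgz
    bosh -e my-env upload-release my-release.tgz
    # Deploy using a manifest that references the stemcell and release
    bosh -e my-env deploy -d my-deployment manifest.yml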

Ops Manager

PCF wraps the BOSH API with the Ops Manager, which provides a web-based GUI to automate tasks and help administer the deployment. The Ops Manager generates manifest files from the options an operator has selected and sends them to the BOSH Director. The Ops Manager also provides various extensions and services for PCF that are packaged as tiles, which an operator imports and configures before deploying to the cluster. These services include monitoring solutions, several of which we will discuss in detail in part three of this series.

Within a PCF deployment, operators can also use the Ops Manager to install their desired application runtime environments (Linux or Windows) on host VMs. These runtimes ship as BOSH releases, and buildpacks can be installed alongside them to provide runtime support for applications and services.
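
Buildpacks can also be listed and administered directly with the cf CLI; the buildpack name, archive, and position below are examples:

    # List the buildpacks installed in the deployment, in detection order
    cf buildpacks
    # Install a new buildpack at position 10 in the detection sequence
    cf create-buildpack my_buildpack ./my-buildpack.zip 10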

User Account and Authentication

The User Account and Authentication (UAA) server is Cloud Foundry’s central identity and access management (IAM) component. PCF operators can use the UAA API to create and manage user accounts. UAA also acts as an OAuth2 server and can generate authentication tokens for client applications based on the access scope they’ve been granted.
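
Operators typically script UAA through its command-line client, uaac. A minimal sketch of creating a user, where the endpoint, client secret, and user details are all placeholders:

    # Target the deployment's UAA server and authenticate as an admin client
    uaac target https://uaa.sys.example.com
    uaac token client get admin -s ADMIN_CLIENT_SECRET
    # Create a user account
    uaac user add jdoe --emails jdoe@example.com -p S3cretPassw0rd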



                                                


A sample architecture of Pivotal Cloud Foundry

Gorouter

The Gorouter is PCF’s router (written in Go). It handles and routes incoming requests. These requests come either from operators or developers sending commands—such as cf push—that are then routed to the Cloud Controller API, or from end users accessing application instances running on the deployment. The Gorouter communicates with the Diego BBS (explained below) to keep track of which applications exist, how many instances there are, and where they are running to maintain a routing table and shuttle user traffic appropriately. It provides basic round-robin load balancing for sending traffic to available application instances.

The Gorouter only supports requests over HTTP/HTTPS. If you use TLS encryption and want it to terminate as close to the application instance as possible (as opposed to terminating at a load balancer), you may optionally enable and configure TCP routing. In this case, a separate TCP router handles routing for TCP traffic.
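
Assuming an operator has enabled TCP routing for the deployment, mapping a TCP route to an application looks roughly like this (the domain and application names are illustrative):

    # Create a shared domain backed by the TCP router group (operator task)
    cf create-shared-domain tcp.example.com --router-group default-tcp
    # Map a TCP route on a randomly assigned port to the app
    cf map-route my-app tcp.example.com --random-port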

Cloud Controller

Pivotal Cloud Foundry’s Cloud Controller (CC) provides API endpoints for operators and developers to interact with PCF and the applications running on a specific domain. Commands include staging applications, starting or stopping applications, collecting health information, and verifying that all desired applications are running. The CC is also responsible for communicating with the User Account and Authentication server to authenticate the user and ensure that they have the proper permissions to perform the requested task.
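
From a developer’s perspective, those Cloud Controller endpoints surface as everyday cf commands; for example (application name is a placeholder):

    # Show health and status for each instance of an application
    cf app my-app
    # Stop and start the application
    cf stop my-app
    cf start my-app
    # Review recent application events recorded by the Cloud Controller
    cf events my-app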

Developers can stage applications that are buildpack-based or that are packaged as Docker images. Buildpack-based applications require one or more buildpacks to provide their dependencies. For example, the Python buildpack is needed to stage a Django application. Docker images don’t require buildpacks because the image contains everything required to run the application. The staging process differs somewhat between these two types of applications. We will cover both options below, starting with a quick preview of the two push styles.
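
As a sketch (the application, buildpack, and image names here are illustrative), the two styles differ only in how dependencies are supplied:

    # Buildpack-based: explicitly pin the Python buildpack
    cf push my-app -b python_buildpack
    # Docker-based: the image already contains all dependencies
    cf push my-app --docker-image myorg/my-image:latest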

Buildpack applications

When a developer pushes an application for staging, the CC sends needed files and instructions to PCF’s container orchestration and management component, called Diego. For staging and running applications, the CC requires a MySQL database and a blob store.

The CC creates a record and stores application metadata in the database, including the application name; applicable orgs, spaces, services, and user roles; the number of instances to spin up; memory and disk quotas; etc. It will also create and bind a route to the new application.

The CC packages and stores required binary files—generated at various points in the staging process—in the blob store. These binaries consist of five types: application packages (source code, resource files, etc.), buildpacks (e.g., application dependencies), a resource cache (larger files from the application package), a buildpack cache (larger files created during application staging), and droplets (the fully staged and ready-to-run application).

The CC sends instructions to Diego to distribute and execute the staging tasks, which take everything the application needs and package it into a droplet that can be run on a Cloud Foundry container. When complete, Diego sends the staged application droplet for storage in the CC’s blob store and notifies the CC that the application is ready. Finally, after staging, the Cloud Controller signals Diego to start the application and continues to communicate with Diego for updates on the application’s status.

Once an application is running, the CC makes it possible to bind services to the application. Services provision reserved resources for an application on demand. They can provide a wide range of types of resources. A few examples might be a web application account, a set of environment variables, or a dedicated Redis cluster.
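
Binding a service is a short sequence with the cf CLI. The service, plan, and instance names below are examples; what is actually available depends on your marketplace:

    # Browse available services and plans
    cf marketplace
    # Provision a service instance and bind it to the application
    cf create-service p.redis on-demand-cache my-redis
    cf bind-service my-app my-redis
    # Restage so the app picks up the injected credentials
    cf restage my-app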

Docker image applications

The main difference in how the CC stages Docker image-based applications is that buildpacks are not applied, because the image itself provides the dependencies. The CC sends the image to Diego for staging, receives and stores required metadata about the image from Diego, and then instructs Diego to schedule processes to run the application.
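
Note that pushing Docker images requires the diego_docker feature flag to be enabled by an administrator; a sketch (the image name is an example):

    # One-time admin step to allow Docker-image applications
    cf enable-feature-flag diego_docker
    # Push directly from a registry image
    cf push my-app --docker-image myorg/my-image:latest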

Diego

Diego is the container orchestration system for Cloud Foundry deployments, having replaced the previous DEA (Droplet Execution Agent). Diego handles the creation and management of the containers that stage and run applications. PCF operators can use the Ops Manager to choose which runtime backend they want to use—the Guardian backend for Linux or Garden Windows for Windows (or both).


To accomplish its orchestration tasks, Diego relies on three main components: the Diego Brain, the Diego cells, and the Bulletin Board System (BBS).

Before discussing these Diego components, it’s important to understand tasks and Long-Running Processes (LRPs), as these concepts are fundamental to how Cloud Foundry runs applications.

Tasks and LRPs

Pivotal Cloud Foundry translates incoming application-specific requests and processes into generic, abstracted tasks or Long-Running Processes (LRPs). This abstraction lets PCF forward all requests and processes to Diego so that it can schedule and assign them to host VMs known as cells for execution.

Tasks are one-off processes—that is, tasks are inherently terminating and provide a success/failure response. A task could be a database migration, or an initial service setup (for example, staging tasks for getting an application up and running). Containers running tasks are destroyed once the task is complete.
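
Developers can launch such one-off tasks against a running application with cf run-task; the command and task name here are illustrative:

    # Run a one-off database migration inside a fresh task container
    cf run-task my-app "bin/rails db:migrate" --name migrate
    # Check the status of recent tasks for the app
    cf tasks my-app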

LRPs are one or more application instances or other processes that are generally continuous, scalable, and always available. PCF attempts to ensure high availability and resilience by maintaining multiple running instances of the same LRP across availability zones.

Diego Brain

The main purpose of the Diego Brain is to schedule and assign incoming requests to the cells for execution. The primary component responsible for this is the Auctioneer. The Auctioneer receives work from the Cloud Controller via the BBS (discussed below). It then communicates with a cell via the cell’s Rep (the point of contact between the cell and the rest of the deployment) to auction off this work after the Stager translates it into Tasks and LRPs. 

This process lets PCF balance load and maintain high availability as much as possible. The Auctioneer attempts to assign tasks and LRPs in batches. Its auction algorithm operates with the following descending priorities: ensure that all LRPs have at least one instance running at all times; assign all tasks/LRPs in the current batch; distribute process load across running instances and availability zones to ensure availability.

The Diego Brain’s other components carry out additional functions, including maintaining correct LRP counts, storing and handling resources required for application staging, or providing SSH access to application containers.
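
That SSH access is exposed to developers through the cf CLI, assuming the operator has left it enabled (application name is a placeholder):

    # Open an interactive shell inside instance 0 of the application
    cf ssh my-app -i 0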

Loggregator

Loggregator is a system that aggregates and streams logs and (despite its name) metrics from PCF infrastructure components as well as instrumented applications running on the deployment. Logs and metrics in PCF come in a few different flavors. Loggregator collects platform metrics from PCF components, system-level resource metrics from the VMs, and application logs. Loggregator will also collect available metrics from installed services. That is, some PCF add-on products (for example, Redis) publish their own metrics, which are then forwarded to Loggregator and included in a data stream called the Firehose.

Loggregator translates, or “marshals,” messages into envelopes based on the log or metric type. This process uses the dropsonde-protocol, which abstracts and standardizes metrics and event metadata to be processed downstream. For application logs and metrics coming from Diego cells, this process is handled by the Diego Executor, which reads off of the containers’ stdout and stderr outputs. Other PCF components forward metrics via the StatsD protocol to a StatsD-injector that translates them into dropsonde. These translated logs and metrics are then sent to processes that run on the VMs called Metron Agents, which forward them using gRPC to Doppler servers.

Dopplers separate and package envelopes coming in from the Metrons into protocol buffers, sometimes called sinks, based on the envelope type. As of version 2 of the Loggregator API, there are five types of envelopes: log, counter, gauge, timer, or event. (See the Loggregator API documentation for more information.)

Dopplers hold messages temporarily before sending them on to one of two destinations. By default, they go to a Traffic Controller, which aggregates everything from all availability zones before shuttling it on in a stream that is accessible either via the CF logging API (accessed by cf logs) or via the Firehose (covered below).
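
The CF logging API stream is what developers see when they tail logs (the application name is an example):

    # Stream logs for an application in real time
    cf logs my-app
    # Dump the most recent logs held in the Loggregator buffer
    cf logs my-app --recent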

The Dopplers can also send logs to a Reverse Log Proxy (using gRPC), which is colocated on the Traffic Controller VM and forwards them to a Syslog Adapter that transforms the messages into standard Syslog format. They then can be accessed by one or more third-party Syslog drains that users can bind to an application. Binding a drain is accomplished by deploying a Syslog drain release.
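
From the developer side, binding a drain usually takes the form of a user-provided service pointing at the syslog endpoint (the drain URL below is a placeholder):

    # Create a user-provided service that forwards logs to an external drain
    cf create-user-provided-service my-drain -l syslog://logs.example.com:514
    # Bind the drain to the application so its logs are forwarded
    cf bind-service my-app my-drain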

By default, Dopplers are unable to store logs. If logs coming from a specific buffer cannot be consumed as quickly as they are being produced or forwarded, the information will simply be wiped to make room for the next batch of logs. This means that scaling each component properly is important. Having an appropriate number of Metrons, Dopplers, and Traffic Controllers will help ensure no logs are dropped on the floor.


