Friday, October 28, 2016

CISO Mindmap - Business Enablement

While doing some research on the CISO function, I noticed a very good mind map created by Rafeeq Rehman.

While his work takes the form of a mind map, I will try to deconstruct it to elaborate on the various functions performed by a CISO.

Let's begin:
  1. Business Enablement
  2. Security Operations
  3. Selling Infosec (internally)
  4. Compliance and Audit
  5. Security Architecture
  6. Project Delivery lifecycle
  7. Risk Management
  8. Governance
  9. Identity Management
  10. Budget
  11. HR and Legal 
So why did I number them, and why in this order?

I believe Business Enablement is the most important function of a CISO. If (s)he doesn't know the business in which (s)he operates, it will be very difficult to carry out the duties of a CISO. Consider a person coming from a technology background with no knowledge of the retail business. If that person is hired as a CISO simply because (s)he knows the technology, that may not be a good deal. To become a successful CISO, one must know the business (s)he is involved in; to shape the security function, (s)he must understand the business climate.

If this retail business has a requirement to store credit card information in its systems, the CISO's job is to make sure appropriate PCI DSS controls are in place so the data doesn't get into the wrong hands, while at the same time making sure that PCI DSS doesn't get in the way of enabling the business to accept credit card transactions. Yes, security is a requirement, but not at the cost of not doing business.

That's why I rate business enablement as a very important function of a CISO.

What are some of the ways a CISO can enable the business to adopt technology and still not get in its way?
  • Cloud Computing
  • Mobile technologies
  • Internet of things
  • Artificial Intelligence
  • Data Analytics
  • Crypto currencies / Blockchain
  • Mergers and Acquisitions

We will review each of these items in detail in the following blog posts.

Friday, July 1, 2016

CIS: Center for Internet Security

Center for Internet Security: "The Center for Internet Security mobilizes a broad community of stakeholders to contribute their knowledge, experience and expertise to identify, validate, promote and sustain the adoption of cybersecurity best practices."

Two resources of interest:

  • Secure Configuration Guides (aka "Benchmarks")
  • "Top 20" Critical Security Controls (CSC)
Benchmarks vs. Critical Security Controls:
  • Benchmarks are technology-specific checklists that provide prescriptive guidance for secure configuration
  • CSCs are security program level activities:
    • Inventory your items
    • Securely configure them
    • Patch them
    • Reduce privileges
    • Train the humans
    • Monitor the access

CIS Benchmarks: 
  • 140 benchmarks available here
  • AWS CIS Foundations Benchmark here (see the sketch below)
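
To make the AWS CIS Foundations Benchmark a bit more concrete, below is a minimal sketch (my own illustration, not an official CIS script) of checking two of its well-known controls with boto3: whether the root account has MFA enabled and whether a multi-region CloudTrail trail exists. The control wording and pass criteria here are my assumptions from reading the benchmark; treat it as a starting point, not an authoritative audit.

    # Minimal sketch of two CIS AWS Foundations-style checks using boto3.
    # Assumes AWS credentials are already configured (env vars, ~/.aws/credentials, or an IAM role).
    import boto3

    def root_account_mfa_enabled() -> bool:
        """Roughly: 'ensure MFA is enabled for the root account'."""
        iam = boto3.client("iam")
        summary = iam.get_account_summary()["SummaryMap"]
        return summary.get("AccountMFAEnabled", 0) == 1

    def multi_region_cloudtrail_enabled() -> bool:
        """Roughly: 'ensure CloudTrail is enabled in all regions'."""
        cloudtrail = boto3.client("cloudtrail")
        trails = cloudtrail.describe_trails(includeShadowTrails=False)["trailList"]
        return any(trail.get("IsMultiRegionTrail") for trail in trails)

    if __name__ == "__main__":
        print("Root account MFA enabled:", root_account_mfa_enabled())
        print("Multi-region CloudTrail: ", multi_region_cloudtrail_enabled())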

Saturday, April 23, 2016

TPM (Trusted Platform Module)

TPM, or Trusted Platform Module, as defined by the TCG (Trusted Computing Group), is a microcontroller used in laptops and now also in servers to ensure the integrity of the platform. A TPM can securely store artifacts used to authenticate the platform; these artifacts can include passwords, certificates, or encryption keys. A TPM can also be used to store platform measurements that help ensure that the platform remains trustworthy. Authentication (ensuring that the platform can prove that it is what it claims to be) and attestation (a process helping to prove that a platform is trustworthy and has not been breached) are necessary steps to ensure safer computing in all environments.


source: http://www.trustedcomputinggroup.org
The above image depicts the overall function of the TPM module. A standard use case I have seen is ensuring a secure boot process for servers. Secure boot validates the code run at each step in the process and stops the boot if the code is incorrect. The first step is to measure each piece of code before it is run. In this context, a measurement is effectively a SHA-1 hash of the code, taken before it is executed. The hash is stored in a platform configuration register (PCR) in the TPM.
Note: TPM 1.2 only supports the SHA-1 algorithm.
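
A PCR is never written directly; it is "extended": the new PCR value is the SHA-1 hash of the old value concatenated with the digest of whatever was just measured, so a single register ends up encoding the entire ordered chain of measurements. Here is a purely illustrative sketch of that extend operation (it does not talk to a real TPM):

    # Illustrative sketch of the TPM 1.2 PCR "extend" operation (no real TPM I/O).
    import hashlib

    PCR_SIZE = 20  # SHA-1 digest length in bytes

    def extend(pcr_value: bytes, measured_code: bytes) -> bytes:
        """new_PCR = SHA-1(old_PCR || SHA-1(measured code))"""
        code_digest = hashlib.sha1(measured_code).digest()
        return hashlib.sha1(pcr_value + code_digest).digest()

    # PCRs start out as all zeros at power-on.
    pcr = bytes(PCR_SIZE)
    for stage in [b"CRTM + BIOS", b"option ROMs", b"boot loader (IPL)"]:
        pcr = extend(pcr, stage)
        print(stage.decode(), "->", pcr.hex())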

Each TPM has at least 24 PCRs. The TCG Generic Server Specification, v1.0, March 2005, defines the PCR assignments for boot-time integrity measurements. The table below shows a typical PCR configuration. The context indicates if the values are determined based on the node hardware (firmware) or the software provisioned onto the node. Some values are influenced by firmware versions, disk sizes, and other low-level information.

Therefore, it is important to have good practices in place around configuration management to ensure that each system deployed is configured exactly as desired.

Register          | What is measured                                                            | Context
PCR-00            | Core Root of Trust Measurement (CRTM), BIOS code, host platform extensions | Hardware
PCR-01            | Host platform configuration                                                 | Hardware
PCR-02            | Option ROM code                                                             | Hardware
PCR-03            | Option ROM configuration and data                                           | Hardware
PCR-04            | Initial Program Loader (IPL) code, e.g. the master boot record              | Software
PCR-05            | IPL code configuration and data                                             | Software
PCR-06            | State transition and wake events                                            | Software
PCR-07            | Host platform manufacturer control                                          | Software
PCR-08            | Platform specific, often kernel, kernel extensions, and drivers             | Software
PCR-09            | Platform specific, often initramfs                                          | Software
PCR-10 to PCR-23  | Platform specific                                                           | Software

So there are very good use cases for TPM in ensuring secure boot and hardware integrity. Who is using TPM? Many institutions that run their own private clouds have been seen using TPM chipsets on their servers, while many public clouds do not support TPM. Why? That remains a mystery!


Monday, April 11, 2016

Hadoop Stack

In this post, I am exploring the Hadoop stack and its ecosystem.

Hadoop:


Oozie:

Oozie is a server-based workflow engine specialized in running workflow jobs with actions. It is typically used for managing Apache Hadoop MapReduce and Pig jobs. In Oozie, there are workflow jobs and coordinator jobs. Workflow jobs are Directed Acyclic Graphs (DAGs) of actions, while coordinator jobs are recurrent Oozie workflow jobs triggered by time (frequency) and data availability.

Due to Oozie's integration with the rest of the Hadoop stack, it supports several types of Hadoop jobs out of the box.

From a product point of view, it's a Java web application that runs in a Java servlet container. In Oozie, a workflow is a collection of actions (Hadoop MapReduce jobs, Pig jobs) arranged in a control-dependency DAG (Directed Acyclic Graph). Here, a control dependency from one action to another means that the second action can't run until the first action has completed.

These workflow definitions are written in hPDL (an XML Process Definition Language). Oozie workflow actions start their jobs in remote systems (like Pig, Hadoop, etc.). Once a job completes, the remote system calls back Oozie to notify it of the action's completion, and Oozie then proceeds to the next action in the workflow.
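
Oozie drives all of this from the hPDL XML definition, but the underlying idea - an action runs only after every action it depends on has completed - is easy to see in a few lines of Python. The workflow below is hypothetical (the job names are made up), and this is a conceptual sketch, not Oozie's actual engine or API:

    # Conceptual sketch: running a DAG of workflow actions in dependency order.
    from graphlib import TopologicalSorter  # standard library, Python 3.9+

    # action -> set of actions it depends on (hypothetical job names)
    workflow = {
        "ingest":      set(),
        "mapreduce-1": {"ingest"},
        "pig-clean":   {"ingest"},
        "mapreduce-2": {"mapreduce-1", "pig-clean"},
        "export":      {"mapreduce-2"},
    }

    def run(action: str) -> None:
        # In Oozie this would submit a Hadoop/Pig job and wait for its callback.
        print("running", action)

    for action in TopologicalSorter(workflow).static_order():
        run(action)  # each action starts only after all of its dependencies have finished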





credit: https://oozie.apache.org/docs/4.2.0/DG_Overview.html


From Stack Overflow: DAG (Directed Acyclic Graph)

Graph = structure consisting of nodes, that are connected to each other with edges.
Directed = The connections between nodes (edges) have a direction: A -> B is not the same as B -> A.
Acyclic = "non-circular" = moving from node to node by following the edges, you will never encounter the same node for the second time.

A good example of a directed acyclic graph is a tree. Note, however, not all directed acyclic graphs are trees :)



Monday, March 14, 2016

Bare Metal - A dreary (but essential) part of Cloud

Recently I got a chance to attend the Open Compute Summit 2016 in San Jose, CA. It was full of industry peers from web-scale companies such as Facebook, Google, and Microsoft, along with many financial institutions like Goldman Sachs, Bloomberg, and Fidelity. The overall theme of the summit was to embrace openness in hardware and commodity hardware.
From a historical point of view, OCP is a project initiated by Facebook a few years ago, where they opened up many of their hardware components - motherboard, power supply, chassis, rack, and later the switch - because they needed things at scale, and doing it with branded servers (pre-packaged for the enterprise by HP, Dell, IBM) wasn't going to cut it for them; thus they designed their own gear. More details here.
Below is one of the OCP-certified servers (courtesy: http://www.wiwynn.com). It is very minimalistic, a stripped-down version of a typical rack-mount server.
Coming back to this year's summit: considering this was my first year at the OCP Summit, I had certain expectations, and having been there I can say one thing for sure - "bare metal does look interesting again". Why do I say that? If it were only about bare metal, it would certainly be boring, but when you combine bare metal with APIs, particularly if you are operating at scale (it doesn't have to be Facebook scale), it's fun. Let's take a look.
The keynote started with Facebook's Jason Taylor recapping the journey over the last year or so and where the community stands now. But the fun began when (another Jason) Jason Waxman from Intel talked about Intel's involvement, how the server and storage (think NVMe) industry is growing, and what they see coming in the future - including Xeon D and Yosemite.

A good talk was given by Peter Winzer of Bell Labs. I knew UNIX and C were born at Bell Labs, but it was fascinating to hear about the history and future of Bell Labs, with innovations going on in fiber optics and fiber capacity - 100G is a no-brainer and 1 Tbps is on the horizon.

Microsoft Azure's CTO Mark Russinovich started by discussing how open Microsoft is - to be honest, other than the .NET framework being open, I had no idea that they have been contributing back to the open source community - well, it's a good thing! In the past, Microsoft has contributed their server design specs - Open Cloud Server (OCS) and the Switch Abstraction Interface (SAI). OCS is the same server and data center design that powers their Azure hyper-scale cloud (~1M servers). SAI and its APIs help network infrastructure providers integrate software with hardware platforms that are continually and rapidly evolving at cloud speed and scale. This year, they have been working on a network switch and proposed a new innovation for OCP inclusion called Software for Open Networking in the Cloud (SONiC). More details here.

There were many interesting technologies showcased in the expo, but the one that struck me was a storage archival solution. The basic configuration can hold 26,112 disks (7.8 PB), and with expandable modules spanning a pair of datacenter rows the total capacity goes up to 181 petabytes (HUGE!!). Is AWS Glacier running this underneath? Some details here.
For a coder at heart, it was good to see demonstrations by companies such as Microsoft and Intel showing some love for OpenBMC to manage bare metal. Firmware updates seem to be a common pain across the industry, but the innovative approach taken by Intel and Microsoft using Capsule - which brings an API and an envelope via UEFI - tries to make them easier than it seems.
Overall, it was good exposure to a newer generation of hardware technologies, and by accepting contributions from multiple companies, OCP is moving towards standardization of hardware. With standardization and API integration, it will be fun to play with bare metal.
Do you still think Bare Metal is dreary?

This article originally appeared on LinkedIn under the title Bare Metal - A dreary (but essential) part of Cloud

Monday, January 4, 2016

Log Management

What are available options for Log Management?


There are logs everywhere - systems, applications, users, devices, thermostats, refrigerators, microwaves - you name it - and as your deployment grows, your complexity increases. When you need to analyze a situation or an outage, logs are your lifesaver.
There are tons of tools available - open source, pay-per-use, and a few others. Let's take a look at some of them here:



What are the different tools/frameworks available to store and analyze these logs - in real time if possible, or otherwise for after-the-fact analysis?



Splunk:



Splunk is powerful log analysis software with the choice of running in your enterprise data center or in the cloud.

1. Splunk Enterprise: Search, monitor and analyze any machine data for powerful new insights.

2. Splunk Cloud: This provides Splunk Enterprise and all its features as SaaS in the cloud.


3. Splunk Light: A miniature version of Splunk Enterprise - log search and analysis for small IT environments.


4. Hunk: Hunk provides the power to rapidly detect patterns and find anomalies across petabytes of raw data in Hadoop without the need to move or replicate data.



Apache Flume: 

Flume is a distributed, reliable, and available service for efficiently collecting, aggregating, and moving large amounts of log data. It has a simple and flexible architecture based on streaming data flows. It is robust and fault tolerant with tunable reliability mechanisms and many failover and recovery mechanisms. It uses a simple extensible data model that allows for online analytic application. 


Flume deploys as one or more agents, each contained within its own instance of the JVM (Java Virtual Machine). An agent has three components: sources, sinks, and channels, and it must have at least one of each in order to run. Sources collect incoming data as events, sinks write events out, and channels provide a queue connecting the sources and sinks. Flume allows Hadoop users to ingest high-volume streaming data directly into HDFS for storage.
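
To make the source/channel/sink model concrete, here is a toy sketch in Python. This is not Flume code (a real agent is configured in a properties file and runs in the JVM); it only illustrates how the channel acts as a queue buffering events between a source that produces them and a sink that writes them out:

    # Toy illustration of Flume's source -> channel -> sink model (not actual Flume code).
    from queue import Queue

    channel = Queue(maxsize=1000)  # the channel buffers events between source and sink

    def source(lines):
        """Source: turns incoming data (here, plain strings) into events on the channel."""
        for line in lines:
            channel.put({"headers": {}, "body": line})

    def sink():
        """Sink: drains events from the channel and writes them out (here stdout; in Flume, e.g. HDFS)."""
        while not channel.empty():
            event = channel.get()
            print("writing event:", event["body"])

    source(["GET /index.html 200", "POST /login 401"])
    sink()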



credit: flume.apache.org





Apache Kafka:


Apache Kafka is publish-subscribe messaging rethought as a distributed commit log. Kafka is fast, scalable, durable, and distributed by design. It started as a LinkedIn project, was later open-sourced, and is now a top-level Apache project. Many companies have deployed Kafka in their infrastructure.

Kafka is a distributed, partitioned, replicated commit log service. It provides the functionality of a messaging system, but with a unique design.
  • Kafka maintains feeds of messages in categories called topics.
  • We'll call processes that publish messages to a Kafka topic producers.
  • We'll call processes that subscribe to topics and process the feed of published messages consumers.
  • Kafka is run as a cluster comprised of one or more servers each of which is called a broker.
So, at a high level, producers send messages over the network to the Kafka cluster which in turn serves them up to consumers like this:

credit: kafka.apache.org
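
As a quick illustration of those terms, here is a minimal producer/consumer pair using the third-party kafka-python client. The broker address, topic name, and consumer group below are placeholders, and the exact options can differ between client versions, so treat this as a sketch rather than a reference:

    # Minimal producer/consumer sketch using kafka-python (pip install kafka-python).
    # Broker address, topic, and group id are placeholders.
    from kafka import KafkaProducer, KafkaConsumer

    BROKERS = ["localhost:9092"]
    TOPIC = "app-logs"

    # Producer: publish a couple of log lines to the topic.
    producer = KafkaProducer(bootstrap_servers=BROKERS)
    producer.send(TOPIC, b"2016-01-04 12:00:01 INFO app started")
    producer.send(TOPIC, b"2016-01-04 12:00:05 ERROR something broke")
    producer.flush()

    # Consumer: subscribe to the topic and read the published messages.
    consumer = KafkaConsumer(
        TOPIC,
        bootstrap_servers=BROKERS,
        group_id="log-readers",
        auto_offset_reset="earliest",
        consumer_timeout_ms=5000,  # stop iterating if no new messages arrive
    )
    for message in consumer:
        print(message.topic, message.partition, message.offset, message.value)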

Kafka has a good ecosystem surrounding the main product. With a wide range of choices to select from, it might be a good "free" foundation for a log management tool. For large deployments, Kafka can act as a broker with multiple publishers - perhaps syslog-ng (with an agent running on each system) or Fluentd (again, with fluentd agents running on the nodes and a Kafka plugin) - to handle log collection. With the log4j appender, applications that use the log4j framework can publish to it seamlessly. Once you have logs ingested via these subsystems, searching them can be cumbersome. With Kafka, one alternative is to dump the data into HDFS and run a Hive query against it, and voila, you get your analysis.

Still, there is some work to be done in terms of how easily someone can retrieve the data, for example via a Kibana dashboard.

ELK:


When we are talking about logs, how can we not mention the ELK stack (Elasticsearch, Logstash, Kibana)? When I was introduced to the ELK stack, it was presented as an open-source Splunk alternative. I agree, it does have the feature set to compete against the core Splunk product, and if right-sizing (think: small, medium) is involved, we may not need Splunk at all - the ELK stack might be good enough. Though in recent usage, we have found some scalability issues once we reach a few hundred gigs of logs per day.


One thing I do like about the ELK stack is that it's all-in-one: I have my log aggregator, search indexer, and dashboard within one suite of applications.
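
For example, once Logstash (or any shipper) has put events into Elasticsearch, they can be queried programmatically as well as through Kibana. Below is a small sketch using the official Python elasticsearch client; the index and field names are made up, and the client API differs slightly between versions, so take it as an assumption-laden example:

    # Sketch: index and search a log event in Elasticsearch (pip install elasticsearch).
    # Index/field names are illustrative; API details vary by client version.
    from elasticsearch import Elasticsearch

    es = Elasticsearch(["http://localhost:9200"])

    # Index one log event (normally Logstash or a shipper would do this).
    es.index(
        index="app-logs-2016.01.04",
        body={
            "@timestamp": "2016-01-04T12:00:05Z",
            "level": "ERROR",
            "message": "something broke",
            "host": "web-01",
        },
        refresh=True,  # make it searchable immediately, for demo purposes only
    )

    # Search for ERROR-level events, newest first.
    result = es.search(
        index="app-logs-*",
        body={
            "query": {"match": {"level": "ERROR"}},
            "sort": [{"@timestamp": {"order": "desc"}}],
        },
    )
    for hit in result["hits"]["hits"]:
        print(hit["_source"]["@timestamp"], hit["_source"]["message"])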



With so many choices, it becomes difficult to pick one over the other. If you have enough money to spend, Splunk might be the right choice, but if you can throw a developer at it, either the ELK stack or Kafka - depending on the scale at which you are growing - might be the better option.






Saturday, August 15, 2015

Amazon Web Services (AWS) Risk and Compliance

This is a summary of AWS’s Risk and Compliance White Paper

AWS publishes a SOC 1 report - formerly known as a Statement on Auditing Standards (SAS) 70, Service Organization report - based on a widely recognized auditing standard developed by the AICPA (American Institute of Certified Public Accountants).

The SOC 1 audit is an in-depth audit of the design and operating effectiveness of AWS's defined control objectives and control activities.

Type II means that each of the controls described in the report is not only evaluated for adequacy of design, but is also tested for operating effectiveness by the external auditor.

With its ISO 27001 certification, AWS complies with a broad, comprehensive security standard and follows best practices in maintaining a secure environment.

With the PCI Data Security Standard (PCI DSS), AWS complies with a set of controls important to companies that handle credit card information.

Through its compliance with FISMA standards, AWS meets a wide range of specific control requirements set by US government agencies.

Risk Management:
AWS management has developed a strategic business plan which includes risk identification and the implementation of controls to mitigate and manage risks. Based on my understanding, AWS management re-evaluates those plans at least twice a year.

Also, the AWS compliance team has adopted various information security and compliance frameworks, including but not limited to COBIT, ISO 27001/27002, the AICPA Trust Services Principles, NIST 800-53, and PCI DSS v3.1.

Additionally, AWS regularly scans all of its Internet-facing services for possible vulnerabilities and notifies the parties involved for remediation. External penetration tests (vulnerability assessments) are also performed by reputable independent companies, and reports are shared with AWS management.

Reports/Certifications:

FedRAMP: AWS is a Federal Risk and Authorization Management Program (FedRAMP) compliant Cloud Service Provider.

FIPS 140-2: The Federal Information Processing Standard (FIPS) Publication 140-2 is a US government security standard that specifies the security requirements for cryptographic modules protecting sensitive information. AWS is operating their GovCloud (US) with FIPS 140-2 validated hardware. 

FISMA and DIACAP:
To allow US government agencies to comply with FISMA (Federal Information Security Management Act), AWS infrastructure has been evaluated by independent assessors for a variety of government systems as part of their system owner’s approval process.
Many agencies have successfully achieved security authorization for systems hosted in AWS in accordance with Risk Management Framework (RMF) process defined in NIST 800-37 and DoD Information Assurance Certification and Accreditation Process (DIACAP).

HIPAA:
By leveraging the secure AWS environment to process, maintain, and store protected health information, entities that need to comply with the US Health Insurance Portability and Accountability Act (HIPAA) can work in the AWS cloud.

ISO 9001:
AWS has achieved ISO 9001 certification to directly support customers who develop, migrate, and operate their quality-controlled IT systems in the AWS cloud. This allows customers to use AWS's compliance report as evidence for their own ISO 9001 programs and industry-specific quality programs such as ISO/TS 16949 in the automotive sector, ISO 13485 for medical devices, GxP in life sciences, and AS9100 in the aerospace industry.

ISO 27001:
AWS has achieved ISO 27001 certification of their Information Security Management Systems (ISMS) covering AWS infrastructure, data centers, and multiple cloud services. 

ITAR:
AWS GovCloud (US) supports US International Traffic in Arms Regulations (ITAR) compliance. Companies subject to ITAR export regulations must control unintended exports by restricting access to protected data to US persons and restricting the physical location of that data to the US. AWS GovCloud (US) provides such an environment and complies with the applicable requirements.

PCI DSS Level 1:
AWS is Level 1 compliant under the PCI DSS (Payment Card Industry Data Security Standard). Based on the February 2013 guidelines from the PCI Security Standards Council, AWS incorporated those guidelines into the AWS PCI Compliance Package for customers. The AWS PCI Compliance Package includes the AWS PCI Attestation of Compliance (AoC), which shows that AWS has been successfully validated against the standard applicable to a Level 1 service provider under PCI DSS version 3.1.

SOC1/SOC2/SOC3:
AWS publishes a Service Organization Controls 1 (SOC 1), Type II report. The audit for this report is conducted in accordance with AICPA AT 801 (formerly SSAE 16) and the International Standards for Assurance Engagements No. 3402 (ISAE 3402).

This dual report is intended to meet a broad range of financial auditing requirements of US and international auditing bodies.

In addition to SOC 1, AWS also publishes a SOC 2, Type II report that expands the evaluation of controls to the criteria set forth by the AICPA Trust Services Principles. These principles define leading-practice controls relevant to security, availability, processing integrity, confidentiality, and privacy applicable to service organizations such as AWS.

The SOC 3 report is a publicly available summary of the AWS SOC 2 report. It includes the external auditor's opinion on the operation of controls (based on the AICPA's Security Trust Principles included in the SOC 2 report), the assertion from AWS management regarding the effectiveness of controls, and an overview of AWS infrastructure and services.