Google Zanzibar vs. Cedar

Cedar and Google Zanzibar are both systems for managing access control within software applications, but they differ in architecture, features, and capabilities.

  1. Architecture:
    1. Cedar Authorization: Cedar is an open-source authorization policy language and evaluation engine developed by AWS. Applications embed the engine and evaluate policies against their own entity data, so a deployment can be centralized or decentralized depending on the application’s design.
    2. Google Zanzibar: Zanzibar is Google’s centralized, globally distributed system for managing access control. It uses a global namespace of relationship tuples and relies on a central decision service for access control decisions.
  2. Scalability:
    1. Cedar Authorization: Cedar itself is an embeddable policy engine, so scalability depends on how the host application deploys and replicates it.
    2. Google Zanzibar: Zanzibar is designed for scalability and manages access control at Google’s scale, handling millions of authorization requests per second.
  3. Features and Capabilities:
    1. Cedar Authorization: Cedar supports role-based access control (RBAC), attribute-based access control (ABAC), and fine-grained, policy-based access rules.
    2. Google Zanzibar: Zanzibar provides authorization (not authentication) for large-scale distributed systems. It offers a global namespace for access control relationships, supports complex relationship-based rules, and provides consistent, low-latency decisions at scale.
  4. Usage and Adoption:
    1. Cedar Authorization: Cedar is open source and backs services such as Amazon Verified Permissions; adoption depends on ease of integration and suitability for a given application.
    2. Google Zanzibar: Zanzibar is used internally by Google to manage access control across its various services and applications.

In summary, both Cedar and Google Zanzibar manage access control, but they differ in architecture, scalability, features, and usage: Cedar is an embeddable policy language and engine, while Zanzibar is a centralized, relationship-based authorization service.

Okta Customer Identity

Focuses on Security and Privacy of Customers.

Challenges and Solutions


  1. Build and integrate Apps with Okta AuthN 
    • SSO
    • Social Logins
    • Custom user experience
      • Sign-in widget
    • Prebuilt & custom process
      • Email template
      • Event hooks
  2. Secure APIs with Okta Authorization (API Access Management) 
    • OAuth/ OIDC Protocols
    • Identity Proven Policy
    • Plug & Play SDK/ APIs
  3. Integrate Enterprise Identity
  4. Protect against Risk of Account Takeover


    • Risk-based Authentication
    • Passwordless Authentication
    • Pre-authentication sign-on policy evaluation

Common Use-cases


API Access Management

API Access Management allows you to build custom authorization servers in Okta which can be used to protect your own API endpoints.

JSON Web Key Set (JWKS) is a set of keys containing the public keys used to verify any JSON Web Token (JWT) issued by the authorization server and signed using the RS256 signing algorithm (see the verification sketch after the list below).

  • JSON Web Key (JWK) is a JSON representation of a cryptographic key.
  • Okta can use these keys to verify the signature of a JWT when provided for the private_key_jwt client authentication method or for a signed authorize request object.
  • Okta supports both RSA and Elliptic Curve (EC) keys.
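As an illustration, here is a minimal Python sketch (using the PyJWT library) of verifying an RS256-signed JWT against an authorization server’s JWKS endpoint. The org URL, authorization server path, and audience below are hypothetical placeholders.

import jwt  # PyJWT (install with the "crypto" extra for RS256 support)
from jwt import PyJWKClient

# Hypothetical Okta org and custom authorization server; replace with your own.
ISSUER = "https://example.okta.com/oauth2/default"
JWKS_URI = ISSUER + "/v1/keys"  # endpoint publishing the signing public keys

def verify_access_token(token: str, audience: str) -> dict:
    # Fetch the JWKS and pick the key whose "kid" matches the token header.
    signing_key = PyJWKClient(JWKS_URI).get_signing_key_from_jwt(token)
    # Verify the RS256 signature plus the issuer and audience claims.
    return jwt.decode(
        token,
        signing_key.key,
        algorithms=["RS256"],
        audience=audience,
        issuer=ISSUER,
    )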

Okta Workforce Identity


IAM Challenges

  1. User Password Fatigue
  2. Failure-Prone Manual Provisioning & De-Provisioning Process
  3. Compliance Visibility: Who Has Access to What?
  4. Siloed User Directories for Each Application
  5. Managing Access across an Explosion of Browsers and Devices
  6. Keeping Application Integrations Up to Date
  7. Different Administration Models for Different Applications
  8. Sub-Optimal Utilization and Lack of Insight into Best Practices

How do companies currently manage their user identities?

  1. IT-Managed Lifecycle Management
  2. HR-Managed Lifecycle Management


Universal Directory

UD has the following profiles:

  1. Okta User Profiles
    • Every kind of account, no matter how it is mastered, has an Okta user profile.
    • The base Okta user profile has about 31 default attributes which cannot be removed, but you can add as many custom attributes as needed.
  2. Directory Profiles
    • If your accounts are mastered by AD, we can map the AD profile attributes to the Okta user profile attributes as needed.
    • Schema Discovery brings AD attributes into Okta.
    • When you set up an integration with AD, you see that Okta has pre-selected several of the most common attributes in AD to bring into Okta. This is done when setting up the directory user profile during Step 3 of the Active Directory integration process.
    • Custom attributes are supported as well.
  3. Application Profiles
    • Box has 4 attributes in its user profile.
    • Google has over 200 attributes in its user profile.
    • Salesforce, which has a custom schema, can have all different kinds of attributes.
    • So these application user profiles can look drastically different from the Okta user profile, because the information they need varies by application.
    • Application User Profiles that are UD Enabled will have the ability to perform schema discovery, enabling Okta to read the attributes in the application profile and connect them to Okta.
  4. Identity Providers Profiles
    • There may be profiles you see for IdPs such as in a Business to Consumer (B2C) scenario where we want to let our users login with an application like Facebook.
    • Facebook can become the IdP through the notion of social authentication.
    • Okta becomes the Service Provider, offering an enterprise-grade solution around this scenario.

Profile Mappings, Profile Masters, Profile Editor

  • Profile Mappings -> Mapping between source and target and vice-versa.
    • Schema Discovery discovers attribute-metadata from source or target apps.
    • Expression language is useful for transforming attributes on the way in and out of Okta.
  • Profile Editor -> Helps to add/ edit/ delete custom attributes and edit regular attributes.
  • Profile Masters -> Identifies the source of truth for user profile attributes in Okta, e.g., Okta, a directory, or an app.
    • Currently, if more than one profile master exists on the Profile Masters page, they can be prioritized so that end users can be mastered by different systems, based on their assignments.
    • At any given time, there can only be one profile master that masters a user’s entire profile.
      • A user can only be mastered by a single application or directory at any one time.

 

Attribute-Level Mastering

  • Attribute Level Mastering (ALM) delivers finer-grain control over how profiles are mastered by letting you specify different profile masters for individual attributes.
  • Without it, all of a user’s attributes are mastered by a single profile master.
  • Setting up ALM:
    1. Enable Profile Mastering
    2. Prioritize profile masters on the Profile Masters page
      • From the Directory drop-down menu, select Profile Editor.
        • From the Master priority drop-down list, select one of the following:
          • Inherit from profile master: Picks up the default profile master for the entire profile, as shown in the Profile master priority field.
          • Inherit from Okta: Picks up this particular attribute value from Okta. This attribute value can be edited in three ways: via the user’s Profile tab, the Okta API or, if appropriate for end-user modification, by the end user.
          • Override profile master: Overrides the default profile master. Click the Add Master drop-down menu to choose another available profile master. Note that this option does not generally disable the app as a master.

    3. Desired mappings are specified through UD mapping
      • The optional third step of setting up ALM is to map the attribute through UD. If you changed an attribute’s Master priority to Override profile master, for example, the attribute now has a null value and must be mapped through UD mappings.
Example profile master set:

  • Profile master (default master for the entire profile): Workday
  • Attribute master (alternative master for a particular attribute): Mobile phone = Active Directory
  • All other attributes: Workday

Example attributes:

  • First name: Workday
  • Last name: Workday
  • Mobile phone: Active Directory
  • Work phone: Workday
  • App (master) <-> Okta <-> App or Okta (master) <-> App <-> Okta 
  • Enabling Profile Master and Update User Attributes for the same application allows Okta-to-App profile mappings to push attributes to your highest-priority profile master.
  • This is beneficial when you want to sync attributes from downstream applications back to the profile master, like an email address and phone number from an app back to the master Workday profile.
  • However, you may lose data if an app that you designate as profile master can also receive profile updates from Okta.
  • Caution: Enabling both Profile Master and Update User Attributes for the same app may result in:
    • Unwanted profile pushes – Okta updates can overwrite the values of unmapped attributes in an app, even if that app is the highest priority profile master. For example, if the cn attribute is not mapped from Active Directory to Okta, and you’ve configured Active Directory for Profile Master and Update User Attributes, Okta will apply a default mapping to cn.
    • Overwritten IdP-mastered attributes – Okta to App updates can overwrite attributes that are mastered by another identity source. There’s no partial push option.
    • Race conditions – Okta can overwrite an updated attribute in an identity source before other updates are pushed back to Okta.
      • For example, consider a scenario in which a user’s first name and last name are imported into Okta from a directory, but the user’s email address is imported into Okta from an app. If the user’s last name changes in the directory before the applicable email address update is made in the app, Okta could push the new name and the old email address to the app.

Username Overrides

  • Username override can also be used with Selective Attribute Push to continuously update app user names as user profile information changes.
  • For example, if a user gets assigned to an app with a username of email, and that email subsequently changes, Okta can automatically update the app username to the new email.
  • Prior to this enhancement, an Okta admin had to manually change a user’s app username by unassigning the user and reassigning them to the app.
  • This enhancement applies to all apps and is not limited to only apps with provisioning capabilities.

Overriding the App username

  • For the Okta to App flow, you can no longer override username mappings in Profile Editor.


  • The username mapping displayed in the app’s Sign On tab will be the source of truth for the Okta To App flow.
  • Updating the username mapping on Create only or Create and Update will also be managed from the app’s Sign On tab.


Overriding an Okta username


Exclude username updates during provisioning


People

There are 3 types of people or user accounts that can exist within Okta:

  1. Okta-Mastered people
  2. Directory-Mastered people
  3. Application-Mastered people
  1. Okta-Mastered People can only be added to Okta groups (groups created in Okta), as opposed to groups that have been imported from an external directory service such as Active Directory Groups or LDAP groups.
    • They log in under the Okta password policy and are governed by the Okta user profile; that is, the primary source of truth for the user’s attributes is the Okta user profile.
    • Any active user can be assigned an Okta Administrator role (both Okta-Mastered and Directory-Mastered people are eligible for admin role assignment).
    • Okta Super Admins and Org Admins are in charge of managing the password policy for Okta-Mastered accounts.
    • When importing people records via the CSV template, you cannot import more than 10k records per file, and the file size cannot exceed 10 MB.
    • As a best practice, create at least two Okta admin service accounts and assign Super Admin rights to them.
  2. Directory-Mastered people log into Okta using the same credentials they would use to log into their on-premise workstation.
    • Directory-Mastered people are governed by the Directory user profile—management of user attributes must be done in the directory service.
    • By default, Directory-Mastered people cannot change their directory-linked password within Okta.
      • However, an Okta Super Administrator can configure the Directory Password policy in Okta to enable this ability.
    • Directory-Mastered people can be associated with both Okta groups and Directory groups.
    • AD app user profile schema requires both the first and last name.
    • You can create an Okta mastered user without a first or last name, but you cannot import an AD user into Okta without a first and last name.
    • Delegated authentication and Just-in-Time provisioning will automatically be enabled as part of your AD/ LDAP installations.
    • System Requirements 
      • Windows server must be running Windows Server 2008 R2 or newer and have at least 256 MB of RAM available.
      • The host server of the agent must also be a member of the same domain as the Active Directory users.
      • The Windows server where the agent resides must be on at all times. In other words, don’t install it on your own desktop or laptop. The agent host server must have a continuous connection to the internet so it can communicate with Okta.
      • Best practice: to ensure high availability and provide failover, install at least 2 Okta Active Directory agents on two different servers connected to the same domain.
        • That is, 2 agents for every domain. You can install the Okta Active Directory agent on a domain controller and/or on any two active servers connected to the domain.
    • OUs
      • Organize people and groups into AD OUs.
      • Okta will only import user accounts and groups within the selected OUs.
    • User Accounts for AD agent
      • A local AD domain administrator account to run the Active Directory agent installer; this is only required for the installation.
      • An Okta Super Admin account.
      • An Okta-AD service account which is created via the integration wizard and becomes the service account that the agent service runs as after installation.
        • This account serves as the liaison between AD and Okta, and by default has read-only access to AD.
        • It is the AD Agent that will pull information from AD into Okta and keep that information synchronized.
    • User Accounts for LDAP agent
      • A local LDAP service account which will be created during the installation process of the LDAP agent.
      • An Okta administrator account which will be used to connect Okta to the LDAP agent.
      • An LDAP user which must have the ability to look up users and groups or roles within the Directory Information Tree.

 

JIT

  • JIT account creation and activation only works for end users who are not already in Okta.
  • JIT updates the accounts of existing end users during full imports.
    • This means that end users who are confirmed on the import results page, regardless of whether they are activated or not, are not eligible for JIT activation.
    • When JIT is enabled, users do not receive activation emails.
  • If delegated authentication is enabled, there is NO need to import users from AD first for JIT provisioning to create Okta accounts.
  • If delegated authentication is NOT enabled, you MUST import the AD accounts first, and they must appear on the imported users list for JIT provisioning to create Okta accounts.
  • JIT automatically:
    • Confirms and activates the account.
    • Updates any attributes for accounts already activated, when the person next logs into Okta.

Scheduled imports let you define how often to run imports of people records from Active Directory into Okta.

  • With each import:
    • Any user attributes that may have changed in AD will be updated.
    • Any new user accounts created in AD since last import will be imported.
  • You can always manually initiate the import process from the Import tab.
Import Matching Rules have two steps (see the sketch after this list):
  1. Matching the imported users:
    • Exact match:
      • username
      • email
      • attribute (base or custom)
      • attribute combination
    • Partial match:
      • Partial matching of:
        • firstName
        • lastName
      • No match of:
        • username and/or
        • email
  2. Confirm the imported users:
    • Confirm matched users: Select to automate the confirmation or activation of existing users.
      • Unchecked, matches are confirmed manually.
    • Confirm new users: Select to automate the confirmation or activation of a newly imported user.
      • If this option is selected, you can uncheck it during import confirmation.
      • Note that this feature does not apply for users who already exist in Okta.
    • If JIT is activated, there is no need to confirm or activate users; they are activated during sign-in.
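A minimal Python sketch of the two-step matching logic above; the field names and data shapes are hypothetical simplifications.

from typing import Optional

def match_imported_user(imported: dict, okta_users: list) -> Optional[dict]:
    """Return the existing Okta user that an imported record matches, else None."""
    for user in okta_users:
        # Exact match: username or email (base or custom attributes work the same way).
        if imported["username"] == user["username"] or imported["email"] == user["email"]:
            return user
    for user in okta_users:
        # Partial match: first and last name agree, while username/email did not match.
        if (imported["firstName"] == user["firstName"]
                and imported["lastName"] == user["lastName"]):
            return user
    return None

A matched user still has to be confirmed (manually, or automatically via the Confirm matched users option) before the accounts are linked.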

Profile & Lifecycle Mastering

Allow an app to profile-master Okta users.

Once enabled, the app appears in the list of profile masters on the Profile Masters page.

  • Allow <app> to master Okta users: Determine what happens when a user is deactivated or reactivated in an app.
    • Only the highest priority profile master can deactivate or suspend an Okta user.
  • When a user is deactivated in the app: Choose to: 
    • Deactivate
    • Suspend
    • Do nothing:
      • The app controls its own user lifecycle, while Okta retains profile mastering of attributes and mappings.
  • When a user is reactivated in the app: Choose whether reactivation in the app applies to suspended or deactivated Okta users.
    • For reactivation in the app to apply, the app user profile must be an exact match to the Okta profile.
    • Otherwise, after importing the reactivated users, they appear in Pending Activation state.

Clearing unconfirmed users

Allows admins to clear all unconfirmed users within an import queue.

  • Impossible to select and remove specific users at this time.
    • The only option is to clear all users.
  • If an admin mistakenly clears all users from the queue, they can rerun a full import to restore the queue back to its prior state.
    • To restore the import queue, an incremental import will not suffice—a full import is required.
  • Also note that, if an existing (scheduled or manual) import is actively running, admins cannot clear users. The Clear Unconfirmed Users button is unavailable until that previous import is complete.
  • If a scheduled or manual import is started during a clearing process, it is queued up to begin as soon as the previous operation completes.

 

Import Users from App

Directory


  • Directory Integration for All Cloud Apps
    • UD is made available to all Cloud Apps
    • For AD integration, Okta provides three lightweight and secure on-premises components:
      • Okta Active Directory Agent: A lightweight agent that can be installed on any Windows Server and is used to connect to on-premises Active Directory for user provisioning, deprovisioning, and authentication requests.
      • Okta Integrated Windows Authentication (IWA) Web Application: A lightweight web application that is installed on IIS and is used to authenticate domain users via Integrated Windows Authentication.
      • Okta Active Directory Password Sync Agent: A lightweight agent installed on your domain controllers that will automatically synchronize AD password changes, send them to Okta, and keep your users’ AD passwords in sync with the apps they use.
    • For LDAP integration, Okta provides a single lightweight and secure on-premises component:
      • Okta LDAP Agent: A lightweight agent that can be installed on any Windows Server and is used to connect to on-premises LDAP user stores for provisioning, de-provisioning, and authentication requests.
  • Simple and Secure Setup and Configuration
    • DelAuth and JIT are turned on by default.
      • Users can immediately JIT in without any previous import (because of DelAuth)
    • On every DelAuth or JIT request, Group memberships are also imported.
    • Users are fully updated on every login and asynchronously.
    • Admins can change OUs, user profile and group information in AD and users will be fully updated in Okta.
  • Real Time Synchronization
    • Sync when authN.
    • Group memberships are imported in addition to the full User profile.
  • JIT User Provisioning
    • Automatic user account creation in Okta is possible the first time a user authenticates.
    • JIT updates the accounts of existing end users during full imports.
      • This means that end users who are confirmed on the import results page, regardless of whether or not they were subsequently activated, are not eligible for JIT activation.
    • When JIT is enabled, users do not receive activation emails.
    • If delegated authentication is enabled, there is no need to import users from AD first for JIT provisioning to create Okta accounts.
    • If not, you must import the AD accounts first, and they must appear on the imported users list for JIT provisioning to create Okta accounts.
  • Delegated Authentication
    • AuthN happens at AD, which returns a YES or NO response for the authN request.
  • Desktop Single Sign-On
  • Self Service Password Reset Support
    • Updates both AD and Okta password for a user.
  • Security Group (SG)-Driven Provisioning
    • Bulk provisioning/ deprovisioning possible.
  • Deprovisioning
  • SSO for authenticated apps residing on-prem
    • SSO to local AD apps using SWA after authN with Okta (IdP-initiated)

AD config in Okta

By default Okta uses the Okta user profile user name during delegated authentication. For example, if the AD app-user user name is samAccountName (domain\jdoe) and the Okta user profile user name (login field) is the UPN, then Okta uses the UPN (jdoe@domain.com) to log the user in.

Okta username format  — The username format you select must match the format you used when you first imported users. Changing the value can cause errors for existing users. Choose one of the following options (a derivation sketch follows the note below):

  • Custom  — Select this option to use a custom user name to sign in to Okta. Enter the Okta expression language to define the Okta user name format. To validate your mapping expression, enter a user name and click the view icon.
  • Email address  — Select this option to use an email address for the Okta user name.
  • SAM Account name  — Select this option to combine the SAM Account Name and the AD domain to generate the Okta username. For example, if the SAM Account Name is jdoe and the AD domain is mycompany.okta.com, then the Okta username is jdoe@mycompany.okta.com.
  • SAM Account name + Configurable Suffix  — Select this option to combine the SAM Account Name and a configurable suffix to create the Okta user name. When using this option, do not include the @ character before the Configurable Domain.
  • User Principal Name (UPN)  — Select this option to use the UPN from AD to create the Okta user name.

Note: All Okta users can sign in by entering the alias part of their user names as long as it maps to a single user in your organization. For example, jdoe@mycompany.okta.com could sign in using jdoe.
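A Python sketch of how these format options derive the Okta username from AD attributes; the helper and its format labels are hypothetical, while the attribute names follow AD conventions.

def okta_username(ad_user: dict, fmt: str, suffix: str = "") -> str:
    """Derive the Okta username from an AD user for a given format option."""
    if fmt == "email":
        return ad_user["mail"]
    if fmt == "sam_account_name":
        # SAM Account Name + AD domain, e.g. jdoe + mycompany.okta.com
        return f'{ad_user["sAMAccountName"]}@{ad_user["domain"]}'
    if fmt == "sam_account_name_suffix":
        # The configurable suffix must not include a leading "@"; it is added here.
        return f'{ad_user["sAMAccountName"]}@{suffix}'
    if fmt == "upn":
        return ad_user["userPrincipalName"]
    raise ValueError(f"unknown format: {fmt}")

# okta_username({"sAMAccountName": "jdoe", "domain": "mycompany.okta.com"},
#               "sam_account_name")  ->  "jdoe@mycompany.okta.com"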

USG support  — Select Universal Security Group Support to ignore domain boundaries when importing group memberships for your end users. This assumes that the relevant domains are connected in Okta. You must also deploy an AD agent for every domain in your forest that contains the USG object that you want to sync with Okta. Each connected domain then imports its groups. When a user’s group memberships match any groups that were imported (from any connected domain in the forest), Okta syncs the memberships for the user to each group.  Only groups from connected domains are imported. This setting requires JIT provisioning.

Max Import Unassignment  — Click Edit to modify the percentage of app unassignments at which an import stops. The default is 20%. This prevents accidental loss of user accounts. This setting affects all apps with imports enabled. See Import safeguards.

Desktop Single Sign-On (DSSO)

Allow users to log in to their AD-connected computer and extend that Single Sign-On (SSO) experience through to Okta.

  • This means that logging into a user’s Windows machine gives them direct access to their Okta-configured applications, and reduces the number of logins needed to access both company and cloud-based applications to one.
  • The AD-version of this is also known as Integrated Windows Authentication (IWA).
  • Download and install Okta’s IWA web application, configure the relevant IP ranges, and the setup is complete.
  • DSSO is not supported for LDAP.

Two methodologies are available for DSSO implementation:

  • Agentless (recommended)
  • IWA web agent running on premises

Agentless Desktop Single Sign-on – uses Kerberos

  • You have an Active Directory domain (or multiple domains) integrated with your Okta org.
  • In order for Okta to negotiate Kerberos authentication for agentless DSSO, you need to create a new service account and set a Service Principal Name (SPN) for it. This delegates Kerberos authentication to an external service rather than to the internal Windows Kerberos authentication.

setspn -S HTTP/<myorg>.kerberos.<okta|oktapreview|okta-emea>.com <ServiceAccountName>

  • Where HTTP/<myorg>.kerberos.okta.com is the SPN, <ServiceAccountName> is the value you used when configuring the Early Access version of Agentless DSSO, <myorg> is your Okta subdomain, and the middle element is your Okta environment (okta, oktapreview, or okta-emea). For example: setspn -S HTTP/atko.kerberos.oktapreview.com atkospnadmin.


Okta IWA Web agent for Desktop Single Sign-on


In simple steps:

  1. User navigates to https://mycompany.okta.com.
  2. The user is redirected to the locally installed IWA web application.
  3. The IWA web application transparently authenticates the user via Integrated Windows Authentication (Kerberos).
  4. The user is redirected back to the Okta login page with cryptographically signed assertions containing their AD user identity.
  5. The Okta service validates the signed assertions and sends the user directly to their Okta home page.


DSSO-JIT

  • For agentless DSSO, the web browser sends the Kerberos ticket to Okta, which relies on the AD agent to look up the UPN. Kerberos validation is done in the Okta cloud.
  • For on-premises DSSO, IWA sends Okta a UPN, and Okta uses the UPN to locate the user. Okta does not perform user authentication because Kerberos validation is done on the IIS side.

When a user signs in using DSSO (see the sketch after this list):

  • If their information is already imported and active in Okta: Okta uses the UPN to look up the user. If Okta finds the user, the profile reloads from Active Directory (AD) and the user signs in successfully.
  • If their information has been imported but is not yet active in Okta: Okta uses the UPN to look up the user. If Okta finds the user, the user profile loads from AD, the user is activated, and the user signs in successfully.
  • If their information has not been imported into Okta and JIT is enabled: Okta can only validate the user by the UPN, so the Okta username format must be the UPN for the user to be successfully imported with JIT enabled. If the username format is something other than the UPN, for example the email address or SamAccountName, the search cannot locate the user.
  • If their information has not been imported into Okta and JIT is off: sign-in fails.
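The four cases can be summarized in a small Python sketch; the user store and return strings are hypothetical simplifications.

def dsso_sign_in(upn: str, users: dict, jit_enabled: bool) -> str:
    """Resolve a DSSO sign-in by UPN, mirroring the four cases above."""
    user = users.get(upn)
    if user and user["status"] == "ACTIVE":
        return "reload profile from AD; sign-in succeeds"
    if user:  # imported but not yet active
        return "load profile from AD; activate user; sign-in succeeds"
    if jit_enabled:
        # Works only if the Okta username format is the UPN; otherwise the
        # lookup cannot locate the user and sign-in fails.
        return "JIT-import user by UPN; sign-in succeeds"
    return "sign-in fails"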

MFA

  1. Knowledge factors based on something the user knows: Passwords
  2. Possession factors based on something the user has: Yubikey
  3. Biometric factors based on something the user is: Fingerprint

Factor Types

Okta’s Multi-Factor Authentication can be divided into 4 categories of 2FA types:

  1. Security Question
  2. Soft-Based Token
    1. Okta Verify
    2. Google Authenticator (Time-based One-Time Password algorithm, TOTP; see the sketch after this list)
  3. Phone
    1. SMS Authentication
    2. Voice Call Authentication
  4. Third Party
    1. Symantec VIP
    2. RSA SecurID
    3. Duo Security
    4. Yubikey
    5. U2F (2FA-open standard by FIDO Alliance)
    6. WebAuthn (2FA-web standard by FIDO Alliance)
    7. Windows Hello (by MSFT)
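For reference, a self-contained Python sketch of the TOTP algorithm (RFC 6238: HMAC-SHA1, 30-second time steps, 6 digits) that Google Authenticator and similar soft tokens implement.

import base64, hashlib, hmac, struct, time

def totp(secret_b32: str, period: int = 30, digits: int = 6) -> str:
    """Compute the current TOTP code for a base32-encoded shared secret."""
    key = base64.b32decode(secret_b32, casefold=True)
    counter = int(time.time() // period)           # moving factor: time-step count
    msg = struct.pack(">Q", counter)               # 8-byte big-endian counter
    digest = hmac.new(key, msg, hashlib.sha1).digest()
    offset = digest[-1] & 0x0F                     # dynamic truncation (RFC 4226)
    code = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)

# print(totp("JBSWY3DPEHPK3PXP"))  # same secret that the authenticator app holds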


  • Push verification such as Okta Verify Push is more effective than OTP against traditional phishing.
  • However, for stronger resistance, use FIDO-based factors such as U2F, Windows Hello, or WebAuthn.


Factor Enrollment

  • MFA Policy defines which 2FA factors (optional or required) can be used by which groups.
    • It is always best practice to never change a Default Policy in Okta.
    • This is strongly encouraged mainly because the Default Policy will always be the least restrictive, ensuring that you, as an admin, will never lock yourself out of your Okta org.
  • An MFA Rule lets you exclude users, apply particular conditions such as IP address (network zones), and then require the factor at first-time enrollment or every time.
  • Okta Sign-On Policy controls the manner in which a user is allowed to sign on to Okta, including whether they are challenged for MFA and how long they are allowed to remain signed in before re-authenticating.
  • Application Sign-On Policy can configure MFA at the application level. By adding MFA to an app, you provide an additional layer of security for specific apps.

SSO Categories

  • SAML/ WS-Fed
  • OIDC
  • SWA
    • Uses Okta browser plugins to securely pass credentials into web forms on behalf of the authenticated Okta User.
    • SWA was created by Okta to provide single sign-on for applications that don’t support proprietary federated sign-on methods or SAML.
    • SWA applications are typically used to connect consumer type applications such as LinkedIn or Facebook.


Provisioning

Possible via:

  1. Agent-based Provisioning 
    • Active Directory
    • LDAP
    • On Premise Provisioning
      • This agent is generic and intended to connect to any on-prem resource that can use SCIM.
    • This is a good, robust strategy. However, it is limited to on-premise applications.
  2. API-based Provisioning
    • The Okta app connector has been integrated with the application’s provisioning API.
    • Typically these provisioning APIs are proprietary and follow the REST model.
    • Available functionality from the CRUD feature set is limited to the functions published in the app’s API. For example, some APIs may allow Create and Update, but not Deprovision.
    • This is the best, most extensible strategy.
  3. SAML JIT
    • This permits Creation and Updates of downstream users.
    • This does NOT permit Imports or Deprovisioning of downstream users
    • The triggering action is a user’s login attempt, rather than the Administrator’s assignment action.
      • This limits the flexibility of the timing of account creation, as accounts in the application cannot be created proactively using this method.
      • This strategy can be leveraged to feed the Okta Profile itself via UD mappings.
      • Common use case is for customers who have an existing, on-prem IDP (ADFS, for example) but still want to use cloud apps through Okta.
        • Configured using SAML JIT with an Inbound SAML connection
    • This is the most restricted, inflexible strategy. However, it can be a lifesaver for the initial deployment of legacy applications.

On-Premises Provisioning

  • Okta provisioning agent polls Okta for any provisioning event.



Office 365

With regards to Office 365, Okta can do 2 very important things for you:

  1. Federate (or SSO) using WS-Federation.
  2. Provision users, a key part of the User Lifecycle Management flow (uses SCIM).

Provisioning Types

  1. Profile Sync, also referred to as Version 0, synchronizes only the basic profile in Office 365, consisting of 5 attributes.
    • In order to create an Okta/ Cloud-mastered user, you are required to provide 4 attributes: last name, first name, username, and primary email address.
    • Profile Sync synchronizes these 4 required attributes, plus the Display Name attribute. When using Profile Sync, Okta is leveraging the SOAP APIs.
  2. User Sync, also referred to as Version 1, will synchronize a user’s full profile in Office 365, consisting of over 20 attributes.
    • This is the option which is used for provisioning any Okta/ Cloud-mastered accounts you have.
    • When deploying User Sync, Okta leverages the Graph APIs and is also able to assign licenses and roles to users.
  3. Universal Sync, also referred to as Version 2, should be selected when you have Active Directory-mastered users.
    • This provisioning option will synchronize 100+ user attributes to Office 365, plus directory objects such as contacts and distribution lists.
  4. License or Role Management only: this option is used by organizations that provision with Microsoft’s Azure Active Directory Connect (Azure AD Connect).
    • This will be the case for organizations which use an on-premise Exchange Server creating a “hybrid” or “write-back” scenario.
    • Note that Microsoft provisioning cannot assign licenses to users so, in such a scenario, you can use Microsoft provisioning to provision your Directory-mastered users and then leverage Okta’s license/role management provisioning to assign Office 365 licenses to users.

Notes

  • User Sync and Universal Sync cannot be used with DirSync, Azure Active Directory Sync, or Azure Active Directory Connect.
  • Once you select User Sync or Universal Sync, you cannot modify your selection back to Profile Sync.

Requirements

  1. Register your company’s public domain with your Office 365 tenant. True for all implementations.
  2. Set default domain correctly. True for all implementations.
  3. Prepare your directory for either Microsoft provisioning or Okta provisioning, and if you’re using Okta provisioning, decide which option is best for you.
    1. Microsoft Provisioning: the cloud Exchange server writes back to the on-premises server.
    2. Okta Provisioning
      1. User Sync : Okta-mastered users (cloud-based users) -> O365
        • Configure the Office 365 application in Okta and select the Okta provisioning option of User Sync.
        • This option will sync the Okta user’s full profile, consisting of 20+ attributes depending on whether you have added any custom attributes, into Office 365.
      2. Universal Sync: Active Directory-mastered users -> O365.
        • Configure the Office 365 app in your Okta org and select Universal Sync as the Okta Provisioning option.
        • Transform data within Okta to meet Office 365 requirements—such as username and email address—and verify that your company domain name is correct.
        • Execute a PowerShell script to turn on federation (SSO), at which point all users will be required to authenticate through Okta to access Office 365.
NOTE: First use the SWA sign-in option (just a placeholder) to make sure user provisioning works, and then enable the WS-Fed sign-in option (if provisioning works). Provisioning of the user must happen first in O365 and federation second, because federation requires the user to be present in O365.
Data Transformation
  1. Email Address: Change in Profile Mappings of Universal Directory using Expression language.
    • Eg: alice@oktaice.com (source.mail) -> alice@oktaedu007.com (Mail)
    • String.substringBefore(source.email, "@") + "@oktaedu007.com"
    • Preview function is available.
  2. Application Username
    • Change in O365 App -> Sign-on tab -> CREDENTIAL DETAILS -> Application Username Format -> Custom using Expression language.
    • Same example as above.

Test Provisioning

  • Test users are synced from Okta -> O365.

Setting Up WS-Federation

  • Configure WS-Federation yourself using PowerShell, or let Okta configure WS-Federation automatically (provide O365 admin credentials).

Best Practices

  1. A default domain cannot be federated.
  2. Create an In-Cloud Global Administrator account in Office 365 so that you can always have a side door into your tenant.
  3. Create an Okta-Mastered Super Administrator in Okta to set up the Microsoft Office 365 integration.
  4. Verify that your Okta provisioning is working before you federate to ensure that you have your accounts set up properly inside Office 365.
  5. Never delete the Office 365 app in Okta once federated.
  6. For O365, use either the SWA or WS-Fed authentication type.
  7. Okta removes the domain federation in the following cases:
    • If you switch from WS-Fed to SWA
    • If you delete the app instance


Okta Technical Concepts

Mastery

  • Mastery defines flow and maintenance of user object attributes.
  • When a profile is mastered from a given resource (application or directory), the profile attributes of an Okta user are derived exclusively from that resource and therefore cannot be edited in Okta.
  • For example, an AD mastered Okta user has an Okta profile. However, that profile is not editable in Okta by the user or administrator and derives its information exclusively from Active Directory.

Profile Mastering

  • Mastering defines the flow and maintenance of user-object attributes and their lifecycle state.
  • Mastering is a more sophisticated version of read (import) users.
  • If the lifecycle state of the user in AD moves to Disabled, the linked Okta user also switches to the corresponding lifecycle state of Deactivated on the next read (import).

Profile Master

  • A profile master is an application (usually a directory service such as AD, or HRMS such as Workday) that acts as a source of truth for user profile attributes.
  • A user can only be mastered by a single application or directory at any one time.

ALM

  • Users can be mastered by attribute -> attribute-level mastery (ALM).
  • ALM delivers finer grain control over how profiles are mastered by allowing admins to specify different profile masters for individual attributes.
  • Profile mastering only applies to Okta user profiles, not app user profiles.

Okta Lifecycle Management

LCM -> Provisioning/ deprovisioning of user accounts across systems/ resources.

User Management

  1. Manually create user profiles
  2. Import user profiles from a directory or app
    1. AD integration
    2. LDAP integration
    3. Application integration
    4. JIT provisioning
      1. AD Delegated Authentication
      2. Desktop SSO
      3. Inbound SAML.
  3. Import users from a CSV file

Provisioning and Deprovisioning

Provisioning Methods:

  1. Agent-based Provisioning (good, robust strategy)
    • Active Directory
    • LDAP
    • On Premise Provisioning (SCIM)

Security at Okta

Tenant data encrypted using Symmetric-Key -> stored in Key-Store -> accessed by Master-Key.

  • Okta encrypts the tenant confidential data in the database.
  • The encryption is performed using 256-bit AES symmetric encryption with tenant-exclusive symmetric keys.
    • Tenant symmetric keys ensure data segregation.
  • Tenant symmetric keys are stored in a tenant-exclusive keystore.
    • The keystore can be accessed only with a tenant-exclusive master key.
  • Tenant-exclusive master keys are encrypted in three different ways to achieve the highest level of availability and business continuity.
    • The use of application level encryption protects sensitive data, even in the event of partial compromise.
    • As a result, attackers cannot decrypt the data even when armed with 2 out of 3 of the following: the master key, the key store, and the user’s app context.
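A minimal Python sketch of this envelope-encryption pattern (tenant data encrypted with a tenant key, the tenant key wrapped by a master key), using 256-bit AES-GCM from the cryptography package. This only illustrates the pattern; it is not Okta’s implementation.

import os
from cryptography.hazmat.primitives.ciphers.aead import AESGCM

master_key = AESGCM.generate_key(bit_length=256)   # tenant-exclusive master key
tenant_key = AESGCM.generate_key(bit_length=256)   # tenant symmetric key

# Wrap (encrypt) the tenant key with the master key before storing it in the keystore.
wrap_nonce = os.urandom(12)
wrapped_tenant_key = AESGCM(master_key).encrypt(wrap_nonce, tenant_key, b"keystore")

# Encrypt tenant data with the tenant key (application-level encryption).
data_nonce = os.urandom(12)
ciphertext = AESGCM(tenant_key).encrypt(data_nonce, b"tenant confidential data", None)

# Decryption requires unwrapping the tenant key with the master key first.
unwrapped = AESGCM(master_key).decrypt(wrap_nonce, wrapped_tenant_key, b"keystore")
assert AESGCM(unwrapped).decrypt(data_nonce, ciphertext, None) == b"tenant confidential data"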

Identity Providers

Types:

  • Social Identity Provider
  • Inbound SAML
  • Smart Card/ PIV Card
    • X.509-compliant digital certificate-based authentication with certificate chains.


SAML issue debugging

  • Login URL or Single Sign On URL -> ACS URL
  • Audience Restriction/ Entity ID -> the SP entity identifier (not always the same as the ACS URL)
  • The username format expected by the application (i.e. email address, first initial last name, etc)
  • User experiences an endless loop of being redirected to Okta, and then back to the application’s normal login page (which can’t handle SAML assertions), then back to Okta, etc.
    • This can occur when the application utilizes a custom or “vanity” domain (i.e., https://MyCompany.my.salesforce.com) that is configured to redirect authentication to the IdP.

Provide these to SP:

  1. Identity Provider Single Sign-On URL. The SP may refer to this as the “SSO URL” or “SAML Endpoint.” It’s the only actual URL Okta provides when configuring a SAML application, so it’s safe to say that any field on the Service Provider side that is expecting a URL will need this entered into it.
  2. Identity Provider Issuer. This is often referred to as the Entity ID or simply “Issuer.” The assertion will contain this information, and the SP will use it as verification.
  3. X.509 Certificate. Some service providers allow you to upload this as a file, whereas others require you to paste it as text into a field.

Account Linking

  • Users can use multiple IdPs to sign in, and Okta can link all of those IdP profiles to a single Okta user. This is called Account Linking.
  • If, for example, a user signs in to your app using a different Identity Provider than they used for registration, Account Linking can establish that the user owns both identities, allowing the user to sign in from either account.

Apps

  • Apps can be:
    • Okta Verified
    • Community Verified
    • Community Created
  • App Integration Wizard (AIW): If you want to add an integration that doesn’t already exist in the Okta Integration Network (OIN), use the App Integration Wizard (AIW) to create a new integration and connect Okta with a SAML, OIDC, SWA, or SCIM application.
    • After you create your integration, assign it to users in your org.
    • The integration is private and visible only within your own Okta org.
    • Submit an app integration to the OIN to make it public.
    • The AIW has the following tabs:
      • General
      • Sign-on Policy
      • Provisioning (optional)
      • Import
      • Assignment

Authentication Types:

  • Web App: SWA, SAML 2.0, OIDC
  • Native App: None, SAML 2.0, OIDC (use PKCE; the Client Credentials flow places a secret in the app, which is insecure)
  • Single Page App: OIDC (PKCE)
  • OAuth Service: OAuth 2.0

OIDC: If SPA app, decide what kind of app visibility and login flow you want.

  • Configure the app to be initiated only in the background without an Okta app button, (or)
  • Configure the sign-in request to be initiated either by the app or by Okta.

SCIM provisioning: SWA and SAML 2.0 are supported as authentication types; OIDC is not supported. (A request sketch follows the list below.)

  • Basic Auth  — To authenticate using Basic Auth mode, provide the username/ password for the account that handles the create, update, and deprovisioning actions on the SCIM server.
  • HTTP Header  — To authenticate using HTTP Header, provide a bearer token that will provide authorization against the SCIM app.
  • OAuth2  — To authenticate using OAuth2, provide the access token and authorization endpoints for the SCIM server, along with a client ID and a client secret.
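A sketch of the HTTP Header (bearer token) mode against a SCIM 2.0 server, using Python’s requests; the base URL and token are placeholders.

import requests

SCIM_BASE = "https://scim.example.com/scim/v2"   # hypothetical SCIM server
TOKEN = "replace-with-bearer-token"

# Look up a user by userName with a standard SCIM 2.0 filter expression.
resp = requests.get(
    f"{SCIM_BASE}/Users",
    headers={"Authorization": f"Bearer {TOKEN}"},
    params={"filter": 'userName eq "jdoe@example.com"'},
)
resp.raise_for_status()
print(resp.json()["Resources"])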

Application User Profile object

  • Application User profiles are app-specific, but may be customized by the Profile Editor in the administrator UI.
  • SSO apps don’t support a user profile; because not much required in user-profile.
  • Provisioning apps supports a user profile.
  • Any profile properties visible in the administrator UI for an application assignment can also be assigned via the API.
  • Some properties are reference properties and imported from the target application and only allow specific values to be configured.

External ID

  • Users in Okta are linked to a user in a target application via an externalId.
  • Okta anchors a user with their externalId during an import or provisioning synchronization event.
  • Okta uses the native app-specific identifier or primary key for the user as the externalId.
  • The externalId is selected during import when the user is confirmed (reconciled) or during provisioning when the user has been successfully created in the target application.

SCIM

  • The System for Cross-domain Identity Management (SCIM) specification is an open standard designed to manage user identity information.
  • SCIM provides a defined schema for representing users and groups, and a RESTful API to run CRUD operations on those user and group resources.
  • The provisioning actions performed by an integration are described using the database operation acronym “CRUD”: Create, Read, Update, and Delete.
    • Along with these, Okta supports syncing passwords and profile attribute mappings via SCIM.
    • For audit purposes, Okta users are never deleted. Instead, they are deprovisioned by setting an active attribute to “false”. Then, if an existing user needs to be re-provisioned at a later date, the attribute can simply be set back to “true”.

The SCIM connection supports these provisioning actions: import new users and profile updates, push new users, push profile updates, and push groups.
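Because users are deprovisioned rather than deleted, a soft delete is simply a SCIM PATCH that flips active to false. A sketch, with the same hypothetical server and token as the earlier request example:

import requests

SCIM_BASE = "https://scim.example.com/scim/v2"
TOKEN = "replace-with-bearer-token"

def deactivate(user_id: str) -> None:
    # SCIM 2.0 PatchOp: set active=false instead of deleting the resource.
    patch = {
        "schemas": ["urn:ietf:params:scim:api:messages:2.0:PatchOp"],
        "Operations": [{"op": "replace", "value": {"active": False}}],
    }
    requests.patch(
        f"{SCIM_BASE}/Users/{user_id}",
        headers={"Authorization": f"Bearer {TOKEN}"},
        json=patch,
    ).raise_for_status()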

App Sign-On Modes


SWA Apps

Template App Types:

  1. Template App: does a POST of username/ password to a sign-in page.
  2. Template Plugin App: uses a plugin to find the username/ password fields in the page and POST the username/ password.
  3. Template App 3 Fields – Similar to above; use if an app page also has an additional field such as Company ID
  4. Template 2 Page Plugin App – Also similar to the Template Plugin App; use if sign in flow is spread over two separate pages
  5. Template Basic Auth App – Use if the app supports HTTP basic authentication.

Plugin SWA application: A SWA application that requires a browser plugin.

SWA application (no plugin): A SWA application that uses HTTP POST and doesn’t require a browser plugin.

When you configure your sign-in options, you can set up SWA so that:

  • User sets username and password
  • Administrator sets username and password
  • Administrator sets username, user sets password
  • Administrator sets username, password is the same as user’s Okta password
  • Users share a single username and password set by administrator


Browser Plugins

When end users start an app from their Okta End User Dashboard,

  • A new browser tab opens to the app’s URL.
  • The plugin uses an encrypted SSL connection to obtain authentication information and other required information from Okta.
  • Then applies that information to the page.
  • The plugin does not store their credentials after authentication is complete.

To enhance security, the plugin only works with trusted and verified sites. If end users have not installed the Okta Browser Plugin but have one or more applications on their end-user dashboard that require it, they see a notification to install the plugin on the dashboard.

Okta Browser Plugin provides the following functionality:

  1. Automatically sign in to apps
  2. Automatically initiate an Okta sign-in
  3. Automatically fill in credentials on sign-in pages (if the user has not enabled automatic app sign-in)
  4. Automatically insert passwords on password-update pages
  5. Update passwords
  6. Quickly jump to Admin Console
  7. Switch between multiple Okta accounts

Simulating an IdP-initiated Flow with the Bookmark App

  • When an application only supports an SP-initiated flow, you can simulate an IdP-initiated flow with the Bookmark app.
  • With the Bookmark application, the end user clicks a chiclet in Okta and is signed into the application.
  • Internally, the chiclet calls the Bookmark app with the application URL, which goes to the app’s domain and then calls back to Okta.
  • You can customize the chiclet for Bookmark to display the logo for the application with the SP-initiated flow, so the end user experience is not different from logging on to any other application.
  • Note: Provisioning features are not supported by Bookmark apps.

Pass Device Context to SAML apps using Limited Access

  • Limited Access allows you to configure Okta to pass device context to certain SAML apps through the SAML assertion during app authentication.
  • The app can then use that information to limit access to certain app-specific behaviors, such as user permissions to edit the app or download files from the app.
  • Supported attribute values in the SAML attribute statement:
    • TRUSTED: the user’s device is trusted as defined by the Okta app sign-on policy.
    • NOT_TRUSTED: the user’s device is untrusted as defined by the Okta app sign-on policy.
    • UNKNOWN: the device context is unknown because one or both of the following is true:
      • Device Trust is not enabled for the given device type (Security > Device Trust)
      • Device Trust is not configured in the app’s sign-on policy (Applications > app > Sign On > Sign On Policy)
  • Use the Okta Expression Language to transform the value as needed for your use case.
  • For example, to map Okta terms for a trusted device context to relevant Salesforce terms, you would enter this statement in the Value field:
    • device.trusted == "TRUSTED" ? "HIGH ASSURANCE" : "STANDARD"
  • The above statement transforms terms as follows:
    • TRUSTED -> HIGH ASSURANCE
    • NOT_TRUSTED -> STANDARD
    • UNKNOWN -> STANDARD

<saml2:AttributeValue xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:type="xs:string">NOT_TRUSTED</saml2:AttributeValue>

Configure Expressions

  1. source refers to the object on the left-hand side:
    • Can be used in either Okta to App or App to Okta mappings.
    • Example: source.firstName
  2. user refers to the Okta user profile.
    • Can only be used in the Okta to App mapping, where it returns the same result as source
    • Example: user.firstName
  3. appUser (implicit reference) refers to the in-context app (not Okta user profile):
    • Can only be used in the Okta to App mapping, where it returns the same result as source.
    • Example: appUser.firstName
  4. appUserName (explicit reference) refers to a specific app by name:
    • Can be used in either Okta to App or App to Okta mappings.
    • Used to reference an app outside the mappings.
    • Example: google.nameGivenName
    • If multiple instances of an app are configured, additional app user profiles that follow the first instance are appended with an underscore and a random string.
      • Example: google, google_<random>, google_<random>.

Selective Profile Push (uni-directional)

Along with mapping, the selective profile push feature allows admins to select which attributes are pushed from Okta to an app when a provisioning event occurs. While mapping may be bi-directional, selective profile push is uni-directional, meaning that this data can only be pushed from Okta to a target app.

  • Apply mapping on user create and update: This pushes data when a user is created and also when there is a change in their profile.
  • Apply mapping on user create only: This pushes data only when a new user is created, and does not automatically push data when a user profile changes.
  • Do Not map: This removes an existing mapping.

Hide sensitive attributes

  • Okta allows you to mark an attribute in the Okta user profile as sensitive, which ensures that no one in Okta can view the information stored in that attribute field.
  • No Okta admins or end users will have access to any data marked sensitive.
  • Only Okta Super admins can mark an attribute as sensitive and use sensitive attributes in mapping attributes or SAML assertions.
    • Passing sensitive data from one application to another through Okta.
      • For example, if you wanted to pass along a user’s employee number from a downstream app to an upstream app but do not want that information visible to Okta users or admins, you can mark that attribute as sensitive.
      • If the employee number is stored in AD, you would map the AD attribute to the appropriate field in the Okta user profile and mark the Okta attribute as sensitive.
      • Then map the Okta attribute to the upstream app (for example, Workday). The data will flow from AD through Okta to Workday.
    • Including sensitive attributes in SAML 2.0 app attribute statements.
      • You can also use these sensitive attributes in SAML assertions to provide extra validation when an end user is signing in to an app.

Incremental Imports


AD Integration

Supported AD integration features

  • Out-of-scope OU refers to an OU that does not appear in or is not selected in the relevant OU selector.

Delegated Authentication

When Okta is integrated with an AD instance:

  • DelAuth, JIT, and AD Profile Mastering are enabled by default.
  • The user enters their username and password in the Okta end user home page.
  • The username and password are transmitted over the SSL connection implemented during setup to an Okta-AD agent running behind a firewall.
  • The Okta AD agent passes the user credentials to the AD domain controller for authentication.
  • The AD domain controller validates the username and password and uses the Okta AD agent to return a yes or no response to Okta.
  • A yes response confirms the user’s identity and they are authenticated and sent to their Okta homepage.

Delegated authentication maintains persistence for directory-authenticated (DelAuth) sessions, and AD is the ultimate source for credential validation.

  • As AD is responsible for authenticating users, changes to a user’s status (such as password changes or deactivations) are immediately pushed to Okta.
  • If AD is disabled as a profile master, changes made in AD are not pushed to Okta.

Lifecycle Settings:

  • Define what happens when a user is deactivated in AD.
  • Users can be deactivated, suspended, or remain an active user in Okta.
  • Only the highest priority profile master for an Okta user can deactivate or suspend an Okta user.
  • To verify the highest priority profile master, review the Profile Masters page.


Synchronize Passwords

Sync Password: Okta -> AD

  • Enable Sync Password from the Okta Admin Console Provisioning page.
    • With Okta to AD synchronization, the Okta password is pushed to AD.
      • Useful if Okta is to be the final authentication resource and you need to use the AD instance to authenticate access to legacy resources that you can’t connect to Okta.
    • To allow Okta to synchronize with AD, DelAuth must be OFF.
      • AD agent needs additional permissions to write the new password to AD.
    • All password changes should be initiated in Okta and propagated to AD and users should be prohibited from changing their passwords directly in AD.

Okta Password Sync agent: AD -> Okta

  • Install on all integrated domain controllers in the domain.
    • When DelAuth to AD is enabled, directory passwords are not synchronized to Okta because DelAuth performs the authentication and there is no Okta password.
    • With DelAuth, users use their directory password to sign on to Okta.
    • Change password in AD.
    • This is useful to access DSSO scenario where user uses same AD password to access cloud apps via DSSO.

Screen Shot 2020-04-24 at 1.48.42 PM.png

Application Password synchronization

(Okta -> Apps)

Okta uses standard APIs to synchronize passwords with cloud and on-premises applications when they are available.

Google Suite, Salesforce, and Atlassian JIRA can use Okta to create and assign passwords when a user first accesses the application.

  • When Sync Password is enabled:
    • The default behavior is to sync the existing Okta password.
    • If AD/LDAP DelAuth is configured, the AD or LDAP password is synced.
      • Okta uses the application API to synchronize the AD or LDAP password to the application.
      • The password is stored as the application password.
  • Options:
    1. Sync Okta passwords or random passwords to provisioning-enabled apps.
    2. Sync AD/LDAP passwords to provisioning-enabled apps.

Mobile Password Synchronization

  • With Okta to mobile synchronization, the password is synchronized to the application client on the mobile device.
  • This functionality is only available for iOS and Android native mail clients that are configured with Okta Mobility Management (OMM).

Import safeguards

  • Okta enables you to set a safeguard against an unusual number of app un-assignments during user import.
  • An import safeguard is the maximum percentage of app users in an org that can be unassigned while Okta still allows the import to proceed.
  • Apply an import safeguard either at the app level, org level, or both.
    • Org-level applies to all app user assignments in an org.
    • App-level applies to individual app user assignments in an app.

Take the example of an org with 10 apps and 100 users assigned to each app, for a total of 1,000 app users:

  • Org-Level Safeguard – Applied against the total app user population of 1,000 app users.
    • If set at 20 percent, any import that would unassign more than 200 app users would be paused.
  • App-Level Safeguard – Applied against the population of users assigned to any given app (100 users, in this example).
    • If set at 50 percent, any import that would unassign more than 50 users from a given app would be paused.

By default, the app-level and org-level safeguards are enabled and set at 20% each.

Match imported users

  • When importing users, set up Okta rules to match any attribute that is currently mapped from an AppUser profile to an OktaUser profile.
  • This helps you sync identities across systems and determine whether an imported user is new or whether a user profile already exists in Okta.

LDAP Integration

Supported LDAP integration features

JIT Authentication: Ability to authenticate user credentials through LDAP for access into Okta, and update group memberships and profile information before access.

  • JIT Authentication = Authentication + Update profile + Update group membership 

LDAP incremental imports: Okta only supports time stamp-based change tracking.

  • To identify changes made since the last import, the agent uses modifyTimestamp.
  • If directory supports modifyTimestamp, incremental imports work.

Schema Discovery: Okta only adds attributes to the directory profile if they already exist in the directory, so Okta first performs a schema discovery step to populate the attribute picker.

  • For Okta to discover an attribute, it must be added to an object within the User object hierarchy in the directory: the user object, a parent (e.g., top), or auxiliary objects.

LDAP Interface

LDAP Interface allows cloud-based LDAP authentication against Okta’s UD instead of an on-premises LDAP server or Active Directory.

The LDAP Interface connects LDAP applications to UD without installing and maintaining Okta LDAP agents:

  • LDAP agent is usually deployed inside your firewall.
  • LDAP interface is managed in the cloud.

The LDAP Interface lets you use Okta to centralize and manage your LDAP policies, users, and applications that support LDAP authentication protocol.

  • The LDAP Interface is a cloud proxy that consumes LDAP commands and translates them to Okta API calls, providing a straightforward path to authenticate legacy LDAP apps in the cloud.
  • To enhance security, add MFA to LDAP apps with Okta Verify Push and One-Time-Password (OTP).
  • The Okta LDAP agent synchronizes user profiles to or from an existing LDAP directory.
  • The LDAP interface lets you migrate certain applications from LDAP or AD servers to Okta.

Use MFA with LDAP Interface

  • The format for entering your password and MFA token is: <password,MFAtoken>
    • For example, enter the following for Okta Verify: password,123456
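For example, a minimal ldapsearch sketch of a bind through the LDAP Interface (the host, DNs, and credentials below are placeholders; check your org’s LDAP Interface settings for the actual values):

# The -w value carries "<password>,<MFA token>" per the format above
ldapsearch -H ldaps://${yourOktaSubdomain}.ldap.okta.com:636 \
  -D "uid=joe.stormtrooper@example.com,ou=users,dc=${yourOktaSubdomain},dc=okta,dc=com" \
  -w "MyP@ssw0rd,123456" \
  -b "ou=users,dc=${yourOktaSubdomain},dc=okta,dc=com" \
  "(uid=joe.stormtrooper@example.com)"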

Screen Shot 2020-04-27 at 5.23.18 PM.png

CSV directory integration

CSV directory integration is a lightweight out-of-the-box option that lets you build custom integrations for on-premises systems using the On-Premises Provisioning agent (OPP).

  1. All active users must be present in every CSV import (see the sample CSV after this list).
    • The CSV file is a representation of all active users from your source system.
    • Any user missing from an import is considered inactive.
    • Okta uses the Unique Identifier that you designated as the primary identifier for each user.
  2. The import behaves as follows:
    • If a user is missing during the latest import, Okta assumes the user is no longer active and deactivates the user in Okta.
    • If a new user appears who did not exist in Okta during a previous import as denoted by their unique identifier, then Okta creates the user.
    • If a user is present in Okta and was present in the latest import, Okta treats the current data in the CSV file as the source of truth, and executes any updates to that user’s attributes.
    • OPP also lets you use additional provisioning functionality such as:
      • profile push
      • password push
      • group push
      • user import
      • group import
      • user deactivation
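A minimal sketch of such a CSV (the column names are hypothetical; the only hard requirement is the designated Unique Identifier column, here uid):

uid,firstName,lastName,email
e10001,Sally,Admin,sally.admin@example.com
e10002,Stan,Lee,stan.lee@example.com

If e10002 were missing from the next import, Okta would treat that user as no longer active and deactivate them.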

Groups

Group types:

  • Native Okta groups: The default group Everyone contains all users in the Okta org.
  • Active Directory groups: Active Directory (AD) is the most common source for groups.
    • Use the Okta AD agent to integrate your AD instance with Okta.
    • No support for Domain Local Groups containing members from multiple domains.
    • Universal Security Groups with cross-domain membership are supported if there is a two-way trust established between the domains.
    • No support for USG cross-forest membership.
  • LDAP groups: Use the Okta LDAP agent to import groups from LDAP-compliant Windows and Unix servers.
  • Application groups: Some applications have groups that can be imported into Okta. The ability to import these groups depends on whether the groups can be accessed through an API and whether the application has been integrated with Okta. Examples:
    • Box
    • Google Apps
    • Jira
    • Microsoft Office 365
    • Workday

Screen Shot 2020-04-19 at 10.41.56 PM.png

  • App group deletion:
    • Groups that were imported from an application cannot be deleted directly in Okta.
    • However, you can use the import feature to remove them:
      • Open the app instance and delete the group there.
      • The deleted group is removed from Okta during the next scheduled (or manual) import.

Universal Security Group (USG) & Domain Local Groups

In AD, a universal security group (USG) allows for membership across all trusted forests in an AD environment. By default, USGs only exist in Okta if there is an AD agent in a domain importing users and groups. Enabling the USG option ignores domain boundaries when importing group memberships for users. This assumes that the relevant domains are connected in Okta.

Only groups from connected domains are imported as USG objects in Okta:

  • Deploy an AD agent for every domain in the forest that contains the USG objects you want to sync with Okta.
  • USGs with cross-domain membership are supported if there is a two-way trust established between the domains.
  • Each connected domain then imports its groups.
  • When a user’s group memberships match any groups that were imported, Okta syncs the memberships for the user to each group.
  • This option provides greater control of group imports from on-premises apps to Okta.
  • USGs do not support cross-forest membership.

Okta does not support Domain Local Groups:

  • Okta does not support Domain Local Groups containing members from multiple domains.

Nested groups

Many directory systems and applications support the concept of nested groups (groups in groups). Okta does not currently support nested groups; memberships are flattened on import.

  • Okta imports all nested group memberships and adds the user directly to each group in Okta.
  • In the example below, the group in AD (left) has two child groups. The resulting group in Okta (right) lists members without nesting.

Group04.png

Group Push & Enhanced Group Push

Group Push enables you to take existing Okta groups and their memberships and push them to provisioning-enabled, third-party applications.

  • These memberships are then mastered by Okta.

  • Limitation
    • Using the same Okta group for assignments and for group push is not currently supported.
    • To maintain consistent group membership between Okta and the downstream app, you need to create a separate group that is configured to push groups to the target app.

Enhanced Group Push enables you to push from Okta to existing groups in specific apps.

  • You can push a group whose name already exists in the third-party app.
  • These memberships are then mastered by Okta.
  • Enhanced Group Push is available for these integrations:
    • Active Directory, Adobe CQ, Box, DocuSign, Dropbox for Business, G Suite, Jira, Jira On-Prem, Jive Software, Litmos, Org2Org, ServiceNow UD, Slack, Smartsheet, Workplace by Facebook, Zendesk.

App-Level MFA

Configure app-level MFA by itself, or configure both org-level MFA and app-level MFA together.

Edit Rule screen

MFA for groups and users

MFA can be applied to specific groups and users:

  • All apps are supported except for Microsoft clients that use active mode authentication.
  • Microsoft Office 365 is supported; outdated Microsoft Office thick clients are not supported.

Windows Credential Provider for MFA

  • The Okta Credential Provider for Windows prompts users for MFA when they sign in to supported Windows servers with an RDP client.

Custom IdP Factor Authentication

IdP factor authentication allows admins to enable a custom SAML-MFA factor based on a configured Identity Provider.
  • Once an IdP factor has been enabled and added to a factor enrollment policy, users who sign in to Okta may use it to verify their identity at sign in. 
  • End users are directed to the Identity Provider in order to authenticate and then redirected to Okta once verification is successful.
  • With this feature you can:
    • Add a custom IdP factor for existing IdP authentication.
    • Enable or disable the custom factor from the admin console.
    • Link an existing SAML 2.0 Identity Provider to use as the custom factor provider.


Custom TOTP Factor

Custom TOTP Factor allows admins to enroll users in a custom TOTP factor by importing a seed into Okta and authenticating users with the imported hardware token.

  • Successful factor enrollment requires passing a profile ID and shared secret for each token.
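As a sketch, enrollment through the Factors API could look like the following; the factorProfileId and sharedSecret values are placeholders, and the exact payload should be confirmed against the current Okta API reference:

curl -v -X POST \
-H "Accept: application/json" \
-H "Content-Type: application/json" \
-H "Authorization: SSWS ${api_token}" \
-d '{
  "factorType": "token:hotp",
  "provider": "CUSTOM",
  "factorProfileId": "fpr20l2mDyaUGWGCa0g4",
  "profile": {
    "sharedSecret": "484f97be3213b117e3a20438e291540a"
  }
}' "https://${yourOktaDomain}/api/v1/users/${userId}/factors"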

Factor Sequencing

Allows an end user to sign in to their org by authenticating with a series of configured MFA factors in place of a standard password.
  • Verify that the factors in at least one factor chain are marked as Required for enrollment.
  • For example, define the following two factor sequences in your sign-on policy:
    • a. SMS and Okta Verify
    • b. Okta Verify and Security Questions
  • Your end users are then required to enroll in sequence (a) or (b) for successful authentication to take place.

Federation Broker Mode

  • Allows SSO without pre-assignment of apps to users.
  • Access is managed only by the sign-on policy and the authorization rules of each app.
  • This mode can improve import performance and can be especially helpful for larger-scale orgs that manage many users and/or apps.
  • Only for custom SAML Wizard and OIDC apps, not for those available in the OIN.
  • Supports SP-initiated flows ONLY; hence, no chiclet option on the Okta homepage.
  • Hidden tabs may include Provisioning, Imports, Group Push, and Mobile in the AIW.
  • The Assignments tab of an app no longer lists user assignments, as the app is no longer subject to individual assignments.
  • No audit reporting.
  • No provisioning.
  • If you disable this mode after implementation, previously created user assignments can be restored.

Network Zone

  • A network zone is a security perimeter used to limit or restrict access to a network based on a single IP address, one or more IP address ranges, or a list of geolocations.
  • Network zones may be incorporated into:
    • Policies
    • Application sign-on rules
    • VPN Notifications
    • Integrated Windows Authentication (IWA).

Zone Types

  1. IP Zones: A type of zone where the network perimeter is defined by IP addresses.
  2. Dynamic Zones: A type of zone where the network perimeter is defined by one or more of the following:
  • ASNs (Autonomous System Numbers) are used to uniquely identify networks on the Internet.
    • ISPs can apply to obtain one or multiple ASNs assigned to them.
    • While an ISP name can change, its assigned ASN is reserved and immutable.
  • The IP Types setting checks whether a client uses a proxy and determines the type of proxy.
    • Any: Ignores all proxy types. If selected, at least one of the following must be defined:
      • Locations
      • ISP ASNs
    • Any proxy: Considers clients that use a Tor anonymizer proxy or a non-Tor anonymizer proxy type.
    • Tor anonymizer proxy: Considers clients that use a Tor anonymizer proxy.
    • Not Tor anonymizer proxy: Considers clients that use non-Tor proxy types.

Screen Shot 2020-04-20 at 1.10.29 AM.png

{
  "type": "IP",
  "id": "nzouagptWUz5DlLfM0g3",
  "name": "newNetworkZone",
  "status": "ACTIVE",
  "created": "2017-01-24T19:52:34.000Z",
  "lastUpdated": "2017-01-24T19:52:34.000Z",
  "system": false,
  "gateways": [
    {
      "type": "CIDR",
      "value": "1.2.3.4/24"
    },
    {
      "type": "RANGE",
      "value": "2.3.4.5-2.3.4.15"
    }
  ],
  "proxies": [
    {
      "type": "CIDR",
      "value": "2.2.3.4/24"
    },
    {
      "type": "RANGE",
      "value": "3.3.4.5-3.3.4.15"
    }
  ]
}

Screen Shot 2020-04-20 at 1.15.25 AM.png

{
    "type": "DYNAMIC",
    "id": "nzowc1U5Jh5xuAK0o0g3",
    "name": "test",
    "status": "ACTIVE",
    "created": "2019-05-17T18:44:31.000Z",
    "lastUpdated": "2019-05-21T13:50:49.000Z",
    "system": false,
    "locations": [{
        "country": "AX",
        "region": null
    },
    {
        "country": "AF",
        "region": "AF-BGL"
    }],
    "proxyType": "Any",
    "asns": ["23457"]
}
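Zones like the ones above can also be created through the Zones API. A minimal sketch, assuming an API token with the required admin permissions:

curl -v -X POST \
-H "Accept: application/json" \
-H "Content-Type: application/json" \
-H "Authorization: SSWS ${api_token}" \
-d '{
  "type": "IP",
  "name": "newNetworkZone",
  "gateways": [
    { "type": "CIDR", "value": "1.2.3.4/24" }
  ],
  "proxies": [
    { "type": "CIDR", "value": "2.2.3.4/24" }
  ]
}' "https://${yourOktaDomain}/api/v1/zones"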

Behavior Security Detection

  • Deciding when to require a second MFA factor is a common challenge for admins.
  • Prompting users to authenticate with an additional factor on every sign-in attempt can frustrate them; however, a less restrictive policy increases the risk of account compromise.
  • With this feature, admins can configure the system so that individual end users are prompted for an additional MFA factor only when there is a change in user behavior that the admin defines.

There are two components of security behavior detection:

  1. Define the behavior to track.
  2. Define an action to take if there is a change in trackable behavior for an end user.
Examples of trackable behaviors:
  • Sign in from a new country, state, or city
  • Sign in from a new location more than a specified distance from previous successful sign ins
  • Sign in from a new device
  • Sign in from a new IP address
  • Sign in from a location deemed unfeasible for a user to travel to across two successive sign-in attempts.
Examples of actions to take:
  • Permit access
  • Require the end user to validate with an additional MFA
  • Set the session lifetime

Trusted App Flows


System Log lists behavior that is evaluated for any sign-in attempts.

  • For user.session.start and policy.evaluate.sign_on events.
  • Possible values:
    • POSITIVE: Behavior is detected. POSITIVE results in the policy rule matching – if MFA is configured for the rule, Okta prompts for MFA.
    • NEGATIVE: Behavior is not detected. NEGATIVE results in the policy rule not matching – if MFA is configured for the rule, Okta does not prompt for MFA.
    • UNKNOWN: Not enough history to detect behavior. UNKNOWN results in the policy rule matching – if MFA is configured for the rule, Okta prompts for MFA.
    • BAD_REQUEST: Not enough information from the sign-in attempt to detect behavior. For example, if the cookies and device fingerprint are missing, Okta treats it as a BAD_REQUEST, which results in the policy rule matching – if MFA is configured for the rule, Okta prompts for MFA.

Risk Scoring

  • Based on a machine-learning risk engine that determines the likelihood of an anomalous sign-in event.
  • New users are initially set to high risk; over time, the risk level lowers as more information is gathered.
  • Risk Scoring is designed to complement automation detection. It is not:
    • A substitute for bot management or automation detection
    • A replacement for Web Application Firewalls (WAFs)
    • A means of meeting any type of security compliance requirement
Browser flow for new devices relies on:
  1. An Okta security image cookie in the browser.
  2. A browser fingerprint collected by the Okta Sign-In Widget.

Enable strong MFA factors in factor enrollment policies

  • Security Questions: Do not use as a second factor.
  • SMS/Email/Voice: Avoid using as a second factor.
  • Okta Verify with Push Notifications: Enable if available for your org. If not available, use Okta Verify or Google Authenticator.
  • WebAuthn factors (includes YubiKey devices): Enable if available for your org.
  • Okta also recommends blacklisting any IPs that are identified as a Tor anonymizer proxy.

Events & System Log API

  • Events API’s (/api/v1/events) representation is the Event object.
    • action.objectType attribute denotes the event type.
  • System Log API’s (/api/v1/logs) representation is the LogEvent object.
    • More structure and a much richer set of data elements than Event.
    • eventType attribute represents the event type.
    • LogEvent’s legacyEventType attribute identifies the equivalent Event action.objectType value.
  • Event Type Mapping section of this guide provides a static mapping of Events API event types to System Log API event types.
  • System Log API expands and enriches the data model to support storing message values as atomic, independent attributes.
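For example, a System Log query filtered to sign-in events (the since value is illustrative):

curl -v -X GET \
-H "Accept: application/json" \
-H "Authorization: SSWS ${api_token}" \
"https://${yourOktaDomain}/api/v1/logs?since=2020-04-01T00:00:00Z&filter=eventType+eq+%22user.session.start%22"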

Inline Hooks

  • Inline hooks are synchronous calls to an external service.

Hook Call Steps Diagram

Hook Request and Response

  • The JSON payload of the response the external service sends can contain a commands object, in which the service sends commands to Okta that affect the course of the Okta process flow.
    • The commands available vary depending on the type of inline hook.
  • Header-based authentication with a customer-issued API key.
  • Default timeout of 3 seconds.
    • Okta attempts at most one retry.
  • A request is not retried if the customer endpoint returns a 4xx HTTP error code.
  • Any 2xx code is considered successful and not retried.
  • No redirects from the external service endpoint.
  • JSON payload objects sent to Okta by the external service:
    • commands
      • Commands to Okta that affect the process flow being executed and modify values within Okta objects.
    • error
      • Returns error messages to Okta (see the sketch after this list).
    • debugContext
      • Specifies additional information to make available in the System Log in connection with the call to your hook.
  • Maximum of 10 inline hooks per org.
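As a sketch, an external service that wants to surface an error to Okta could respond with an error object like this (the message text is illustrative):

{
  "error": {
    "errorSummary": "Request rejected by external service",
    "errorCauses": [
      {
        "errorSummary": "User is on the exclusion list",
        "reason": "INVALID_USER"
      }
    ]
  }
}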

Screen Shot 2020-04-19 at 11.15.19 PM.png

User Import Inline Hook

Request Objects: The outbound call from Okta to the external service includes the following objects in its JSON payload:

  1. data.appUser.profile: Provides the name-value pairs of the attributes contained in the app user profile of the user who is being imported.
  2. data.user: Provides information on the user profile currently set to be used for the user who is being imported.
    • data.user.profile contains the name-value pairs of the attributes in the user profile. If the user has been matched to an existing Okta user, a data.user.id object will be included, containing the unique identifier of the Okta user profile.
  3. data.action.result: The current default action that Okta will take for the user being imported. The two possible values are:
    • CREATE_USER: A new Okta user profile will be created for the user.
    • LINK_USER: The user will be treated as a match for the existing Okta user identified by the value of data.user.id.
  4. data.context: This object contains a number of sub-objects, each of which provides some type of contextual information. You cannot affect these objects by means of the commands you return. The following sub-objects are included:
    • data.context.conflicts: List of user profile attributes that are in conflict.
    • data.context.application: Details of the app from which the user is being imported.
    • data.context.job: Details of the import job being run.
    • data.context.matches: List of Okta users currently matched to the app user based on import matching. There can be more than one match.
    • data.context.policy: List of any Policies that apply to the import matching.

Response Objects

Screen Shot 2020-04-20 at 7.43.40 AM.png

Sample JSON Payload of Request from Okta to External Service

{
   "source":"cal7eyxOsnb20oWbZ0g4",
   "eventId":"JUGOUiYZTaKPmH6db0nDag",
   "eventTime":"2019-02-27T20:59:04.000Z",
   "eventTypeVersion":"1.0",
   "cloudEventVersion":"0.1",
   "eventType":"com.okta.import.transform",
   "contentType":"application/json",

   "data":{
      "context":{
         "conflicts":[
            "login"
         ],
         "application":{
            "name":"test_app",
            "id":"0oa7ey7aLRuBvcYUD0g4",
            "label":"app7ey6eU5coTOO5v0g4",
            "status":"ACTIVE"
         },
         "job":{
            "id":"ij17ez2AWtMZRfCZ60g4",
            "type":"import:users"
         },
         "matches":[
         ],
         "policy":[
            "EMAIL",
            "FIRST_AND_LAST_NAME"
         ]
      },

      "action":{
         "result":"CREATE_USER"
      },

      "appUser":{
         "profile":{
            "firstName":"Sally2",
            "lastName":"Admin2",
            "mobilePhone":null,
            "accountType":"PRO",
            "secondEmail":null,
            "failProvisioning":null,
            "failDeprovisioning":null,
            "externalId":"user221",
            "groups":[
               "everyone@clouditude.net",
               "tech@clouditude.net"
            ],
            "userName":"administrator2",
            "email":"sally.admin@clouditude.net"
         }
      },

      "user":{
         "profile":{
            "lastName":"Admin2",
            "zipCode":null,
            "city":null,
            "secondEmail":null,
            "postAddress":null,
            "login":"sally.admin@clouditude.net",
            "firstName":"Sally2",
            "primaryPhone":null,
            "mobilePhone":null,
            "streetAddress":null,
            "countryCode":null,
            "typeId":null,
            "state":null,
            "email":"sally.admin@clouditude.net"
         }
      }
   }
}

Sample JSON Payloads of Responses from External Service to Okta

{
  "commands": [{
    "type": "com.okta.action.update",
    "value": {
      "result": "LINK_USER"
    }
  }, {
    "type": "com.okta.user.update",
    "value": {
      "id": "00garwpuyxHaWOkdV0g4"
    }
  }]
}

{
  "commands": [{
    "type": "com.okta.user.profile.update",
    "value": {
      "firstName": "Stan"
    }
  }]
}

{
  "commands": [{
    "type": "com.okta.appUser.profile.update",
    "value": {
      "firstName": "Stan",
      "lastName": "Lee"
    }
  }]
}

Event Hooks

  • Event hooks are outbound calls from Okta that trigger process flows within your own software systems.
  • They are sent when specific events occur in your org, and they deliver information about the event.
  • Unlike inline hooks, event hooks are asynchronous and do not offer a way to affect the Okta process flow.
  • After sending the call, the Okta process flow continues without waiting for a response from the called service.
  • You implement a web service with an Internet-accessible endpoint; event hooks follow the industry concept of webhooks.
  • Before the introduction of event hooks, polling the System Log API was the only method your external software systems could use to detect the occurrence of specific events in your Okta org.
  • Event hooks provide an Okta-initiated push notification instead.
  • See Event Types for the events that can trigger event hooks.
  • Steps (see the sketch after this list):
    • Add an event hook
    • Verify the external endpoint
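A hedged sketch of both steps via the Event Hooks API (the event type, endpoint URI, and header secret are illustrative):

# 1. Add an event hook that fires when a user is deactivated
curl -v -X POST \
-H "Accept: application/json" \
-H "Content-Type: application/json" \
-H "Authorization: SSWS ${api_token}" \
-d '{
  "name": "Deactivation notifier",
  "events": {
    "type": "EVENT_TYPE",
    "items": [ "user.lifecycle.deactivate" ]
  },
  "channel": {
    "type": "HTTP",
    "version": "1.0.0",
    "config": {
      "uri": "https://your-service.example.com/okta-hook",
      "authScheme": { "type": "HEADER", "key": "Authorization", "value": "${secret}" }
    }
  }
}' "https://${yourOktaDomain}/api/v1/eventHooks"

# 2. Verify the external endpoint (it must echo back the one-time verification value)
curl -v -X POST \
-H "Authorization: SSWS ${api_token}" \
"https://${yourOktaDomain}/api/v1/eventHooks/${eventHookId}/lifecycle/verify"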

Access Request Workflow

  • A complete, multi-step approval workflow through which end users can request access to apps.
  • Admins can designate approvers to grant users access to self-service applications.
  • The Access Request Workflow feature allows business application owners, rather than IT, to grant users access to apps and assign entitlements in apps that require them.
  • Use Access Request Workflow to:
    • Designate group or individual approvers
    • Create customized notifications
    • Add comments and notes
    • Configure customizable timeout rules

Automations

  • Similar to a batch job.
  • Automations let you quickly prepare for and respond to situations that occur during the lifecycle of end users who are assigned to an Okta group.
  • For example, an automation can notify end users in advance of an impending inactivity lockout.
    • If a user has been inactive for a set number of days and is on the verge of being locked out, the automation can alert the inactive user in advance.
  • Conditions: the criteria that trigger Okta to perform actions upon a group of end users. Conditions can be scheduled to run once or to recur daily. The following conditions are currently available:
    • User inactivity in Okta
    • User password expiration in Okta
  • Actions: the actions that Okta performs when the scheduled conditions are true. The following actions are currently available:
    • Send email to the user

Screen Shot 2020-04-28 at 10.44.43 PM.png

Workflows

  • Similar to Zapier (automated integration workflows).
  • Workflows is an interface-driven, no-code design console that facilitates the implementation of automated business processes, especially for identity-related use cases.
  • A key component of the product is integration with a wide range of third-party apps and functions.
  • A Workflow, or Flow, is a sequence of steps that represent the events, logic, and actions in a use case.

With Okta Workflows, you can:

  • Provision and deprovision app accounts: Use Case: Provision and Deprovision.
  • Sequence actions with logic and timing. Okta Workflows can create deactivated accounts in all apps one week before a new employee’s start date, and then activate them on their first day. When an employee leaves your company, Okta Workflows can remove access to all apps except payroll, and then remove that access after one year. Use Case: Change time- and context-based identity entitlement.
  • Resolve identity creation conflicts. Okta Workflows can host logic that detects username conflicts and generates unique usernames. See Use Case: Resolve identity conflicts.
  • Respond to security incidents. When Okta detects a suspicious sign-in, Okta Workflows can notify your security team through PagerDuty or create a ticket in ServiceNow. See Use Case: Send notifications for lifecycle activity.
  • Log and share lifecycle events. Okta Workflows can query Okta APIs and syslog events, run logic, and compile data into a CSV file. Then, Okta Workflows can email that file to your teams. See Use Case: Report lifecycle activity.

Flow cards

  • Event – What has to happen for your Flow to begin? The first card in any Flow is an Event.
  • Action – What should happen if the application event occurs? Action cards instruct your Flow to send commands to applications.
  • Function – Use cases aren’t always linear. How should action cards account for different scenarios? Function cards let you act on the data from a card or branch into another logical flow.

How does it work?

  • Connector – Which applications are involved in your Flow? Connectors enable you to interact with them without setting up APIs.
  • Connection – A connection is a unique level of access to the connected app (for example, admin or end user).
  • Input – Input fields determine how an action or function card proceeds. For example, the input field of the Search for User action card above is User Details.
  • Output – Output fields contain the results that are generated by the event, action, or function card. In the example Flow above, the User Unassigned from Application event card produces output values such as Date and Time, Message, Event ID, Event Type, and Event Time.
  • Mapping – The movement of data between cards is referred to as mapping. To map data between cards, drag and drop the output field of one card to the input field of another card. Be sure that the formats of the fields match (text, number, true/false, date & time, object, or list).

Key rotation

Org Authorization Server

  • Configure and perform key rollover/rotation at the client level (if you need to use your own key).
  • For security purposes, Okta automatically rotates the keys used to sign the ID token (for Okta APIs, Okta owns the key).
    • You can’t manually rotate the Org Authorization Server’s signing keys.
  • Okta doesn’t expose the public keys used to sign the access tokens minted by the Org Authorization Server (for Okta APIs, Okta owns the key).
    • Use the /introspect endpoint to validate the access token.
  • Pin a specific client to a specific key by generating a key credential and updating the application to use it for signing.

Custom Authorization Servers

  • Configure and perform key rollover/rotation at the Authorization Server level.
  • For security purposes, Okta automatically rotates keys used to sign tokens.
    • In case of an emergency, Okta can rotate keys as needed.
  • Okta always publishes keys to jwks_uri.
  • To save the network round trip, app should cache the jwks_uri response locally.
    • HTTP caching headers are used and should be respected.
  • You can switch the Authorization Server key rotation mode by updating the Authorization Server’s rotationMode property (manual or automatic).
  • Pin that specific client to a specific key by generating a key credential and updating the application to use it for signing.
    • This overrides the Custom AS rollover/pinning behavior for that client.
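For example, fetching the current signing keys from a Custom Authorization Server’s jwks_uri (the server ID is a placeholder); cache the response and respect the HTTP caching headers:

curl -v -X GET \
-H "Accept: application/json" \
"https://${yourOktaDomain}/oauth2/${authServerId}/v1/keys"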

Authentication API vs OAuth 2.0 vs OpenID Connect

There are three major kinds of authentication that you can perform with Okta:

  • The Authentication API controls access to your Okta org and applications.
    • It provides operations for user authentication, MFA operations, and recovery (forgotten passwords, account unlock).
    • Use this to work with the Okta API and control user access to Okta.
  • The OAuth 2.0 protocol controls authorization to access a protected resource, such as a web app, native app, or API service.
    • Use this to control access to your own application.
  • The OpenID Connect protocol is built on OAuth 2.0 and helps authenticate users and convey information about them.
    • Use this to control access to your own application.

OAuth

Screen Shot 2020-04-18 at 5.03.16 PMScreen Shot 2020-04-18 at 5.03.45 PM

Okta Organizations

  • An Okta organization (org) is a root object and a container for all other Okta objects.
  • Contains resources such as users, groups, and applications, as well as policy and configurations for Okta environment.
  • Mandatory items: users and applications.
    • Users can be created in Okta, imported through directory integrations, or imported through application integrations.
    • Applications are connections to public apps (such as Office 365) or proprietary applications (such as your own apps).

Administrators

Screen Shot 2020-05-06 at 11.03.05 AM

  • Administrator roles can be assigned to users or groups.
  • Role types include Super Admin, Org Admin, and others.
  • Permissions are the individual capabilities an administrator has.
  • Permissions are categorized as follows:
    1. Org-wide Settings
    2. Okta Sign-On
    3. User Management
    4. Group Management
    5. Application Management
    6. MFA
    7. Mobile Policies
    8. Mobile Devices
    9. Event and Inline Hooks
    10. OpenID Connect End-to-End Scenario
    11. API Tokens
    12. OMM – Applications
    13. OMM – Wifi (EA)
  • Super admin is the only role that can manage users or groups with admin privileges.
  • Group admins lose their permissions if their group is assigned admin privileges.
  • Admin roles can’t be assigned to groups with more than 5,000 members.
  • Group rules don’t work with admin groups.
    • This prevents delegated admins from erroneously increasing their own or other users’ administrative privileges.
  • Admins lose their permissions when they are deactivated.
    • If you reactivate a former admin, you also need to reassign privileges to them.

Role Target

  • Role targets scope an admin role’s permissions to a smaller subset of Groups or Apps within the org.
  • Targets limit an admin’s permissions to a targeted area of the org.
  • Admin roles can target Groups, Applications, and Application Instances.
    1. Group targets
      • Grant an admin permission to manage only a specified Group.
      • For example, an admin role may be assigned to manage only the IT Group.
    2. App targets
      • Grant an admin permission to manage all instances of specified Apps. Target Apps are Okta catalog Apps.
      • For example, there can be multiple configurations of an Okta catalog App, such as Salesforce or Facebook. When you add a Salesforce or Facebook App as a target, that grants the admin permission to manage all instances of those Apps and create new instances of them.
    3. App Instance targets
      • Grant an admin permission to manage an instance of an App or instances of multiple Apps. App Instances are specific Apps that admins have created in their org.
      • For example, there may be a Salesforce App configured differently for each sales region of a company. When you create an App Instance target, an admin may be assigned to manage only two instances of the configured Salesforce Apps and perhaps assigned to manage an instance of another configured App, such as Workday.

Group membership admin role vs. Group admin role

Group membership admin role: Grants permission to view all users in an org and manage the membership of groups.

  • The group membership admin role can be a standalone assignment for admins who need to add and remove existing users in a group, or it can be combined with a role like group admin or help desk admin for broader user management permissions.

Group admin role: While this role performs mainly user-related tasks (create users, deactivate users, reset passwords), it can also be used to restrict these tasks to a select group or groups of Okta users. In essence, you can “delegate” permissions to a particular admin to manage a specific group.

  • A use for this role might be a franchise, where each location needs to silo and control its location-specific teams.
  • Each franchise would need to create and manage its own data without affecting or being affected by the others.

Third-party admins

  • The parent organization may create a custom portal for administrator functions, so that third-party admins do not even see the Okta interface.
    • One use case for this feature is when support services are outsourced to third parties. These third parties act on behalf of the organization, managing users’ needs, but do not actually see the Okta user interface.
    • Another use case is in a B2B2C scenario where a “hub-and-spoke model” is set up with end customers (external users) created as admins so they can run their own Okta org, often with a custom portal created with Okta APIs so these external users aren’t aware that they are admins in Okta.
  • In both these scenarios, because the external users are given admin roles within Okta, they receive the default Okta admin emails: welcome emails, admin email alerts and Okta customer notifications.
  • By introducing the concept of a third-party admin in Okta, we are able to treat these admins differently than the typical Okta admins who interact directly with the Okta Admin Console.

API Token

  • API tokens are generated with the permissions of the user that created the token.
    • If a user’s permissions change, so do the token’s.
  • Okta recommends generating API tokens from a service account whose permissions do not change.
  • API tokens are valid for 30 days and automatically renew every time they are used with an API request.
    • When a token has been inactive for more than 30 days, it is revoked and cannot be used again.
    • Tokens are also only valid while the user who created the token is active.
    • If the user is reactivated, the API token is accepted again.
  • Okta agents use API tokens during installation to access the org.
    • While these tokens are similar to the standard API token, they are managed by Okta.
    • Agent tokens are usually managed when you activate, deactivate, or reactivate an agent.
    • Some agents, such as the Okta AD agent, automatically revoke their tokens when the agent is deactivated.
  • The following color codes show token status:
    • Green – the token has been used within the last three days.
    • Gray – the token has not been used within the last three days, and it is at least seven days from its expiration date.
    • Red – the token is within seven days of expiring.
    • Yellow – the token is suspicious.
      • A suspicious token is associated with an agent that is not registered in Okta.
      • Normal agent deployments do not create suspicious tokens.
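API tokens are passed in the Authorization header using the SSWS scheme. For example (the token value is a placeholder):

curl -v -X GET \
-H "Accept: application/json" \
-H "Authorization: SSWS ${api_token}" \
"https://${yourOktaDomain}/api/v1/users?limit=25"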

Trusted Origins

  • All cross-origin web requests and redirects from Okta to your organization’s websites must be explicitly whitelisted.
  • Use the Trusted Origins tab to grant access to websites that you control and trust to access Okta org through the Okta API.
  • An Origin is a security-based concept that combines the URI scheme, hostname, and port number of a page.
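A sketch of whitelisting an origin through the Trusted Origins API (the origin URL is a placeholder; CORS and REDIRECT are the scope types):

curl -v -X POST \
-H "Accept: application/json" \
-H "Content-Type: application/json" \
-H "Authorization: SSWS ${api_token}" \
-d '{
  "name": "Example portal",
  "origin": "https://portal.example.com",
  "scopes": [ { "type": "CORS" }, { "type": "REDIRECT" } ]
}' "https://${yourOktaDomain}/api/v1/trustedOrigins"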

Rate Limiting

Requests that hit the rate limits return a “429 Too Many Requests” HTTP status code.

4 types:

  1. Org-wide rate limits (default rate limits)
    • Limit the number of API requests within any given second.
    • Apply to CRUD operations on /api/v1/… endpoints.
  2. Concurrent rate limits
    • Limit the number of active requests/live threads at any given time.
    • For concurrent rate limits, traffic is measured in three different areas:
      • Agent traffic
      • Microsoft Office 365 traffic
      • All other traffic, including API requests
    • Counts in one area are not included in counts for the other two areas.
      • For agent traffic, Okta measures each org’s traffic and sets the limit above the highest usage in the last four weeks.
      • For Office 365 traffic, the limit is 75 concurrent transactions per org.
      • For all other traffic, including API requests, the limit is 75 concurrent transactions per org.
    • The first request to exceed the concurrent limit returns an HTTP 429 error, and the first error every sixty seconds is written to the log. Reporting concurrent rate limits once a minute keeps log volume manageable.
  3. End-user rate limit
    • Limits requests from the Okta user interface to 40 requests per user per 10 seconds per API endpoint.
  4. Okta-generated email message rate limits, which vary by email type.
    • Okta enforces rate limits on the number of Okta-generated email messages that are sent to customers and customer users.
    • For example, if the number of emails sent to a given user exceeds the per-minute limit for a given email type, subsequent emails of that type are dropped for that user until that minute elapses.

DynamicScale Rate Limits

  • Example use case: MLB baseball games.
  • If your needs exceed Okta’s default rate limits for the base product subscriptions you’ve already purchased (e.g., One App, Enterprise, or IT Products), you can purchase the “DynamicScale” add-on service, which grants higher rate limits for the endpoints listed below.
  • DynamicScale increases your default rate limits by 3x to 1000x, depending on the tier multiplier you purchase.
  • Customers can purchase this add-on annually for a Production tenant or temporarily for testing in a Sandbox tenant.

API responses report the current rate limit state in headers, as shown below (see the back-off sketch after the example):

HTTP/1.1 200 OK
X-Rate-Limit-Limit: 10000
X-Rate-Limit-Remaining: 9999
X-Rate-Limit-Reset: 1516307596
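A minimal shell sketch of honoring these headers on a 429 (the endpoint is illustrative):

# Capture the status code and response headers, then back off until the reset time
status=$(curl -s -D headers.txt -o response.json -w "%{http_code}" \
  -H "Authorization: SSWS ${api_token}" \
  "https://${yourOktaDomain}/api/v1/users?limit=25")
if [ "$status" = "429" ]; then
  reset=$(grep -i '^x-rate-limit-reset' headers.txt | tr -d '\r' | awk '{print $2}')
  wait=$(( reset - $(date +%s) ))   # X-Rate-Limit-Reset is a Unix epoch time
  [ "$wait" -gt 0 ] && sleep "$wait"
  curl -s -o response.json \
    -H "Authorization: SSWS ${api_token}" \
    "https://${yourOktaDomain}/api/v1/users?limit=25"
fi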

Policy

  • Policies contain “settings”.
  • Rules contain “conditions” and “actions”.

Screen Shot 2020-04-19 at 11.39.44 PM.png

Group Password Policies

  • Group password policies let you define password policies and associated rules to enforce password settings at the group and authentication-provider level.
  • Group password policies are enforced only for Okta- and AD-mastered users.
    • For AD-mastered users, ensure that AD password policies don’t conflict with the Okta policies.
    • Passwords for AD-mastered users are still managed by the directory service.
  • Some applications, such as Office 365 and Google G Suite, check the Okta password policy when provisioning a user to ensure that the Okta policy meets the application’s password requirements.

Password Policy Types

Example rule actions object (shown here for a sign-on policy rule):

"actions": {
    "signon": {
      "access": "ALLOW",
      "requireFactor": true,
      "factorPromptMode": "SESSION",
      "rememberDeviceByDefault": false,
      "factorLifetime": 15,
      "session": {
        "usePersistentCookie": false,
        "maxSessionIdleMinutes": 120,
        "maxSessionLifetimeMinutes": 0
      }
    }
  }

Conditions object

  • A conditions object specifies the conditions that must be met during policy evaluation in order to apply the rule in question.
  • All policy conditions, as well as the conditions of at least one rule, must be met in order to apply the settings specified in the policy and the associated rule. Policies and rules may contain different conditions depending on the policy type.
  • Conditions objects work with the following objects:
    • People Condition object
      • User Condition object
      • Group Condition object
      • "people": {
            "users": {
              "exclude": [
                "00uo7dIiN4jizvY6q0g3"
              ]
            },
            "groups": {
              "include": [
                "00go6lU1wxnmPisNp0g3"
              ]
            }
          }
    • AuthContext Condition object  (Any or RADIUS)
    • Network Condition object
    • Authentication Provider Condition object
    • User Identifier Condition object
    • Patterns object (Matches/ EL)
    • Application/ App Instance Condition object
    • Platform Condition object

Add Sign On policies for applications

Okta’s Client Access Policies (CAPs)

  • Allow to manage access to enterprise apps based on the client type & device platform.
  • CAPs evaluate information included in the User-Agent request header sent from the users’ browser.
  • Because the User-Agent can be spoofed by a malicious actor, you should consider using a whitelist approach when creating CAPs and require MFA or Device Trust, as described in the following best practices:
    • Implement a whitelist consisting of one or more rules that specify the client type(s) + device platform(s) + trust posture combinations that will be allowed to access the app.
    • Require Device Trust or MFA for app access.
    • Include a final catch-all rule that denies access to anything that does not match any of the CAPs preceding rules.

IdP Discovery Policy

  • The IdP Discovery Policy determines where to route users when they attempt to sign in to your org.
  • Users can be routed to a variety of identity providers (SAML2, IWA, AgentlessDSSO, X509, FACEBOOK, GOOGLE, LINKEDIN, MICROSOFT, OIDC) based on multiple conditions.
  • All Okta orgs contain one and only one IdP Discovery Policy, with an immutable default rule that routes to the org’s sign-in page.

WebFinger

The WebFinger interface allows a client application to determine the Identity Provider that a given username (or identifier) should be routed to, based on your organization’s Identity Provider Routing Rules (IdP Discovery Policy).

Screen Shot 2020-04-20 at 9.03.58 AM.png

Request Example

curl -v -X GET \
-H "Accept: application/jrd+json" \
-H "Content-Type: application/json" \
"https://${yourOktaDomain}/.well-known/webfinger?resource=okta:acct:joe.stormtrooper%40example.com"

Response Example

In this example, there is a rule configured that has a user identifier condition which says that users with the domain example.com should be routed to a configured SAML IdP:

{
    "subject": "okta:acct:joe.stormtrooper@example.com",
    "links": [
        {
            "rel": "okta:idp",
            "href": "https://${yourOktaDomain}/sso/idps/0oas562BigqDJl70T0g3",
            "titles": {
                "und": "MySamlIdp"
            },
            "properties": {
                "okta:idp:metadata": "https://${yourOktaDomain}/api/v1/idps/0oas562BigqDJl70T0g3/metadata.xml",
                "okta:idp:type": "SAML2",
                "okta:idp:id": "0oas562BigqDJl70T0g3"
            }
        }
    ]
}

Request Example

curl -v -X GET \
-H "Accept: application/jrd+json" \
-H "Content-Type: application/json" \
"https://${yourOktaDomain}/.well-known/webfinger?resource=okta:acct:joe.stormtrooper%example.com&rel=http%3A%2F%2Fopenid.net%2Fspecs%2Fconnect%2F1.0%2Fissuer"

Response Example

In this example, there is already a rule configured that has a user identifier condition which says that users with the domain example.com should be routed to a configured SAML IdP. However, we supplied a rel parameter of http://openid.net/specs/connect/1.0/issuer that limited the result to Okta:

{
    "subject": "okta:acct:joe.stormtrooper@example.com",
    "links": [
        {
            "rel": "https://openid.net/specs/connect/1.0/issuer",
            "href": "https://${yourOktaDomain}/sso/idps/OKTA",
            "titles": {
                "und": "{subdomain}"
            },
            "properties": {
                "okta:idp:type": "OKTA"
            }
        }
    ]
}
Session Token

  • A session token is a one-time bearer token that provides proof of authentication and may be redeemed for an interactive SSO session in a user agent.
  • Session tokens can only be used once to establish a session for a user and are revoked when the token expires.
  • Okta provides a rich Authentication API to validate a user’s primary credentials and secondary MFA factor, as sketched below.
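For example, the primary authentication call that yields a sessionToken (credentials are placeholders):

curl -v -X POST \
-H "Accept: application/json" \
-H "Content-Type: application/json" \
-d '{
  "username": "joe.stormtrooper@example.com",
  "password": "MyP@ssw0rd"
}' "https://${yourOktaDomain}/api/v1/authn"

A successful response has "status": "SUCCESS" and a sessionToken that can then be exchanged for a session cookie.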

Screen Shot 2020-04-20 at 12.37.38 AM.png

User States

Status and cause:

  • Staged: Accounts become staged when they are first created, before the activation flow is initiated or if there is some administration action pending.
  • Pending user action: Accounts that are pending user action have been provisioned, but the end user has not provided verification by clicking through the activation email or providing a password.
  • Active: Accounts become active when …
  • Password Reset: Accounts are in a password reset state when:
    • The account requires a password to be set for the first time.
    • End users request a password reset or you initiate one on their behalf.
    • Note: Password Reset is considered an Active state.
  • Locked out: Accounts become locked out when they exceed the configured number of login attempts, which is set when you configure password policies.
    • Note: Locked out is considered an Active state.
  • Suspended: Accounts are in this state when you explicitly suspend them.
    • Note: Suspended is considered an Inactive state.
  • Deactivated: Accounts are deactivated when you explicitly deactivate them.
    • Note: Deactivated is considered an Inactive state.

In the API, the user status values are STAGED, PROVISIONED, ACTIVE, RECOVERY, LOCKED_OUT, PASSWORD_EXPIRED, and DEPROVISIONED.

Administrator Dashboard

Screen Shot 2020-04-26 at 9.05.04 AM.png

Tasks Page

ThreatInsight

  • ThreatInsight aggregates data across the Okta customer base and uses this data to detect malicious IP addresses that attempt credential-based attacks.
  • Note: The Okta Model for threat detection is based on a continuous evaluation of the latest data available.
  • If Okta incorrectly identifies any trusted IP addresses as suspicious, you may exempt them from ThreatInsight as needed.
  • Refer to Exempt Zones for more details.

RADIUS Integration

Okta Device Trust Solutions

  • Okta Device Trust contextual access management solutions enable organizations to protect their sensitive corporate resources by allowing only end users with managed devices to access Okta-integrated applications.

Screen Shot 2020-05-02 at 2.32.48 PM.png

WiFi Profiles and Policies

  • WiFi profiles allow end users to join an established WiFi network without having to enter any security information.
  • You can create multiple WiFi profiles and assign them to OMM-enrolled mobile devices, so users are no longer limited to just one WiFi profile per device.
  • Supports the WPA/WPA2 Enterprise protocol to enable the following:
    • Username and password authentication with a RADIUS server (no longer limited to shared-key authentication)
    • The option to add one or more server certificates needed to establish a secure connection to the WiFi network
  • Current limitations:
    • The shared key is currently unmasked.
    • Authentication with user credentials is not supported.
    • Certificate-based authentication is not supported.
    • Connection to hidden networks is not supported on Android devices.
    • The WiFi Policies feature supports only one WiFi policy on a device at a time. To support multiple WiFi profiles on a device, see Configure a WiFi profile.

AWS Keys Points

  • AWS does not copy launch permissions, user-defined tags, or Amazon S3 bucket permissions from the source AMI to the new AMI.
  • You should generate a password for each user and give these passwords to your system administrators. You should then have each user set up multi-factor authentication once they have been able to log in to the console. You cannot use the secret access key and access key ID to log in to the AWS console; rather, these credentials are used to call Amazon APIs.
  • Network throughput is the obvious bottleneck. You are not told in this question whether the proxy server is in a public or private subnet. If it is in a public subnet, the proxy server instance size itself may not be large enough to cope with the current network throughput. If the proxy server is in a private subnet, then it must be using a NAT instance or NAT gateway to communicate out to the internet. If it is a NAT instance, this may also be inadequately provisioned in terms of size. You should therefore increase the size of the proxy server and/or the NAT solution.
  • For all new AWS accounts, there is a soft limit of 20 EC2 instances per region. You should submit the limit increase form and retry the template after your limit has been increased.
  • Currently the S3 Classes are; Standard, Standard-Infrequent Access, One Zone-Infrequent Access, Reduced Redundancy Storage and for archive, Glacier & Glacier Deep Archive. Reduced Redundancy Storage is the only S3 Class that does not offer 99.999999999% durability and therefore any of the answers that contain Reduced Redundancy Storage cannot be correct.
  • The valid ways of encrypting data on S3 are
    • Server Side Encryption (SSE)-S3,
    • SSE-C,
    • SSE-KMS or a client library such as Amazon S3 Encryption Client.
  • Both the Oracle and SQL Server database engines have limits on how many databases can run per instance. Primarily, this is due to the underlying technology being proprietary and requiring specific licensing to operate.
    • Database engines based on open-source technology, such as Aurora, MySQL, MariaDB, or PostgreSQL, have no such limits. Further information: https://aws.amazon.com/rds/faqs/
  • Security Groups are stateful, and updates are applied immediately.
  • To see the process by which federated users are granted access to the AWS console.
  • The question describes a situation where low-cost OneZone-IA would be perfect. However, it also says that there is a high licence cost with each meme generation. The storage savings between IA and OneZone-IA are about $0.0025, which is small compared to the $10 for licensing. Therefore you may well be better off paying for full S3-IA.
  • You cannot tag individual folders within an S3 bucket. If you create an individual user for each staff member, there will be no way to keep their Active Directory credentials synched when they change their password. You should either create a federation proxy or identity provider and then use AWS Security Token Service to create temporary tokens. You will then need to create the appropriate IAM role for the users to assume when writing to the S3 bucket. https://docs.aws.amazon.com/IAM/latest/UserGuide/best-practices.html
  • The aim is to direct sessions to the host that will provide the correct language. Geolocation is the best option because it is based on national borders.
    • Geoproximity routing is another option where the decision can be based on distance. While latency-based routing will usually direct the client to the correct host, connectivity issues with the US Regions might direct traffic to AP. In this case, the word “ensure” is operative: users MUST connect to the English-language site.
  • Additional clones of your production environment, ElastiCache, and CloudFront can all help improve your site performance. Changing your autoscaling policies will not help improve performance times as it is much more likely that the performance issue is with the database back end rather than the front end. The Provisioned IOPS would also not help, as the bottleneck is with the memory, not the storage.
  • There are many features which are native to the KMS service. However, of the above, only import your own keys, disable and re-enable keys and define key management roles in IAM are valid. Importing keys into a custom key store and migrating keys from the default key store to a custom key store are not possible. Lastly operating as a private, native HSM is a function of CloudHSM and is not possible directly within KMS. https://aws.amazon.com/kms/faqs/
  • The essence of a stateless installation is that the scalable components are disposable, and configuration is stored away from the disposable components. The best way to solve this type of problem is by elimination. Storage Gateway offers no advantage in this situation. CloudWatch is a reporting tool and will not help. An ELB will distribute load but is not really specific to stateless design. ElastiCache is well suited to very short, fast-cycle data and is very suitable to replace in-memory or on-disk state data previously held on the web servers. RDS is well suited to structured and long-cycle data, and DynamoDB is well suited for unstructured and medium-cycle data. Both can be used for certain types of stateful data, either in partnership with or instead of ElastiCache.
  • An Elastic Load Balancer can help you deliver stateful services, but not stateless. Elastic MapReduce is a data-crunching service and is not related to servicing web traffic.
  • Consolidated Billing is a feature of AWS Organizations. Once enabled and configured, you will receive a bill containing the costs and charges for all of the AWS accounts within the Organization. Although the individual AWS accounts are combined into a single bill, they can still be tracked individually, and the cost data can be downloaded in a separate file. Using Consolidated Billing may ultimately reduce the amount you pay, as you may qualify for volume discounts. There is no charge for using Consolidated Billing.
  • DynamoDB makes use of parallel processing to achieve predictable performance. You can visualize each partition as an independent DB server of fixed size, each responsible for a defined block of data. In SQL terminology this is called sharding. The documentation is specific about the SSDs but makes no mention of read replicas or EBS-optimized instances. Caching in front of DynamoDB is an option (DAX), but it is not inherent to DynamoDB.
  • Spread placement groups have a specific limitation that you can only have a maximum of 7 running instances per Availability Zone and therefore this is the only correct option. Deploying instances in a single Availability Zone is unique to Cluster Placement Groups only and therefore is not correct. The last two remaining options are common to all placement group types and so are not specific to Spread Placement Groups. Spread Placement Groups are recommended for applications that have a small number of critical instances which need to be kept separate from each other. Launching instances in a Spread Placement Group reduces the risk of simultaneous failures that might occur when instances share the same underlying hardware. Spread Placement Groups provide access to distinct hardware, and are therefore suitable for mixing instance types or launching instances over time. In this case, deploying the EC2 instances in a Spread Placement Group is the only correct option. https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/placement-groups.html
  • If you configure the auto-scaling to maintain 50% per AZ, then if you lose any one AZ, the remaining two will carry the full load between them. This does mean that you carry an extra cost, but if the Board has decided that this level of resiliency is needed, that will be the cost.

AWS Advanced

S3

https://www.udemy.com/aws-certified-solutions-architect-associate/learn/lecture/13886468#overview

  • Attributes for objects: Key, Value, VersionID, Metadata, Sub-resources: ACL, Torrent.
  • All buckets are private when created.
  • Supports MFA delete.
  • Read after Write consistency (POST/ PUT of new objects).
  • Eventual consistency for overwrite PUT and DELETE (can take some time to propagate).
  • Control bucket access using bucket ACL or bucket policy.
  • Buckets can be configured to create access logs.
  • Encryption in transit is via SSL/TLS; with client-side encryption you encrypt the data yourself before sending it to S3.
  • Server side encryption (Encryption-at-Rest) — see the sketch after this list:
    • S3 Managed Keys: SSE-S3 (AWS fully manages)
    • KMS Managed Keys: SSE-KMS (Customer + AWS manage)
    • Server side encryption with customer provided keys: SSE-C (Customer fully manages)
  • A new version of an S3 object does not inherit the previous version’s permissions; the admin needs to grant permission again.
  • When a file in a version-enabled bucket is deleted, a “delete marker” is created; the file is not actually deleted unless you specifically delete the delete marker.
  • Cross region replication requires versioning enabled for both source and destination buckets.
    • Source and destination regions must be different.
    • Cross region replication applies only to objects created after it is enabled; existing objects are not replicated automatically.
    • Deletes are not replicated, and the “delete marker” will not be replicated either.
  • The Transfer Acceleration speed-comparison tool shows how much faster uploads are when S3 Transfer Acceleration is used.
  • Lifecycle policy can be applied with versions as well.
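
The bucket-level settings above (versioning, default server-side encryption) can be applied programmatically. A minimal boto3 sketch, assuming a hypothetical bucket named `my-example-bucket` and an existing KMS key alias `alias/my-key`:

```python
import boto3

s3 = boto3.client("s3")
BUCKET = "my-example-bucket"  # hypothetical bucket name

# Enable versioning (status can later be "Suspended", but never fully off).
s3.put_bucket_versioning(
    Bucket=BUCKET,
    VersioningConfiguration={"Status": "Enabled"},
)

# Set default server-side encryption to SSE-KMS (SSE-S3 would use "AES256").
s3.put_bucket_encryption(
    Bucket=BUCKET,
    ServerSideEncryptionConfiguration={
        "Rules": [
            {
                "ApplyServerSideEncryptionByDefault": {
                    "SSEAlgorithm": "aws:kms",
                    "KMSMasterKeyID": "alias/my-key",  # hypothetical key alias
                }
            }
        ]
    },
)
```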

CloudFront

  • Edge locations are used for both READ and WRITE ops.
  • Objects are cached with TTL.
  • Explicit invalidation of cache is possible, but will be charged.
    • invalidate file or object or whole bucket (/*) is possible.
  • Origin: the source CloudFront fetches files from (EC2, S3, ELB, Route53).
  • Distribution: collection of edge locations.
    • Web Distribution: static (.html, .css, .php) and dynamic (live streaming) content over HTTP/S. Start with S3 or EC2 as the origin; more origins can be added after creation.
    • RTMP Distribution: streams media files using Adobe Flash Media Server’s RTMP protocol (lets playback start before the download completes). S3 is the only supported origin.
  • Signed URLs/cookies: restrict viewer access to private content; a signed URL covers an individual file, while signed cookies cover multiple files. A sketch follows.
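
A minimal sketch of generating a CloudFront signed URL with botocore’s `CloudFrontSigner`; the key-pair ID, private-key path, and distribution URL are hypothetical:

```python
from datetime import datetime, timedelta

import rsa  # third-party package used to sign with the private key
from botocore.signers import CloudFrontSigner

KEY_PAIR_ID = "APKAEXAMPLE"           # hypothetical CloudFront key pair ID
PRIVATE_KEY_FILE = "private_key.pem"  # hypothetical path


def rsa_signer(message: bytes) -> bytes:
    """Sign the CloudFront policy with the private key (SHA-1, per CloudFront)."""
    with open(PRIVATE_KEY_FILE, "rb") as f:
        private_key = rsa.PrivateKey.load_pkcs1(f.read())
    return rsa.sign(message, private_key, "SHA-1")


signer = CloudFrontSigner(KEY_PAIR_ID, rsa_signer)

# URL is valid for one hour; after that, viewers get an access-denied error.
signed_url = signer.generate_presigned_url(
    "https://d111111abcdef8.cloudfront.net/private/video.mp4",  # hypothetical
    date_less_than=datetime.utcnow() + timedelta(hours=1),
)
print(signed_url)
```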

Storage Gateway

  • Supports either virtual or physical gateways to back up data from an on-prem storage solution to S3. Basically does offsite backup to S3.
  • 3 Types:
    1. File Gateway (NFS & SMB)
      • File storage in S3 as objects, accessed via an NFS mount point.
      • Ownership, permissions, and timestamps are stored in S3 metadata.
      • Once stored, objects are managed with native S3 operations (e.g., versioning).
      • Screen Shot 2019-09-18 at 7.28.32 AM.png
    2. Volume Gateway (iSCSI)
      • Asynchronously backs up on-prem block (disk) data as EBS snapshots in S3.
      • Snapshots are incremental and compressed.
        • Stored volume: the entire data set is kept on-prem in the gateway and backed up asynchronously to S3.
        • Screen Shot 2019-09-18 at 7.31.10 AM.png
        • Cached volume: the entire data set lives in S3, and only the most frequently used data is cached on-prem in the gateway.
        • Screen Shot 2019-09-18 at 7.34.15 AM.png
    3.  Tape Gateway
      • To archive data to S3.
      • VTL (Virtual Tape Library) interface helps to backup from virtual tape.
      • Screen Shot 2019-09-18 at 7.36.21 AM.png

EC2

https://www.udemy.com/aws-certified-solutions-architect-associate/learn/lecture/13886556#overview

Screen Shot 2019-09-18 at 8.00.31 AM.png

Screen Shot 2019-09-18 at 8.01.40 AM.png

  • FIGHT-DRMCP-XZ-AU
  • The root volume can’t be encrypted with a DEFAULT AMI, but additional EBS volumes can.
    • You can use a 3rd-party tool such as BitLocker to encrypt the root volume, or create an encrypted AMI in the console or via the API.
  • Termination protection is turned-OFF by default.
  • When AWS terminates a Spot instance, you will not be charged for partial usage. But if you terminate it yourself, you will be charged for any hour in which the instance ran.
  • Any SG rule changes, update reflects immediately.
  • All inbound traffic is blocked by default.
  • All outbound traffic is allowed by default.
  • An SG is stateful: return traffic for an allowed inbound request is automatically allowed out (and vice versa), so you don’t add matching rules yourself. No explicit blocking is allowed.
  • A NACL is stateless: explicit inbound and outbound rules have to be defined, and explicit blocking is allowed.
  • Have any # of EC2 instances within SG.
  • More than one SG groups can be added to EC2.
  • SG only allows allow rules but not deny rules.
  • Removing Autoscaling Group will remove all EC2 instances.
  • EBS volume must be same AZ as EC2.
    • When you modify an EBS volume (size or storage type), you need to run OS-level commands to see the modified size inside EC2; performance may be affected during the change (no instance stop is required).
    • With EBS Snapshots (exist in S3), we can move EBS volume between AZs and region.
      • Create snapshots -> turn to AMI -> launch EC2 in other AZ (based on subnet).
      • Copy AMI from one region (source) to other region (destination) and spin new EC2 from that AMI.
    • Snapshots are point in time copies of volume.
    • Snapshots are incremental (only changed block will be added to S3, not whole volume).
    • On EC2 termination, EBS volumes other than the root volume persist by default.
    • A root volume snapshot can be taken while the EC2 instance runs (performance is affected), or stop the instance first and then take the snapshot.
    • You can create AMI from both volume and snapshot.
  • Screen Shot 2019-09-18 at 8.21.24 AM.png
  • AMI can be selected based on
    • Region
    • OS
    • Architecture (32-bit or 64-bit)
    • Launch permission
    • Root device type
      • Instance Store (EPHEMERAL storage; the volume is created from a template stored in S3)
        • It cannot be stopped (if the hypervisor fails, you lose the data).
      • EBS-backed volumes (the volume is created from an EBS snapshot)
        • It can be stopped.
      • Reboot either type and the data persists.
      • By default, both ROOT volumes are deleted on termination, but they can be kept by setting the option to persist data.
    • Encrypted Root Device Volumes:
      • Earlier, you could not provision an EC2 instance with an encrypted root volume directly.
        • Process involved to achieve that (see the sketch at the end of this EC2 section):
          • provision EC2 with unencrypted root volume
          • take snapshot
          • create a copy of that snapshot with encrypt option enabled
          • create AMI from encrypted snapshots
          • launch EC2 from that AMI, we will get Encrypted Root Device Volume.
      • Now, can provision EC2 with encrypted root volume.
      • Encrypted Root Device Volume cannot be changed to unencrypted.
      • Sharing of snapshots is possible when it’s unencrypted.
  • EC2 Instance Metadata
  • Placement Groups: Grouping of EC2 instances.
    • PG name must be unique within AWS acct.
    • Only certain instance types are supported (Compute, Memory, Storage Optimized, GPU).
    • AWS recommends homogeneous instance types within a Clustered PG.
    • Can’t merge PGs.
    • Can’t move an existing instance into a PG. Create an AMI from the existing instance, then launch a new instance from that AMI into a PG.
    • 3 types:
      • Clustered PG: within a single AZ. Good for low network latency and high throughput. Only certain instance types can be launched in this group. Cannot span multiple AZs.
      • Spread PG: instances are placed on distinct underlying hardware (separate racks within AZs, to avoid losing instances/services when hardware fails). Good for a small number of critical instances that should be kept separate from each other. Can span multiple AZs.
      • Partitioned PG: AWS divides each group into logical segments called “partitions”. Good for HDFS, HBase, and Cassandra apps. Each partition within the PG has its own set of racks, and each rack has its own network and power supply. No two partitions within a PG share the same racks, allowing hardware failures to be isolated within the application. Can span multiple AZs.
      • Screen Shot 2019-09-18 at 11.45.54 AM.png
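
A minimal boto3 sketch of the older encrypted-root-volume workflow described above (snapshot → encrypted copy → AMI → launch); the snapshot ID, AMI name, and instance type are hypothetical:

```python
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")

# 1. Copy an unencrypted root-volume snapshot, enabling encryption on the copy.
copy = ec2.copy_snapshot(
    SourceRegion="us-east-1",
    SourceSnapshotId="snap-0123456789abcdef0",  # hypothetical snapshot ID
    Encrypted=True,
    KmsKeyId="alias/aws/ebs",  # default EBS key; a CMK could be used instead
    Description="Encrypted copy of root volume snapshot",
)
encrypted_snapshot_id = copy["SnapshotId"]
ec2.get_waiter("snapshot_completed").wait(SnapshotIds=[encrypted_snapshot_id])

# 2. Register an AMI whose root device maps to the encrypted snapshot.
ami = ec2.register_image(
    Name="encrypted-root-ami",  # hypothetical AMI name
    Architecture="x86_64",
    RootDeviceName="/dev/xvda",
    BlockDeviceMappings=[
        {"DeviceName": "/dev/xvda", "Ebs": {"SnapshotId": encrypted_snapshot_id}},
    ],
    VirtualizationType="hvm",
)

# 3. Instances launched from this AMI get an encrypted root device volume.
ec2.run_instances(
    ImageId=ami["ImageId"], InstanceType="t3.micro", MinCount=1, MaxCount=1
)
```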

CloudWatch

  • Monitors the performance of AWS services and of customer services running on AWS.
  • Standard monitoring collects metrics every 5 minutes; 1-minute detailed monitoring can be enabled (additional charges).
  • CloudWatch Alarms can trigger notifications (see the sketch after this list).
    • Compute
      • Services:
        • EC2
        • Autoscaling groups
        • ELB
        • Route53 Health Checks
      • Metrics:
        • CPU
        • N/W
        • Disk
        • Status Check (hypervisor and underlying infra).
        • Cost
    • Storage and Content Delivery
      • Services:
        • EBS volumes
        • Storage Gateways
        • CloudFront
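
As referenced in the alarm note above, a minimal boto3 sketch that alarms on high EC2 CPU and notifies an SNS topic; the alarm name, instance ID, and topic ARN are hypothetical:

```python
import boto3

cloudwatch = boto3.client("cloudwatch")

# Alarm when average CPU over one 5-minute period exceeds 70%.
cloudwatch.put_metric_alarm(
    AlarmName="high-cpu",  # hypothetical alarm name
    Namespace="AWS/EC2",
    MetricName="CPUUtilization",
    Dimensions=[{"Name": "InstanceId", "Value": "i-0123456789abcdef0"}],  # hypothetical
    Statistic="Average",
    Period=300,
    EvaluationPeriods=1,
    Threshold=70.0,
    ComparisonOperator="GreaterThanThreshold",
    AlarmActions=["arn:aws:sns:us-east-1:123456789012:ops-alerts"],  # hypothetical topic
)
```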

CLI

  • The access key and secret of a user are generated and viewable only once, for programmatic access via the CLI. If you lose them, make the current access key inactive and create a new one.
  • There is a hidden dir called “.aws” (created after running the aws configure cmd) which stores the user’s access key and secret. Storing long-lived keys this way is a bad practice for accessing AWS services.
    • Instead, define an IAM role, attach it to the EC2 instance, and detach it whenever you need (see the sketch below).
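
A short sketch of the preferred pattern: create clients without hard-coding keys, so boto3 falls back through the credential chain (environment variables, `~/.aws`, and finally the EC2 instance role):

```python
import boto3

# No access key or secret is passed here. When this runs on an EC2 instance
# with an IAM role attached, boto3 fetches temporary credentials from the
# instance metadata service automatically, so nothing sits in ~/.aws.
s3 = boto3.client("s3")

for bucket in s3.list_buckets()["Buckets"]:
    print(bucket["Name"])
```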

Screen Shot 2019-09-18 at 10.55.25 AM.png

EFS

  • An EBS volume cannot be shared with more than one EC2 instance, but EFS can be shared (since it is a common file system), and it is easier to scale elastically than EBS (which takes a performance hit when resized).
  • Works on NFS4 protocol.
  • Pay for storage and scale petabytes.
  • Support 1000s of concurrent connections.
  • Span within region in multiple AZs.
  • Read after write consistency.
  • Lifecycle Management is supported thru different EFS class (EFS-IA), similar to S3.
  • Mount the EFS file system on each required instance and share it; changes are visible on all the EC2 instances.

Databases

https://www.udemy.com/aws-certified-solutions-architect-associate/learn/lecture/13886690#overview

  • RDS has 2 key features:
    • Multi-AZ:
      • Primary and secondary DBs in each AZ with DNS redirection to secondary whenever primary fails (fail-over is automatic).
      • For disaster recovery purpose and does synchronous replication.
      • Supports SQL server, Oracle, MySQL, PostgreSQL, MariaDB.
      • Force fail-over by rebooting RDS instance.
    • Read Replica (see the sketch at the end of this Databases section):
      • Primary and read replica DBs in each AZ.
      • For performance purpose and does asynchronous replication.
      • Primarily for READ ops.
      • Supports Oracle, MySQL, PostgreSQL, MariaDB, Aurora.
      • Must have automatic backup turned-on to deploy read replica.
      • Can have 5 RR copies of any DB.
      • Can have RR of RR (but latency will be issue).
      • Each RR has own DNS endpoint.
      • Can have RR with multi-AZ and multi-AZ source DB.
      • RR can be promoted to own DB with replication breakage.
      • Can have RR in 2nd region (from source DB in 1st region).
  • RDS runs on VM (non-serverless) which AWS manages.
  • Rebooting with the failover option forces RDS to come up on the standby after the reboot.
  • Aurora Serverless is serverless.
  • Whenever delete RDS instance, there is an option to take snapshot or not.
  • Two types of backup:
    • Automated backup:
      • Allows recovering the DB to any point in time within the “retention period” (1 to 35 days).
      • Takes a full daily snapshot and stores transaction logs throughout the day.
      • Point-in-time recovery: AWS first restores the most recent daily backup and then applies the transaction logs for that day.
      • Enabled by default; backups are stored in S3 (free storage equal to the size of the DB).
      • Backups run during a defined window; I/O may be suspended during that window, so applications may see added latency.
      • Automated backups are deleted when the original RDS instance is deleted.
    • Database snapshots: 
      • Done manually; they persist even after the original RDS instance is deleted, unlike automated backups.
  • When restore backup, it will be a new instance and new DNS endpoint.
  • Encryption at rest is supported with KMS service.
    • All automated backups, read-replica and snapshots are encrypted as well.
    • Supports SQL server, Oracle, MySQL, PostgreSQL, MariaDB, Aurora.
  • DynamoDB
    • Serverless; Stored on SSD
    • Spread across 3 geographically distinct data centers.
    • Eventually Consistent Reads (default)
      • Consistency across all copies of data is usually reached within a second after a write (best read performance).
    • Strongly Consistent Reads
      • Return results that reflect all writes that received a successful response prior to the read (at the cost of higher read latency and lower read throughput).
  • Redshift
    • Configuration:
      • Single Node (160gb)
      • Multi Node
        • Leader (manages client connections and receives queries)
        • Compute (store data, perform queries and computations; up to 128 nodes).
    • Advanced Compression
      • Columnar data can be compressed much more than row data because similar data stored sequentially on disk.
    • When compare to traditional DB:
      • Uses multiple compression techniques.
      • No index or views usage, so less space.
      • When loading data into an empty table, Redshift automatically samples the data and selects an appropriate compression scheme.
    • Massively Parallel Processing (MPP) can be achieved through # of nodes that can do fast query processing.
    • Backups:
      • Enabled by default with a 1-day retention period; the maximum period is 35 days.
      • Maintains 3 copies of data (original, replica on compute nodes and backup in S3).
      • Asynchronous snapshots replication to S3 in another region for DR.
    • Charged for compute node hours, not leader node hours; also charged for backups and for data transfer (within a VPC, not outside).
    • Encryption-at Rest supported with AES-256 with KMS or own key from CloudHSM.
    • Single-AZ only (no multi-AZ support).
    • Snapshots can be restored to a new AZ in the event of an outage.
  • Aurora
    • MySQL-compatible RDBMS, 5 times faster than MySQL.
    • Starts with 10GB and scales in increments up to 64TB (storage autoscaling).
    • Compute resources can scale up to 32 vCPUs and 244GB of memory.
    • 2 copies of data is contained in each AZ with min of 3 AZs, so 6 copies of data.
    • Creates cluster of reader and writer nodes.
    • Can lose up to 2 copies of data without affecting DB write availability, and up to 3 copies without affecting read availability.
    • Storage is self-healing, data blocks and disks are continuously scanned for errors and repaired automatically.
    • 2 types of Aurora Replicas:
      • Aurora Replicas (currently 15; failover supported)
      • MySQL RR (currently 5)
      • Screen Shot 2019-09-18 at 10.36.01 PM
    • Automated backups are always enabled and doesn’t impact DB performance.
    • Taking snapshots doesn’t impact DB performance.
    • Share snapshots with other AWS accounts.
  • Elasticache

Screen Shot 2019-09-18 at 10.42.25 PM.png
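
As a concrete example of the Read Replica notes above, a minimal boto3 sketch creating a replica of an existing source instance; the instance identifiers and class are hypothetical, and automated backups must already be enabled on the source:

```python
import boto3

rds = boto3.client("rds")

# Create an asynchronous read replica of an existing RDS instance.
rds.create_db_instance_read_replica(
    DBInstanceIdentifier="mydb-replica-1",      # hypothetical replica name
    SourceDBInstanceIdentifier="mydb-primary",  # hypothetical source instance
    DBInstanceClass="db.t3.micro",
)

# Each replica gets its own DNS endpoint once available.
rds.get_waiter("db_instance_available").wait(DBInstanceIdentifier="mydb-replica-1")
desc = rds.describe_db_instances(DBInstanceIdentifier="mydb-replica-1")
print(desc["DBInstances"][0]["Endpoint"]["Address"])
```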

Route53

https://www.udemy.com/aws-certified-solutions-architect-associate/learn/lecture/13886776#overview

  • ELBs don’t have pre-defined IPV4 address; need to resolve them using DNS name.
  • The Start of Authority (SOA) record in DNS contains administrative info such as the server that supplied the data for the zone, the administrator’s name, the current version, and the default TTL (in seconds) for resource records.
  • Name Server (NS) records are used by Top Level Domain (TLD) servers to direct traffic to the content DNS server which contains the authoritative DNS records.

Screen Shot 2019-09-19 at 6.27.49 AM.png

  • “A” (Address) Record is a fundamental type of DNS record that maps the domain to IP address.
  • Default TTL is 48 hrs. A lower TTL means DNS record changes propagate to clients more quickly.
  • CNAME (Canonical Name) resolves one domain name to another. Eg: http://m.cloudguru.com and mobile.cloudguru.com resolves to same IP address.
  • Alias Records map resources record sets in hosted zone to ELB, CF distribution, S3 bucket that are configured as websites.
    • Work like CNAME: can map one DNS name (www.example.com) to another “target” DNS name (elb123.elb.amazonaws.com).
    • Key difference is CNAME can’t be used for naked domain name (zone apex record, i.e., no “www” in the domain name). Can’t have CNAME for http://acloudguru.com, it must be either an A record or an Alias.
  • Common DNS types:
    • SOA Records
    • NS Records
    • A Records
    • CNAMES
    • MX Records
    • PTR Records
  • Domain name registration can take up to 3 days to complete, though it is often faster.
  • Health Checks can be set to individual record sets (Sydney, Ohio, Ireland EC2).
    • If a record set fails its health check, it is removed from Route53 until it passes again (the check hits the app’s endpoint, e.g., /index.html).
    • You can also create an alarm that sends SNS notifications on failure.
  • Routing Policy
    • Simple Routing
    •  Weighted Routing
    • Latency-based Routing
    • Failover Routing
    • Geolocation Routing
    • Geo-proximity Routing
    • Multi-value Answer
  • Simple Routing

Screen Shot 2019-09-19 at 6.52.00 AM.png

Screen Shot 2019-09-19 at 6.54.33 AM

  • Weighted Routing

Screen Shot 2019-09-19 at 6.58.23 AM.png

Screen Shot 2019-09-19 at 7.06.29 AM.png

  • Latency-based Routing
    • Routes traffic based on the lowest network latency for the end user (i.e., which region responds fastest).
    • You need to create a latency resource record set for the EC2 or ELB resource in each region (Sydney, Ohio, Ireland) where the resource is located; see the sketch below.
    • When Route53 receives a query for the resource, it selects the latency record set with the lowest latency and responds with the IP address from that record set.
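
A minimal boto3 sketch of creating one latency-based record set (one per region would be needed, each with its own SetIdentifier); the hosted zone ID, domain, and IP are hypothetical:

```python
import boto3

route53 = boto3.client("route53")

# One latency record for the us-east-1 resource; repeat with a different
# SetIdentifier and Region for each additional region (e.g. ap-southeast-2).
route53.change_resource_record_sets(
    HostedZoneId="Z1EXAMPLE",  # hypothetical hosted zone
    ChangeBatch={
        "Changes": [
            {
                "Action": "UPSERT",
                "ResourceRecordSet": {
                    "Name": "www.example.com",
                    "Type": "A",
                    "SetIdentifier": "us-east-1",
                    "Region": "us-east-1",
                    "TTL": 60,
                    "ResourceRecords": [{"Value": "203.0.113.10"}],  # hypothetical IP
                },
            }
        ]
    },
)
```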

Screen Shot 2019-09-19 at 7.12.41 AM.png

Screen Shot 2019-09-19 at 7.13.45 AM.png

  • Failover Routing
    • Used in Active (primary) and Passive (secondary) setup.
    • Route53 monitor health of primary site using health check.

Screen Shot 2019-09-19 at 7.17.34 AM.png

Screen Shot 2019-09-19 at 7.18.21 AM.png

Screen Shot 2019-09-19 at 7.18.49 AM.png

  • Geo-location Routing

Screen Shot 2019-09-19 at 11.06.33 AM.png

Screen Shot 2019-09-19 at 11.04.35 AM.png

  • Geo-Proximity (Traffic-Flow only) Routing
    • Route53 routes traffic to resources based on the geographic location of users and resources.
    • Optionally, route more or less traffic to a given resource by specifying a value known as a “bias”.
    • A bias expands or shrinks the size of the geographic region from which traffic is routed to a resource.
    • To use geo-proximity, must use Route53 Traffic-Flow.

Screen Shot 2019-09-19 at 11.40.12 AM.png

  • Multi-Value Answer Routing
    • Simple Routing Policy + Health check on each record set.

Screen Shot 2019-09-19 at 11.42.51 AM.png

VPC

https://www.udemy.com/aws-certified-solutions-architect-associate/learn/lecture/13886986#overview

  • 1 subnet = 1 AZ (1 subnet cannot span across multiple AZ).
  • Default VPC
    • All subnet in default VPC have a route to internet.
    • Each EC2 has both public and private IP addresses.

Screen Shot 2019-09-20 at 8.24.32 AM.png

  • VPC peering
    • Using private IP address connect one VPC to another.
    • Instances behave as if they were on the same private network.
    • Peering can be done with VPCs in the same account and with other AWS accounts as well.
    • Peering is a STAR configuration and is NOT transitive.
  • When create a new VPC, following will be created OOB.

Screen Shot 2019-09-20 at 8.26.32 AM.png

  • AZ will be randomized for each AWS acct.
  • The first four IP addresses and the last IP address in each subnet CIDR block are not available for you to use, and cannot be assigned to an instance. For example, in a subnet with CIDR block 10.0.0.0/24, the following five IP addresses are reserved:
    • 10.0.0.0: Network address.
    • 10.0.0.1: Reserved by AWS for the VPC router.
    • 10.0.0.2: Reserved by AWS. The IP address of the DNS server is always the base of the VPC network range plus two; however, we also reserve the base of each subnet range plus two. For VPCs with multiple CIDR blocks, the IP address of the DNS server is located in the primary CIDR. For more information, see Amazon DNS Server.
    • 10.0.0.3: Reserved by AWS for future use.
    • 10.0.0.255: Network broadcast address. We do not support broadcast in a VPC, therefore we reserve this address.

    Screen Shot 2019-09-20 at 8.32.02 AM.png

  • After adding Subnets:

Screen Shot 2019-09-20 at 8.34.52 AM.png

  • Only one IGW can be attached to a VPC.
  • Route-Tables
    • Subnets can communicate with each other via the route table’s local route for the VPC CIDR (e.g., 10.0.0.0/16).

Screen Shot 2019-09-20 at 8.37.45 AM.png

  • The default route table is the “main” route table. If an IGW is attached to the VPC, any newly created subnet will be public, because it associates with the main route table by default. This is a security concern. To avoid that, have:
    • Main Route-Table as private.
      • Screen Shot 2019-09-20 at 9.10.24 AM.png
    • Separate Route-Table as public and allow any traffic to flow through.
      • Screen Shot 2019-09-20 at 9.04.44 AM.png
      • Screen Shot 2019-09-20 at 9.08.09 AM.png
  • The auto-assign public IP address option during EC2 creation is enabled for public subnets and not for private ones.

Screen Shot 2019-09-20 at 7.18.48 AM.png

  • SSH tunneling can be done using SSH to access a resource without a public IP address via a resource with a public IP address (inside of a VPC).
  • https://www.udemy.com/aws-certified-solutions-architect-associate/learn/lecture/13886952#overview
  • NAT-Instances
    • Individual EC2 instances at public subnet.
    • Disable Source/ Destination checks on the instance.
    • There must be a route out of the private subnet to the NAT instance for this to work.
    • NAT instances can be made resilient by running them in multiple subnets in different AZs with a script to automate failover.
    • NAT instances are behind security group.
  • NAT-Gateway
    • Highly available, managed, redundant gateway inside an AZ (backed by multiple instances).
    • Starts at 5Gbps and scales to 45Gbps.
    • No patching required.
    • Not associated with an SG.
    • Automatically assigned a public IP address.
    • Route tables need updating, but there is no need to disable Source/Destination checks.
    • Multiple NAT-GWs can be used in different AZs to avoid a SPOF (see the sketch below).
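
A minimal boto3 sketch: allocate an Elastic IP, create a NAT gateway in a public subnet, and point the private route table’s default route at it; the subnet and route-table IDs are hypothetical:

```python
import boto3

ec2 = boto3.client("ec2")

# NAT gateways need an Elastic IP.
eip = ec2.allocate_address(Domain="vpc")

# Create the NAT gateway in a *public* subnet.
natgw = ec2.create_nat_gateway(
    SubnetId="subnet-0aaa1111bbb22222c",  # hypothetical public subnet
    AllocationId=eip["AllocationId"],
)
natgw_id = natgw["NatGateway"]["NatGatewayId"]
ec2.get_waiter("nat_gateway_available").wait(NatGatewayIds=[natgw_id])

# Default route for the private subnet's route table goes via the NAT gateway.
ec2.create_route(
    RouteTableId="rtb-0ddd3333eee44444f",  # hypothetical private route table
    DestinationCidrBlock="0.0.0.0/0",
    NatGatewayId=natgw_id,
)
```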

Screen Shot 2019-09-20 at 1.54.19 PM.png

  • NACL:
    • When create VPC, a default NACL created by default (default NACL).
    • When create subnet, default NACL will be associated.
    • A subnet can have 1 NACL, but 1 NACL can be associated with multiple subnets.
    • To deny traffic, put the deny rule first (rules are evaluated from lowest to highest number).
    • The default NACL created with the VPC has inbound and outbound rules that allow all traffic.
    • A newly created custom NACL denies everything by default (both inbound and outbound rule *).
      • Stateless: inbound and outbound rules must be defined explicitly (unlike an SG).
      • Ephemeral ports (1024-65535): short-lived transport ports used for return traffic.
    • NACLs act first, then SGs.
    • Block IP addresses with a NACL, not an SG (SGs have no deny rules).
  • ELB:
    • Need at least 2 public subnets to associate ELB.
  • VPC Flow Logs:
    • A feature that enables you to capture information about the IP traffic going to and from network interfaces in your VPC.
    • Flow log data can be published to CloudWatch Logs and Amazon S3. After you’ve created a flow log, you can retrieve and view its data in the chosen destination.
    • Cannot tag a flow log.
    • Created at 3 levels:
      • VPC
      • Subnet
      • N/W interface
    • Filter traffic based on Accept, Reject or All.
    • Destination: CloudWatch Logs or S3 (see the sketch after this list).
      • Destination Log Group: for CloudWatch Logs, create a log group in CloudWatch.
    • IAM Role: create a new IAM role with the default policy document.
    • VPC peering flow logs only work when the VPCs are in the same account.
    • Once configured, you cannot change the settings (e.g., no changing the IAM role).
    • Not all IP traffic is monitored; exclusions include:
      • Traffic generated by instances when they contact the Amazon Route53 DNS server (if you use your own DNS server, that traffic is monitored).
      • Traffic generated by a Windows instance for Amazon Windows license activation.
      • Traffic to and from 169.254.169.254 (instance metadata).
      • DHCP traffic.
      • Traffic to the reserved IP address of the default VPC router.
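
A minimal boto3 sketch of creating a VPC-level flow log that publishes to CloudWatch Logs; the VPC ID, log group, and delivery role ARN are hypothetical:

```python
import boto3

ec2 = boto3.client("ec2")

ec2.create_flow_logs(
    ResourceIds=["vpc-0123456789abcdef0"],  # hypothetical VPC
    ResourceType="VPC",                     # could also be Subnet or NetworkInterface
    TrafficType="ALL",                      # ACCEPT, REJECT, or ALL
    LogDestinationType="cloud-watch-logs",
    LogGroupName="my-vpc-flow-logs",        # hypothetical log group
    DeliverLogsPermissionArn="arn:aws:iam::123456789012:role/flow-logs-role",  # hypothetical
)
```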

Screen Shot 2019-09-20 at 6.59.13 PM.png

  • Bastion Host:
    • A special computer on a network, specifically designed and configured to withstand attacks and to safeguard the other networks behind it.
    • The host runs a single special-purpose application (e.g., a proxy server only).
    • It is a hardened host because of its location and purpose: it sits in the DMZ or outside the firewall and has access to untrusted networks.
    • A NAT Gateway cannot be used as a Bastion host.

Screen Shot 2019-09-20 at 8.11.52 PM.png

  • Direct Connect:
    • A cloud service solution that makes it easy to establish a dedicated network connection from on-premises to AWS.
    • Reduces network costs, increases bandwidth throughput, and provides a more consistent network experience than internet-based connections.
    • E.g., if a VPN connection keeps dropping, use DX instead.

Screen Shot 2019-09-20 at 8.16.30 PM.png

  • VPC Endpoints:
    • Enable you to privately connect your VPC to supported AWS services and to VPC endpoint services powered by PrivateLink, without requiring an internet gateway, NAT device, VPN connection, or AWS Direct Connect connection.
    • Instances in your VPC do not require public IP addresses to communicate with resources in the service.
    • Traffic between your VPC and the other service does not leave the Amazon network.
    • Endpoints are virtual devices.
    • They are horizontally scaled, redundant, and highly available VPC components that allow communication between instances in VPC and services without imposing availability risks or bandwidth constraints on network traffic.
    • Two types:
      • Interface EP
      • Screen Shot 2019-09-20 at 8.21.03 PM.png
      • Gateway EP (see the sketch after these screenshots)
        • Similar to a NAT Gateway in that it is a route target.
        • Supports S3 and DynamoDB.
        • Screen Shot 2019-09-20 at 9.33.17 PM.png
        • Screen Shot 2019-09-20 at 9.33.52 PM.png
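
A minimal boto3 sketch of creating an S3 gateway endpoint and attaching it to a route table; the VPC and route-table IDs are hypothetical, and the region is assumed to be us-east-1:

```python
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")

# Gateway endpoints add a route for the service's prefix list, so instances
# reach S3 without an IGW, NAT device, or public IP.
ec2.create_vpc_endpoint(
    VpcEndpointType="Gateway",
    VpcId="vpc-0123456789abcdef0",          # hypothetical VPC
    ServiceName="com.amazonaws.us-east-1.s3",
    RouteTableIds=["rtb-0ddd3333eee44444f"],  # hypothetical private route table
)
```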

ELB

https://www.udemy.com/aws-certified-solutions-architect-associate/learn/lecture/13888022#overview

  • Application and Classic LBs come with a DNS name, not a pre-defined IPv4 address; a Network LB can be assigned a static IP address.
  • Instances are monitored using health-check by ELB (states: InService or OutofService).
  • Classic LB:
    • An HTTP 504 Gateway Time-out error means the web server or DB layer has an issue and the request did not complete within the idle timeout period.
    • Supports the X-Forwarded-For header (forwards the user’s IP address) and sticky sessions.

Screen Shot 2019-09-21 at 9.43.24 AM.png

  • A Target Group is a set of targets (types: instances, IP ranges, Lambda functions); a target is selected based on the RULES you set, and the request is routed accordingly. A sketch follows.
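
A minimal boto3 sketch: create a target group, register an instance, and add a path-based rule on an existing ALB listener; all names, IDs, and ARNs are hypothetical:

```python
import boto3

elbv2 = boto3.client("elbv2")

# Target group of EC2 instances, health-checked over HTTP.
tg = elbv2.create_target_group(
    Name="web-tg",                      # hypothetical name
    Protocol="HTTP",
    Port=80,
    VpcId="vpc-0123456789abcdef0",      # hypothetical VPC
    TargetType="instance",
    HealthCheckPath="/index.html",
)
tg_arn = tg["TargetGroups"][0]["TargetGroupArn"]

elbv2.register_targets(
    TargetGroupArn=tg_arn,
    Targets=[{"Id": "i-0123456789abcdef0"}],  # hypothetical instance
)

# Route /images/* requests on an existing listener to this target group.
elbv2.create_rule(
    ListenerArn=(
        "arn:aws:elasticloadbalancing:us-east-1:123456789012:"
        "listener/app/my-alb/50dc6c495c0c9188/f2f7dc8efc522ab2"  # hypothetical
    ),
    Priority=10,
    Conditions=[{"Field": "path-pattern", "Values": ["/images/*"]}],
    Actions=[{"Type": "forward", "TargetGroupArn": tg_arn}],
)
```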

Screen Shot 2019-09-21 at 9.59.33 AM.png

Screen Shot 2019-09-21 at 10.01.21 AM.png

Screen Shot 2019-09-21 at 10.04.32 AM.pngScreen Shot 2019-09-21 at 10.04.45 AM.png

Screen Shot 2019-09-21 at 10.05.19 AM.png

Screen Shot 2019-09-21 at 10.06.48 AM.png

Screen Shot 2019-09-21 at 10.07.38 AM.png

Screen Shot 2019-09-21 at 10.08.31 AM.png

  • HA Architecture:

Screen Shot 2019-09-21 at 10.30.15 AM.png

Screen Shot 2019-09-21 at 10.30.26 AM.png

Screen Shot 2019-09-22 at 9.28.06 PM.png

Screen Shot 2019-09-22 at 9.28.48 PM.png

SQS

Screen Shot 2019-09-22 at 10.36.39 PM.png

Screen Shot 2019-09-22 at 10.37.38 PM.png

Screen Shot 2019-09-22 at 10.38.23 PM.png

SWF

Screen Shot 2019-09-22 at 10.41.00 PM.png

Screen Shot 2019-09-22 at 10.41.34 PM.png

SNS

Screen Shot 2019-09-22 at 10.42.40 PM.png

Screen Shot 2019-09-22 at 10.44.21 PM.png

API Gateway

Screen Shot 2019-09-22 at 10.47.16 PM.png

Screen Shot 2019-09-22 at 10.47.57 PM.png

Screen Shot 2019-09-22 at 10.49.06 PM.png

Screen Shot 2019-09-22 at 10.50.01 PM.png

Screen Shot 2019-09-22 at 10.50.16 PM.png

Screen Shot 2019-09-22 at 10.50.59 PM.png

Screen Shot 2019-09-22 at 10.51.40 PM.png

Screen Shot 2019-09-22 at 10.52.07 PM.png

Screen Shot 2019-09-22 at 10.52.47 PM.png

Kinesis

Screen Shot 2019-09-22 at 10.54.41 PM.png

Screen Shot 2019-09-22 at 10.55.05 PM.png

Screen Shot 2019-09-22 at 10.55.53 PM.png

Screen Shot 2019-09-22 at 10.56.58 PM.png

Screen Shot 2019-09-22 at 10.57.53 PM.png

Screen Shot 2019-09-22 at 10.58.53 PM.png

Screen Shot 2019-09-22 at 10.59.45 PM.png

Screen Shot 2019-09-22 at 11.00.22 PM.png

Cognito

Screen Shot 2019-09-22 at 11.01.31 PM.png

Screen Shot 2019-09-22 at 11.02.06 PM.png

Screen Shot 2019-09-22 at 11.02.46 PM.png

Screen Shot 2019-09-22 at 11.03.34 PM.png

Screen Shot 2019-09-22 at 11.04.17 PM.png

Screen Shot 2019-09-22 at 11.04.42 PM.png

Screen Shot 2019-09-22 at 11.05.02 PM.png

https://www.udemy.com/aws-certified-solutions-architect-associate/learn/lecture/13888022#overview

Lambda

Screen Shot 2019-09-23 at 6.48.13 AM.png

Screen Shot 2019-09-23 at 6.49.04 AM.png

Screen Shot 2019-09-23 at 6.49.48 AM.png

Screen Shot 2019-09-23 at 6.50.25 AM.png

Screen Shot 2019-09-23 at 6.51.20 AM.png

Screen Shot 2019-09-23 at 6.51.49 AM.png

Screen Shot 2019-09-23 at 6.52.47 AM.png

Screen Shot 2019-09-23 at 6.53.16 AM.png

  • Services That Invoke Lambda Functions Synchronously
  1. Elastic Load Balancing (Application Load Balancer)
  2. Cognito
  3. Lex
  4. Alexa
  5. API Gateway
  6. CloudFront (Lambda@Edge)
  7. Kinesis Data Firehose
  • Services That Invoke Lambda Functions Asynchronously (a sketch of both invocation types follows the list)
  1. Simple Storage Service
  2. Simple Notification Service
  3. Simple Email Service
  4. CloudFormation
  5. CloudWatch Logs
  6. CloudWatch Events
  7. CodeCommit
  8. Config
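
Callers choose between these two modes with the `InvocationType` parameter. A minimal boto3 sketch invoking a hypothetical function both ways:

```python
import json

import boto3

lam = boto3.client("lambda")
payload = json.dumps({"order_id": 42}).encode()

# Synchronous: the caller blocks and gets the function's response back.
sync = lam.invoke(
    FunctionName="my-function",  # hypothetical function name
    InvocationType="RequestResponse",
    Payload=payload,
)
print(json.load(sync["Payload"]))

# Asynchronous: Lambda queues the event and returns HTTP 202 immediately.
lam.invoke(
    FunctionName="my-function",
    InvocationType="Event",
    Payload=payload,
)
```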

Screen Shot 2019-09-23 at 7.02.29 AM.png

Screen Shot 2019-09-23 at 7.02.06 AM.png

https://www.udemy.com/aws-certified-solutions-architect-associate/learn/lecture/13888044#overview

AWS Essentials

  • High Availability Tools
    • ELB
    • Elastic IP address (through IP masking)
    • Route 53
    • CloudFront
    • Auto Scaling
    • CloudWatch
  • Fault-Tolerant Tools
    • SQS
    • S3
    • RDS (read replica)
  • AWS Managed Services (AMS):
    • S3
    • RDS
    • Redshift
    • DynamoDB
    • CloudFront
    • ELB
    • Lambda
    • Elastic File System
    • Elastic Transcoder
    • SES
    • WorkSpaces
    • CloudSearch
    • Elastic MapReduce?
  • Serverless Services:
    • Lambda
    • DynamoDB
    • ECS (through Fargate)
  • Server-based Services:
    • EC2
    • RDS
    • Redshift
    • EMR
  • EBS for EC2 and RDS.
  • EFS for big data and varied use cases.
  • Avoiding a single point of failure (SPOF) is achieved through ELB, Auto Scaling, Route53, EC2 auto-recovery, OpsWorks (configuration management service; Chef/Puppet), and Elastic Beanstalk.
  • Customers should be aware that their responsibilities vary depending on the AWS services chosen.
    • With EC2, you are responsible for applying OS and security patches.
    • With RDS, AWS is responsible for the same.
  • The Support Concierge Team (via the Enterprise Support plan) consists of AWS billing and account experts.
    • 24x7 access to AWS billing and account inquiries.
    • Guidance and best practices for billing allocation, reporting, consolidation of accounts, and root-level account security.
    • Access to Enterprise account specialists for payment inquiries, training on specific cost reporting, assistance with service limits, and bulk purchases.
  • AWS Abuse team can assist you when AWS resources are being used to engage in the following types of abusive behavior:
    • I. Spam: You are receiving unwanted emails from an AWS-owned IP address, or AWS resources are being used to spam websites or forums.
    • II. Port scanning: Your logs show that one or more AWS-owned IP addresses are sending packets to multiple ports on your server, and you believe this is an attempt to discover unsecured ports.
    • III. Denial of service attacks (DOS): Your logs show that one or more AWS-owned IP addresses are being used to flood ports on your resources with packets, and you believe this is an attempt to overwhelm or crash your server or software running on your server.
    • IV. Intrusion attempts: Your logs show that one or more AWS-owned IP addresses are being used to attempt to log in to your resources.
    • V. Hosting objectionable or copyrighted content: You have evidence that AWS resources are being used to host or distribute illegal content or distribute copyrighted content without the consent of the copyright holder.
    • VI. Distributing malware: You have evidence that AWS resources are being used to distribute software that was knowingly created to compromise or cause harm to computers or machines on which it is installed.
  • AWS Security team is responsible for the security of services offered by AWS.
  • AWS Customer Service team is at the forefront of this transformational technology assisting a global list of customers that are taking advantage of a growing set of services and features to run their mission-critical applications. The team helps AWS customers understand what Cloud Computing is all about, and whether it can be useful for their business needs.
  • AWS Infrastructure Event Management is a short-term engagement with AWS Support, included in the Enterprise-level Support product offering, and available for additional purchase for Business-level Support subscribers. AWS Infrastructure Event Management partners with your technical and project resources to gain a deep understanding of your use case and provide architectural and scaling guidance for an event. Common use-case examples for AWS Event Management include advertising launches, new product launches, and infrastructure migrations to AWS. Helps architectural and scaling guidance.
  • AWS Personal Health Dashboard provides alerts and remediation guidance when AWS is experiencing events that may impact you. While the Service Health Dashboard displays the general status of AWS services, Personal Health Dashboard gives you a personalized view into the performance and availability of the AWS services underlying your AWS resources. The benefits of the AWS personal health dashboard include:
    • Personalized View of Service Health
    • Proactive Notifications
    • Detailed Troubleshooting Guidance
  • Shared Controls: controls that both AWS and customers are responsible for, each on their own layer:
    • Patch Management – AWS is responsible for patching and fixing flaws within the infrastructure, but customers are responsible for patching their guest OS and applications.
    • Configuration Management – AWS maintains the configuration of its infrastructure devices, but a customer is responsible for configuring their own guest operating systems, databases, and applications.
    • Awareness & Training – AWS trains AWS employees, but a customer must train their own employees.
  • Inherited Controls: Customer fully inherits physical controls and environmental controls from AWS.
  •  AWS Artifact is a self-service audit artifact retrieval portal that provides our customers with on-demand access to AWS’ compliance documentation and AWS agreements. You can use AWS Artifact Reports to download AWS security and compliance documents, such as AWS ISO certifications, Payment Card Industry (PCI), and System and Organization Control (SOC) reports. You can use AWS Artifact Agreements to review, accept, and track the status of AWS agreements such as the Business Associate Addendum (BAA).
  • “Pay as you go” – On-demand.
  • “Save when you reserve” – Upfront payment and a discounted hourly rate.
  • “Pay less as AWS grows” or “AWS Economies of Scale” – discounts that you get over time as AWS grows. For example, AWS has reduced the per-GB storage price of S3 by 80% since the service was first introduced in 2006.
  • “Pay less by using more” – volume-based discounts as your usage increases. For services such as S3, pricing is tiered: the more you use, the less you pay per GB.
  • S3 Transfer Acceleration enables fast, easy, and secure transfers of files over long distances between your client and an S3 bucket. Transfer Acceleration takes advantage of Amazon CloudFront’s globally distributed edge locations. As the data arrives at an edge location, data is routed to Amazon S3 over an optimized network path.
  • RDS Read Replicas provide enhanced performance and durability for database (DB) instances. Avoid SPOF,  disaster recovery capabilities and allows you to scale out globally.
  • RDS is multi-AZ.
  • RDS Aurora is multi-region (include multi-AZ). Aurora is up to 5 times faster than standard MySQL databases and 3 times faster than standard PostgreSQL db.
  • Security scales with your AWS Cloud usage. No matter the size of your business, the AWS infrastructure is designed to keep your data safe.
  • APN Consulting Partners: responsible for a complete digital transformation (DT) from on-prem to AWS.
  • APN Technology Partners: e.g., F5 and MSFT are tech partners to AWS.
  • AWS Professional Services: works on specific outcomes related to enterprise cloud adoption. E.g., Accenture helps with one specific cloud adoption, not a complete DT.
    • Created the AWS Cloud Adoption Framework (AWS CAF) to help organizations design and travel an accelerated path to successful cloud adoption.
  • Technical Account Manager (TAM) : technical point of contact who provides advocacy and guidance to help plan and build solutions using best practices and proactively keep your AWS environment operationally healthy. TAM is available only for the Enterprise support plan.
  • Service Limits:
    • Monitor SL using Trusted Advisor
    • AWS maintains service limits for each account to help guarantee the availability of AWS resources, as well as to minimize billing risks for new customers. Some service limits are raised automatically over time as you use AWS, though most AWS services require that you request limit increases manually. Most service limit increases can be requested through the AWS Support Center by choosing Create Case and then choosing Service Limit Increase.
  • Data in Transit protection by using SSL or by using client-side encryption.
  • Data in Rest protection by using server-side encryption.
  • AWS Management console: user name and password.
  • AWS API: API token.
  • AWS SDK: Access key ID and secret access key.
  • AWS CLI: Private/ Public Key pair.
  • CloudFormation and Auto Scaling are free to use but provisioned resources are charged.
    • Quick Start is a CF template by technology partners. F5-BigIP CF template helps to install BigIP instance in AWS.
  • CloudWatch Logs lets you aggregate, monitor, store, and access your log files from EC2 instances, CloudTrail, Route 53, and other sources.
  • Right DB tech based on # of R/ W ops, data storage, data source, latency, throughput, data model, and nature of queries.
  • Tagging: logical groupings of resources based on organizationally relevant dimensions: project, cost center, development environment, application, or department. For example, if you tag resources with an application name, you can track the total cost of a single application that runs on those resources. See the sketch at the end of this section.
  • Tagging best practices:
    • Always use a standardized, case-sensitive format for tags, and implement it consistently across all resource types.
    • Consider tag dimensions that support the ability to manage resource access control, cost tracking, automation, and organization.
    • Implement automated tools to help manage resource tags.
    • Err on the side of using too many tags rather than too few tags.
    • Remember that it is easy to modify tags to accommodate changing business requirements, however consider the ramifications of future changes, especially in relation to tag-based access control, automation, or upstream billing reports.
    • Usages:
      • Visualize information about tagged resources in one place, in conjunction with Resource Groups.
      • View billing information using Cost Explorer and the AWS Cost and Usage report.
      • Send notifications about spending limits using AWS Budgets.
  • AWS Config, you can discover existing and deleted AWS resources, determine your overall compliance against rules, and dive into configuration details of a resource at any point in time. These capabilities enable compliance auditing, security analysis, resource change tracking, and troubleshooting.
  • To save AWS cost:
    1. Terminate all unused EC2 instances.
    2. Delete all the EBS volumes attached to them.
    3. Release the un-utilized Elastic IPs.
    4. Delete the ELBs.
  • Decommissioning process: Use DoD 5220.22-M (“National Industrial Security Program Operating Manual “) or NIST 800-88 (“Guidelines for Media Sanitization”) to destroy data as part of the decommissioning process.
  • AWS Application Discovery Service helps systems integrators quickly and reliably plan application migration projects by automatically identifying applications running in on-premises data centers, their associated dependencies, and their performance profiles.
  • You can use a server certificate provided by AWS Certificate Manager (ACM) or one that you obtained from an external provider.
    • Use ACM or IAM to store and deploy server certificates.
    • Use IAM as a certificate manager only when you must support HTTPS connections in a region that is not supported by ACM.
    • IAM supports deploying server certificates in all regions, but you must obtain your certificate from an external provider for use with AWS.
  • Route 53 is not responsible for creating SSL certifications.
  • When selling a reserved instance on the Amazon EC2 Reserved Instance Marketplace, you only have the option to set an upfront price for the instance.
  • Bootstrapping: Custom code or script to install required s/w or copy resource or define resource state (prod, dev, test). Same script used for all deployments.
  • Golden Images: a snapshot of a particular state of that resource. When compared to the bootstrapping approach, a golden image results in faster start times and removes dependencies to configuration services or third-party repositories. This is important in auto-scaled environments where you want to be able to quickly and reliably launch additional resources as a response to demand changes.
  • Non-explicit deny: when a new IAM user is created, that user has NO access to any AWS service. Access is granted via IAM permissions and access policies.
  • Global Tables builds upon DynamoDB’s global footprint to provide you with a fully managed, multi-region, and multi-master database that provides fast, local, read and write performance for massively scaled, global applications.
  • EC2 Cost factor:
    • Compute
    • Storage
    • Data Transfer Out
  • Resource Groups:
    • Create a custom console that organizes and consolidates information based on your project and the resources that you use.
    • AWS Management Console is organized by AWS service.
  • Amazon DynamoDB Accelerator (DAX) is a fully managed, highly available, in-memory cache for DynamoDB that delivers performance improvements from milliseconds to microseconds – even at millions of requests per second. DAX adds in-memory acceleration to your DynamoDB tables without requiring you to manage cache invalidation, data population, or cluster management.
  • Reservation models for stable applications:
    • EC2 reserved instances
    • RDS reserved instances
    • DynamoDB Reserved Capacity (capacity means latency and throughput)
    • ElastiCache Reserved Nodes
    • Redshift Reserved Nodes
  • Amazon ElastiCache for Redis is a blazing-fast, in-memory data store that provides sub-millisecond latency to power internet-scale real-time applications.
  • AWS IAM console or AWS CLI to enable a virtual MFA device for an IAM user.
  • AWS strongly recommends that you DON’T use the AWS account root user for everyday tasks, even administrative ones; the root user should perform only a few account and service management tasks. Use the root user only to create your first IAM user with administrative privileges, and use that admin user for all your work. If the root account is compromised:
    • Change the user name and the password of the root user account and all of the IAM accounts that the administrator has access to.
    • Rotate (change) all access keys for those accounts.
    • Enable MFA on those accounts.
    • Put IP restriction on all Users’ accounts.              
  • OpsWorks and Elastic Beanstalk automatically restart resources after they are terminated.
  • You can find a paid AMI using the EC2 console, AWS Marketplace and AWS CLI.
  • Amazon DevPay is a simple-to-use online billing and account management service that makes it easy for businesses to sell apps built on, or run on top of AWS.
  • AWS free security resources include AWS Security Blog, Security Bulletins, Provable Security, Whitepapers, Advanced Innovation, Developer Documents, Articles and Tutorials, Training, Compliance Resources and Testimonials.
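
Following up on the tagging best practices above, a minimal boto3 sketch that applies a standardized tag schema to existing resources, so cost tracking and tag-based access control can key off them; all IDs and values are hypothetical:

```python
import boto3

ec2 = boto3.client("ec2")

# Consistent, case-sensitive tag schema applied across resource types.
tags = [
    {"Key": "Project", "Value": "checkout"},      # hypothetical values
    {"Key": "CostCenter", "Value": "CC-1234"},
    {"Key": "Environment", "Value": "dev"},
]

ec2.create_tags(
    Resources=["i-0123456789abcdef0", "vol-0123456789abcdef0"],  # hypothetical
    Tags=tags,
)
```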

AWS Terms

Sno | Term | Explanation
1 Region Global Infra – Geographic region where AWS located.
2 AZ
  • Global Infra.
  • Consist of one or more DCs.
  • Isolated DC (from failure in other AZ), when grouped together multiple AZ’s (us-east-1a, 1b, 1c, 1d) forms Region (us-east-Virginia).
  • Multiple AZs within Region are for High-Availability and Fault-Tolerant.
3 Edge Locations Global Infra – CDN – A site that CloudFront uses to cache copies of your content for faster delivery to users at any location.
4 IAM Users Individuals
5 IAM Groups Easy manage users and their access
6 Resources S3, EC2, etc.
7 Roles Manage resource and temporary authz to resource.
8 Policies User/ resource access definitions. The policy is a JSON document that consists of:

  • Actions: R/ W Ops
  • Resources: RDS
  • Effect: ALLOW/ DENY
  • Conditions: SPECIFIC SCHEMA of  RDS
9 VPC
  • Logical isolated n/w section of AWS to places resources.
  • Includes: virtual n/w, own IP range, subnets, route table, n/w gateways.
  • VPC spans across AZs in particular region.
  • When VPC gets created, it spans across multiple AZs (us-east-1a, 1b, 1c, 1d) within same Region (us-east-Virginia).
10 IGW / NAT Gateway
  • IGW: H/W + S/W based gateway for traffic between the VPC and the internet.
  • NAT Gateway: sits in a public subnet to allow private-subnet resources to access the internet.

11 Route-Table
  • Set of rules called “routes” determines where n/w traffic is directed b/w IGW and subnets.
  • There is a default “main” RT.
  • RT can communicate with other RT.
  • Work at VPC level.
12 NACL
  • Optional firewall to control traffic b/w route-table and subnets.
  • EC2 creates Default  InBound and OutBound “rules” allows all traffic.
  • Evaluated from lowest to highest rule #. Rule * is deny (last rule).
  • Stateless – newly created NACL is everything denied by default (both InBound and OutBound Rule *).
  • Work at subnet level.
13 Security Groups
  • Allow/ deny traffic at EC2 instance.
  • Same as firewall in the desktop.
  • All rules will be evaluated before making decision to allow/ deny. (different from NACL – lowest rule # executed first and higher rule # is discarded).
  • InBound traffic is denied and OutBound traffic is allowed by  default.
  • Traffic is denied unless there is a specific EXPLICIT ALLOW rule (i.e., no rule means, such traffic is DENIED).
  • Works at EC2 instance level.
14 Subnet
  • Divide VPC.
  • Within VPC, add one or more subnets in each AZ and each subnet must reside in particular AZ only and cannot span across AZs.
  • us-east-Virginia region has 4 AZs (us-east-1a, 1b, 1c, 1d), so 4 subnets (for each AZ) gets created within one VPC when chose Virginia as a region.
  • Subnet Groups: Grouping of subnets
15 Public and Private Subnet
  • Public: Route to internet via RT and IGW.
  • Private: No route to internet via RT and IGW, but only RT with route to other subnets (can be Public or Private) within same VPC.
16 S3 Buckets
  • Root level folders.
  • Buckets are located within region.
  • Key = the filename; Object = the actual file; BucketName = unique name across the globe.
17 S3 Folders Subfolders of buckets.
18 S3 Objects Files within buckets/ folders.
19 S3 Lifecycle Policy Rules that change the storage class of S3 objects over time.
20 S3 Permissions A granular control over who can view, access, use specific buckets and objects.
21 S3 Versioning
  • Versioning for buckets/ objects. Increases storage by having versions.
  • Status are ON or OFF.
  • Once ON, you can only SUSPEND and not OFF.
  • SUSPEND prevents versioning and older versions will remains. Applies at bucket level.
22 AMI
  • OS image with software packages and required settings (permissions, EBS, network card mappings).
  • 3 types: Community, Marketplace and My AMIs.
23 EBS
  • Block storage for EC2, persist data beyond lifetime of EC2 instance.
  • When attached to EC2, EBS must be from same AZ.
  • 5 volume types:
    • General Purpose (SSD)
    • Provisioned IOPS (SSD)
    • Throughput Optimized (HDD)
    • Cold (HDD)
    • EBS Magnetic (HDD)
  • Cost factor:
    • Volumes
    • IOPS
    • Snapshots

Screen Shot 2019-09-18 at 6.39.06 AM

24 IOPS
  • I/O Operation per Seconds.
  • SSD cap: 256KiB
  • HDD cap: 1024KiB
25 Root vs EBS Volumes Every EC2 must have root volume (gone when EC2 recycled) but additional EBS volume can be added (persist beyond lifetime of EC2).
26 Snapshots
  • Image or template of an EBS volume that can be stored as backup.
  • Snapshots cannot be attached to or detached from EC2; a snapshot is restored by creating a new EBS volume from it.
27 IP Addressing
  • Private IP Address: by default, every EC2 instance has a private IP address, which lets it communicate with other EC2 instances within the same VPC.
  • Public IP Address: by default, EC2 can be launched with or without a public IP address; it is used to communicate with the internet.

28 Lightsail Virtual Private Server (Pod based env) includes VM, SSD-based storage, data transfer, DNS management, and a static IP address.
29 Polly ML turns text into lifelike speech
30 Rekognition ML; image analyzer
31 RDS
  • Relational database in the cloud.
  • Resizable capacity, hardware provisioning, database setup, patching and backups.
  • RDS doesn’t support AutoScaling like EC2 instances, but it does support manual horizontal scaling (by adding read replicas) and manual vertical scaling (by upgrading/downgrading an existing instance).
  • SQL DB service supports following DB engines: Amazon Aurora, MySQL, MariaDB, PostgreSQL, Oracle, MS SQLServer.
  • No backup storage service cost, but storage cost per GB.
32 DynamoDB
  • NoSQL DB service (serverless), comparable to engines such as MongoDB, Cassandra, and Oracle NoSQL.
  • JSON doc storage type.
33 DB Migration Service Migrate or replicate your existing databases to RDS.
34 SSH Tunneling
  • Tunneling from internet to RDS via 22/ 443 port.
  • Using SSH to access a resource without a public IP address via a resource with a  public IP address  (inside of a VPC).
35 Systems Manager Gives visibility and control of your infrastructure on AWS. Supports tools:

  • Resource Groups
  • Insights Dashboard
  • Run Command
  • State Manager
  • Inventory
  • Maintenance Window
  • Patch Manager
  • Automation
  • Parameter Store
  • Distributor
  • Session Manager
36 SNS Async Pub/ Sub messaging and mobile notification services supports Amazon SQS, HTTP/S, email, Lambda, SMS, APN, Google push notification.
37 Topics Labeling/ grouping of different endpoints that you send messages to.
38 Publishers Human/ alarm/ event that triggers the messages to be sent.
39 Subscribers Endpoints that a topic sends messages to (i.e. email address, phone).
40 CloudWatch
  • Monitors/ collects/ aggregates metrics/ logs of resources.
  • Set alarms and automatically react to changes in AWS.
  • Sets threshold to trigger alarms and that can trigger an action (SNS message).
41 CloudWatch  Thresholds Maximum allowed value to not trigger an alarm.
42 CloudWatch Alarms
  • Sends notifications or takes pre-defined decisions.
  • 3 types: ALARM, INSUFFICIENT, OK.
43 CloudWatch Dashboard To view resource metrics (EC2 CPU Util, S3 bucket size, Billings over-limit)
44 CloudWatch  Events Events provides a near real-time stream of system events that describe changes to AWS.
45 CloudWatch Rules
  • Write rules to indicate which events are of interest to your application and what automated actions to take when a rule matches an event.
  • Eg: set a rule to invoke Lambda or notify an  SNS topic.
46 CloudWatch Log Insights Enables to drive actionable intelligence from logs to address operational issues without needing to provision servers or manage software.
47 ELB
  • Distributes incoming traffics across multiple EC2 in multiple AZ.
  • Fault tolerant increases by having ELB.
  • Detects unhealthy instances and routes traffic only to healthy instances.
48 Application LB
  • Routing decisions at the Application layer 7 (HTTP/ HTTPS).
  • Intelligent and supports path-based routing.
  • Dynamic host port mapping: Route requests to one or more ports on each EC2 instance or container instances in VPC.
49 Network LB
  • Routing decisions at the Transport layer 4 (UDP OR TCP/SSL).
  • Speed and handles millions of requests per second.
  • After the LB receives a connection, it selects a target from the target group for the default rule using a flow hash routing algorithm.
50 Classic LB
  • Routing decisions at either Transport layer (TCP/SSL) or Application layer (HTTP/HTTPS).
  • Supports either EC2-Classic or a VPC (Previous generation LB).
  • Static host port mapping: Requires fixed relationship between load balancer port and container instance port.
51 LB Health Check Checks the health of EC2 by HTTP or TCP pings with Response Timeout/ Interval/ Unhealthy and Healthy Threshold.
52 Load Balancer Capacity Unit (LCU) Based on the highest usage dimension of one of the following:

  • Number of new connections per second (up to 25 new connections per second is one LCU)
  • Number of active connections per minute (up to 3,000 active connections per minute is one LCU)
53 Auto Scaling
  • Monitors applications and automatically adjusts capacity to maintain steady, predictable performance at the lowest possible cost.
  • Works closely with ELB, as ELB checks health of EC2 which triggers the AS to add or remove instances based on configured AS Group Policy (when to launch AS with Scale-In/ out policy).
  • Free to use, pay for the launched resource.
54 Auto Scaling Groups
  • Group contains a collection of EC2 instances (within multi-AZ & VPC) that are treated as a logical grouping for the purposes of automatic scaling and management.
  • Enables to use EC2 Auto Scaling features such as health check replacements and scaling policies (when to add servers. Eg: CPU Utils is > 70%).
  • Core functionalities:
    • Maintaining the # of instances in an AS group.
    • Automatic scaling (increase or decrease) of EC2.
55 Auto Scaling Launch Configuration EC2 template used when Auto Scaling needs to add additional server to AS Group when required (can add bootstrap script and other configurations to start instances).
56 Route 53
  • Highly available and scalable cloud DNS web service.
  • 3 functionalities:
    • Domain Registration: Register domains.
    • DNS Service: friendly domains -> IP address and responds to DNS queries using a global n/w authoritative DNS servers.
    • Health Checking: Sends automated requests over the internet to apps in EC2 to verify that it’s reachable, available and functional.
  • Route 53 automatically sends your DNS record information to DNS servers AND it is also where you decide where traffic request for that domain/IP address are routed.
57 Hosted Zones Registered domain path routes information and its IP address.

  • Eg: example.com, it’s subdomains (acme.example.com).
58 Record Sets Information about the resource record.
59 Lambda
  • Serverless and event-driven computing.
  • “trigger” can be added to launch Lambda.
  • Role must be created to run Lambda.
  • Required memory allocation and timeout need to be set (see the handler sketch below).
  • Sub-second metering and charges based on:
    • # of execution request.
    • Execution duration.
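A minimal handler sketch, assuming a Python runtime; the payload shape and names are hypothetical:

import json

def lambda_handler(event, context):
    # 'event' carries the trigger payload (S3 event, API Gateway request, etc.);
    # 'context' exposes runtime metadata such as the remaining time before timeout.
    name = event.get("name", "world")
    return {"statusCode": 200, "body": json.dumps(f"Hello, {name}!")}

Once deployed (with an execution role, memory, and timeout configured), the function can be invoked on demand through a trigger or via the SDK, e.g. boto3's lambda client invoke(FunctionName=..., Payload=...).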
60 Instance Purchasing Options Options to purchase EC2 instances.
61 EC2 – On-Demand Instances Pay, by the second, for the instances that you launch. Expensive.
62 EC2 – Reserved Instances
  • Purchase, at a significant discount (up to 75%), instances that are always available, for a one- or three-year term.
  • 3 payment options:
    1. All Upfront RI (AURI): pay in full upfront; nothing more to pay.
    2. Partial Upfront RI (PURI): pay part upfront; the rest is billed monthly.
    3. No Upfront RI (NURI): nothing upfront; billed monthly. A successful billing history is required before you can purchase.
  • Standard RI (no change in instance type)
  • Convertible RI (change in instance type)
  • Scheduled RI (recurring schedule)
63 EC2 – Scheduled Instances Purchase instances that are always available on the specified recurring schedule, for a one-year term.
64 EC2 – Spot Instances Request unused EC2 instances, which can lower your Amazon EC2 costs significantly (up to 90%).
65 EC2 – Dedicated Hosts Pay for a physical host that is fully dedicated to running instances, and bring existing per-socket, per-core, or per-VM software licenses to reduce costs.
66 EC2 – Dedicated Instances Pay, by the hour, for instances that run on single-tenant hardware; the hardware is dedicated to your account, but may be shared with other instances from that same account.
67 EC2 – Capacity Reservations
  • Reserve capacity for EC2 instances in a specific AZ for any duration.
  • When you create a Capacity Reservation, you specify the AZ in which you want to reserve the capacity, the number of instances for which you want to reserve capacity, and the instance attributes, including the instance type, tenancy, and platform/OS.
68 Elastic Beanstalk
  • PaaS – application management platform for teams that don’t want to manage infrastructure.
  • Upload code and define the types of software, Beanstalk will build the code and create the environments.
  • Automatically handles:
    • Capacity provisioning
    • Load balancing
    • Auto scaling
    • Monitoring
  • An application container on top of AWS.
  • Free; You pay for usage.
69 CloudFront
  • CDN. Customize cache, define the TTL.
  • Origin:  Gets data from S3, ELB, Lambda, EC2.
    • S3, ELB or EC2 as origins for your applications, and Lambda@Edge to run custom code closer to customers’ users.
    • No data transfer fee for AWS origins (S3, EC2 or ELB)
  • Usages:
    • Static Website Content Delivery
    • Serve On-Demand or Live Streaming Video
    • Encrypt Specific Fields Throughout System Processing
    • Customize at the Edge – send error page, authn/ authz before sending to origin server
    • Serve Private Content by using Lambda@Edge.
70 CloudTrail
  • Records all API calls made, delivers logs to S3 buckets which include identity, source IP and request and response details.
  • Doesn’t log OS system logs or database request/response contents.
  • Used for governance, compliance, and risk auditing.
71 CloudFormation
  • Infrastructure as Code – template-based infrastructure management that avoids the repeated task of creating infra by hand.
  • Declarative provisioning of stacks in AWS (a boto3 sketch follows this entry).
  • A template file (kept in source control as JSON) is the input to CloudFormation to manage hundreds of AWS resources.
  • Permissions are required to successfully create the stack.
  • Free to use; you pay for the resources provisioned.
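A minimal sketch of declarative stack provisioning through the SDK, assuming boto3 credentials and suitable IAM permissions; the stack and resource names are hypothetical:

import json
import boto3

# The template declares *what* should exist; CloudFormation works out how.
template = {
    "AWSTemplateFormatVersion": "2010-09-09",
    "Resources": {
        "NotesBucket": {"Type": "AWS::S3::Bucket"}  # one S3 bucket, default settings
    },
}

cfn = boto3.client("cloudformation")
cfn.create_stack(StackName="demo-stack", TemplateBody=json.dumps(template))

Deleting the stack later (delete_stack) tears down every resource it created, which is what makes template-based management repeatable.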
72 DB tools
  • DocumentDB: document database service with MongoDB compatibility
  • Neptune: Graph DB
  • TimeStream: Timeseries DB (IoT)
  • DB Migration Service: Migrate DB to AWS with minimal downtime
73 Shield
  • DDoS protection services (block UDP reflection, syn floods)
  • Two types:
    • Standard (automatic for all AWS services; inline mitigation)
    • Advanced (paid; 24/7 support team; advanced attack mitigation)
  • AWS services with built-in DDoS mitigation include:
    • Route 53
    • CloudFront
    • WAF
    • ELB
    • VPCs and Security Groups
74 Trusted Advisor
  • Environment optimization service.
  • Real-time guidance to help provision resources following AWS best practices.
  • Automatically analyzes resources that have proliferated and need to be tracked.
  • Gives best practice on 4 categories:
    1. Cost optimization
    2. Performance
    3. Security
    4. Fault tolerance
  • Core checks, free to everyone:
    • S3 bucket permission
    • Security Groups (specific ports unrestricted)
    • IAM use/ MFA on Root Account
    • Keys non-rotation
    • EC2 non-patching
    • EBS public snapshots (check if snapshots are publicly readable)
    • RDS public snapshots (check if RDS is public)
    • Service limits (eg: 20 EC2 limit)
75 Support Plans
  • Basic: core trusted advisor, no technical support, submit bug, feature request and service limit.
  • Developer: Basic + cloud support associates; general guidance in under 24 hours, system-impaired response within 12 hours.
    • Provides general guidance when you request Architecture Support.
  • Business: full trusted advisor, cloud support engineer 24/7, email, chat, phone, 1-hour response, contextual guidance on use case (per account basis).
  • Enterprise: Tech Account Manager (TAM), full trusted advisor, cloud support engineer 24/7, email chat phone, 15 min response, consultative review on use case, apply to all accounts, well architected review, access to online labs.
76 ECS
  • Elastic Container Service (Amazon ECS) is a container orchestration service that supports Docker containers.
  • ECS removes the need to install and operate your own container orchestration software, manage and scale a cluster of virtual machines, or schedule containers on those VMs.
  • API calls to launch and stop Docker-enabled applications, query the complete state of your application, and access many familiar features such as IAM roles, security groups, load balancers, CloudWatch Events, CloudFormation templates, and CloudTrail logs.
77 Fargate
  • Fargate manages the cluster of servers and schedules placement of containers on those servers.
  • Fargate works with ECS so you don’t manage EC2 yourself (no spinning up new EC2 with defined instance types, provisioning and scaling clusters, or patching and updating each server).
  • Fargate takes care of:
    • Task placement strategies, such as binpacking or host spread.
    • Tasks are automatically balanced across AZs.
  • Similar to AS: AS for EC2, Fargate for ECS (running containers in EC2).
78 Redshift
  • Data Warehouse tool – OLAP (online analytical process)
  • Queries to data lake, S3.
  • Fork of PostgreSQL 8.0.2
  • Connect to JDBC/ODBC
  • SQL compliant and Parallel queries.
  • Quicksight – BI tool.
79 X-Ray
  • Managed distributed tracing/debugging for anything from three-tier applications to complex microservices applications consisting of thousands of services, including Lambda functions.
  • X-Ray provides an end-to-end view of requests as they travel through your application, and shows a map of your application’s underlying components.
80 Dev tools
  • Corretto: Open JDK
  • Cloud Development Kit: SDK for AWS cloud using Cloud Formation
  • Cloud9: IDE
  • CodeCommit: GIT source control service
  • CodeBuild: CI- Build and test code
  • CodePipeline: Continuous Delivery – binaries ready to be deployed
  • CodeDeploy: Continuous Deployment – to production
  • CodeStar: CI/ CD (build, delivery and deploy)
  • DeviceFarm: Testing platform for different devices
81 Snowball
  • A service that accelerates transferring large amounts of data into and out of AWS using physical storage devices, bypassing the Internet (import and export from AWS is supported).
  • This transport is done by shipping the data in the devices through a regional carrier.
  • 256 bit encryption.
  • Snowball: No compute; 50 TB to 80 TB
  • Snowball Edge: Mini AWS in your hand. Snowball with compute; 100 TB.
  • Snowmobile: No compute; up to 100 PB (exabyte-scale transfers)
82 Storage Gateway
  • A hybrid cloud storage service that gives you on-premises access to virtually unlimited cloud storage.
  • Use cases include moving tape backups to the cloud, reducing on-premises storage with cloud-backed file shares, providing low-latency access to data in AWS for on-premises applications, as well as various migration, archiving, processing, and disaster recovery use cases.
  • The gateway connects to S3, S3 Glacier, S3 Glacier Deep Archive, EBS, and Backup, providing storage for files, volumes, snapshots, and virtual tapes in AWS.
83 Cognito Lets you add user sign-up, sign-in, and access control to your web and mobile apps using social and enterprise logins.
84 Directory Services
  • Directory Service provides multiple ways to use Amazon Cloud Directory and Microsoft Active Directory (AD) with other AWS services.
  • Directory Service provides multiple directory choices for customers who want to use existing Microsoft AD or LDAP–aware applications in the cloud.
  • Cloud Directory can create multiple dimensions of directories for a variety of use cases, such as organizational charts, course catalogs, and device registries.
  • Automatically scales to hundreds of millions of objects and provides an extensible schema that can be shared with multiple applications.
85 GuardDuty
  • GuardDuty is an event-based threat detection service that continuously monitors for malicious activity and unauthorized behavior to protect your AWS accounts and workloads.
  • Paid service uses ML, anomaly detection, and integrated threat intelligence to identify and prioritize potential threats.
  • GuardDuty analyzes tens of billions of events across multiple AWS data sources, such as AWS CloudTrail, Amazon VPC Flow Logs, and DNS logs.
86 Macie
  • Identify and protect sensitive data stored in the AWS.
  • Amazon Macie recognizes sensitive data such as personally identifiable information (PII) or intellectual property, and provides you with dashboards and alerts that give visibility into how this data is being accessed or moved.
87 KMS
  • KMS is a key management service that controls the use of encryption across a wide range of AWS services (sketch below).
  • Uses HSM that have been validated under FIPS 140-2, or are in the process of being validated, to protect your keys.
  • KMS is integrated with CloudTrail to provide audit logs.
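A minimal sketch of direct KMS encryption, assuming boto3 and an existing key; the key alias is hypothetical:

import boto3

kms = boto3.client("kms")

# Encrypt small payloads (up to 4 KB) directly under the key.
blob = kms.encrypt(KeyId="alias/demo-key", Plaintext=b"db-password")["CiphertextBlob"]

# Decrypt; for symmetric keys, KMS identifies the key from the ciphertext metadata.
plaintext = kms.decrypt(CiphertextBlob=blob)["Plaintext"]
assert plaintext == b"db-password"

Each of these calls is recorded by CloudTrail, which is where the audit-log integration mentioned above shows up.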
88 Inspector
  • Automated security assessment service (Agent and API based) for security and compliance of applications deployed on AWS.
  • Inspector automatically assesses applications for exposure, vulnerabilities, and deviations from best practices (based on template/ rules).
  • Then inspector produces a detailed list of security findings (via console or API) prioritized by level of severity.
89 SageMaker
  • ML core service.
  • SageMaker is a fully-managed service that covers the entire machine learning workflow to label and prepare your data, choose an algorithm, train the model, tune and optimize it for deployment, make predictions, and take action.
90 API Gateway
  • API Gateway is to create, publish, maintain, monitor, and secure APIs at any scale.
  • From AWS Management Console, can create REST API and WebSocket APIs that act as a “front door” for applications to access data, business logic, or functionality from AWS backend services.
  • Pay only for the API calls you receive and the amount of data transferred out, under the API Gateway tiered pricing model.
91 WorkMail WorkMail is a secure, managed business email and calendar service with support for existing desktop and mobile email client applications. Uses IMAP protocol (receive email; stores in server and client).
92 SES Simple Email Service (Amazon SES) is a cloud-based email sending service designed to help digital marketers and application developers send marketing, notification, and transactional emails. Uses SMTP protocol (sent emails).
93 SQS
  • Simple Queue Service (SQS) is a message queuing service that enables you to decouple applications.
  • Using SQS, you can send, store, and receive messages between software components at any volume, without losing messages or requiring other services to be available.
  • Two types:
    • Standard queues: best-effort ordering and at-least-once message delivery.
    • FIFO queues: strict ordering and exactly-once message delivery (see the sketch below).
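A minimal producer/consumer sketch, assuming boto3; the queue name is hypothetical (a FIFO queue would need a .fifo suffix and a MessageGroupId on send):

import boto3

sqs = boto3.client("sqs")
queue_url = sqs.create_queue(QueueName="orders")["QueueUrl"]

# Producer side: fire and forget.
sqs.send_message(QueueUrl=queue_url, MessageBody="order-42")

# Consumer side: receive, process, then delete. On standard queues delivery is
# at-least-once, so deleting the message is what prevents redelivery.
for m in sqs.receive_message(QueueUrl=queue_url, MaxNumberOfMessages=1).get("Messages", []):
    print(m["Body"])
    sqs.delete_message(QueueUrl=queue_url, ReceiptHandle=m["ReceiptHandle"])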
94 SWF
  • Simple Workflow helps developers build, run, and scale background jobs that have parallel or sequential steps.
  • If your app’s steps take more than 500 milliseconds to complete, you need to track the state of processing, and you need to recover or retry if a task fails, SWF can help you.
95 EMR
  • Big data platform with ETL tool (MapReduce).
  • Process large number of data sets using clusters of virtual servers.
  • Using open source tools such as Apache Spark, Hive, HBase, Flink, and Presto.
96 ElasticSearch
  • Search in clusters.
  • Works with Kibana, Logstash, ELK stack.
97 Data Lake A data lake is a storage repository (flat architecture instead of hierarchical fashion) that holds a vast amount of raw data in its native format until it is needed.
98 Lake Formation
  • AWS Lake Formation sets up a secure data lake in days.
  • A data lake enables you to break down data silos and combine different types of analytics to gain insights and guide better business decisions.
99 ElastiCache
  • In-memory key-value managed datastore.
  • Popular choice for caching, session management, gaming, leaderboards, real-time analytics, geospatial, ride-hailing, chat/messaging, media streaming, and pub/sub apps.
  • Two types:
    1. Redis: Remote Dictionary Server (Redis)
    2. Memcached
100 OpsWorks
  • Configuration Management Service: Chef and Puppet are automation platforms that allow you to use code to automate the configurations of your servers.
101 Root access keys AWS recommends that you delete your root access keys because you can’t restrict permissions for the root user credentials.

  • If you want to manage services that require administrative access, create an IAM user, grant administrator access to that user, then use those credentials to interact with AWS.
  • User with root access key has unrestricted access to all the resources in your account, including billing information.
  • Don’t create one unless you absolutely need to.
  • Rotate (change) the access key regularly.
102 WAF
  • Web application firewall and works on Application-LB.
  • Layer 7 content filtering to support block/allow the request
  • Write rules that block requests by IP address, HTTP headers, or URI strings.
  • Rate limiting per client IP.
  • Threat mitigation.
103 Organization
  • OU – management group for different AWS accounts.
  • Create organization using the root account or master account.
  • AWS Organizations has four main benefits:
    • Centrally manage access policies across multiple accounts.
    • Automate account creation and management.
    • Control access
    • Consolidate billing across multiple accounts
104 Assurance Programs
  • AWS is responsible for compliance at the infrastructure level; customers are responsible for compliance of the data they store there.
  • Certification/ Attestation by third party/auditors.
  • Laws regulations and privacy
  • Alignments and frameworks
  • Certifications: Cloud Security Alliance, ISO 9001, 27001, 27017, 27018, PCI DSS Level 1, SOC 1, SOC 2, SOC 3, within US (FedRAMP, FIPS, FISMA, HIPAA, ITAR, MPAA)
  • HIPAA Compliance: Designed to secure Protected Health Information (PHI).
105 Auditing and Compliance
  • Config: resource inventory, configuration history, and change notifications; determines compliance against rules, enables compliance auditing, security analysis, and change tracking.
  • Service Catalog: manage catalogs of approved IT services, achieve consistent governance, customer defines portfolios, product, define cloud formation templates.
  • Artifact: access reports and details of security controls, on-demand access to security compliance documents, demonstrate security and compliance.
  • CloudTrail: records all API calls made, delivers logs to S3 buckets which include identity, source IP and request and response details, doesn’t log DB and OS system log.
  • Encryption and Key management: many services use encryption.
106 Vulnerability and Penetration Testing

AWS customers are welcome to carry out security assessments and penetration tests against their AWS infrastructure, without prior approval, for 8 services:

  • EC2, ELB, CloudFront, RDS, Aurora, NAT and API Gateways, Lambda and Lambda Edge functions, Lightsail, Elastic Beanstalk environments.
107 Cost Management Prediction Tool
  • Cost Calculators/ Simple Monthly Calculator for expected cost. Select region, services, OS, size, and billing options; it gives the estimated cost. No visualization.
  • Total Cost of Ownership (TCO): to calculate expected cost difference between having on-premise data center and AWS. Calculate VM, DB instances in on-prem to cloud.
108 Cost Management Incurred Tools
  • Cost Explorer: visualize and drill down the accrued expenses in AWS, look the cost by month, services, usage, tags.
  • Cost and Usage Reports: access highly detailed billing information, CSV files save to S3 buckets, ingest reports into redshift or Quicksight for analysis, usage listed for each service, usage listed for tags, can aggregate to daily or monthly totals
  • AWS Billing and Cost Management: Billing History.
109 Kinesis
  • Kinesis collect, process, and analyze real-time, streaming data so you can get timely insights and react quickly to new information.
  • Kinesis enables you to process and analyze data as it arrives and respond instantly instead of having to wait until all your data is collected before the processing can begin.
110 Elastic IP address
  • Static IP address; masks failures by re-routing traffic to a healthy server.
  • Doesn’t incur charges as long as the following conditions are true:
  1. Elastic IP address is associated with an EC2 instance.
  2. Elastic IP associated to EC2 is running.
  3. EC2 has only one Elastic IP address attached to it.
  • Charged by the hour for each Elastic IP address that doesn’t meet these conditions.
S3

  • S3 names must be unique across all AWS accounts world wide, and must follow specific naming rules.
  • Durability reflects fault tolerance (objects are stored redundantly).
  • Backed with the Amazon S3 Service Level Agreement for availability.
  • The only way to set an object’s storage class to Glacier is through the CLI or SDK, not from the AWS management console.
  • Storage cost and request pricing vary by Region.
  • S3 Lifecycle Management for automatic migration of objects to other S3 classes.
    • Storage Classes can be configured at the object level and a single bucket can contain objects stored in S3 Standard, S3 Intelligent-Tiering, S3 Standard-IA, and S3 One Zone-IA.
  • Cost factor:
    • # of requests
    • Data transfer
  • S3 security features
    • Permissions
    • Versioning
    • Zone Replication
    • Backup
    • Encryption – client-side (in transit) + server-side (at rest)
Storage classes (by access pattern):

Standard
  • S3 Standard: GP storage of frequently accessed data. Cost: highest. Usage: cloud applications, dynamic websites, content distribution, mobile and gaming applications, and big data analytics.
Unknown & Changing Access
  • S3 Intelligent-Tiering: 2 access tiers (Tier 1: FA, Tier 2: IA). Cost: small monthly monitoring and auto-tiering fee. Usage: data with unpredictable access patterns.
  • S3 Standard-IA: long-lived but less frequently accessed data that requires rapid access when needed. Cost: low per-GB storage price and per-GB retrieval fee. Usage: backups, data store for disaster recovery files.
  • S3 One Zone-IA: 20% less than S3 Standard-IA. Usage: storing secondary backup copies of on-premises data or easily re-creatable data.
Archive
  • S3 Glacier: long-term archive and digital preservation; 3 retrieval options (a few minutes to hours): Expedited, Bulk, Standard. Cost: cheaper than on-premises solutions.
  • S3 Glacier Deep Archive: long-term archive and digital preservation; data accessed once or twice a year; alternative to magnetic tape systems; restored within 12 hours. Cost: lowest. Usage: highly regulated industries, such as Financial Services, Healthcare, and Public Sector, that retain data sets for 7-10 years or longer to meet regulatory compliance requirements.

Screen Shot 2019-09-05 at 7.14.41 PM.png
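A minimal sketch of per-object storage classes via the SDK (consistent with the note above that Glacier is set from the CLI/SDK, not the console), assuming boto3; the bucket and key names are hypothetical:

import boto3

s3 = boto3.client("s3")

# A single bucket can hold objects in different storage classes.
s3.put_object(Bucket="my-notes-bucket", Key="hot/report.csv",
              Body=b"...", StorageClass="STANDARD")
s3.put_object(Bucket="my-notes-bucket", Key="archive/2019.csv",
              Body=b"...", StorageClass="GLACIER")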

EC2

Instance Types Components

  1. Family: General purpose, Compute, Memory and Storage optimized, Accelerated computing.
  2. Type: Subcategory of family. m4.large | m4.xlarge | t3a.nano
  3. vCPUs: # of virtual CPU.
  4. Memory (GiB): RAM size.
  5. Instance Storage: local instance-store disks and/or attached EBS volumes.
  6. EBS/ Network optimized features
  7. Hardware Specifications
  8. Nitro-based Instances: A collection of AWS-built hardware and software components that enable high performance, availability, and security.
    • Nitro Components: 
      • Nitro hypervisor – A lightweight hypervisor that manages memory and CPU allocation and delivers performance that is indistinguishable from bare metal for most workloads.
      • Nitro card
        • Local NVMe storage volumes

        • Networking hardware support
        • Management
        • Monitoring
        • Security
      • Nitro security chip, integrated into the motherboard
  9. Instance Limits: There is a limit on the total number of instances (20) that you can launch in a region, and there are additional limits on some instance types.
  10. Pricing:
    • Buying option (On-demand, Reserved, Spot, Dedicated)
    • AMI
    • Instance type
    • Region
    • Data transfer in/out
    • Storage capacity.

RDS

AWS manages:

  • Server maintenance
  • OS install / patches
  • DB s/w install/ patches
  • DB backup
  • High availability and scaling
  • Read replicas

Pricing

  1. Pricing depends on the engine, On-Demand or Reserved instances, storage, backtrack, and data transfer I/O you select.
  2. RDS provides a selection of instance types optimized to fit different relational database use cases.
  3. RDS is free to try; pay only for what you use, with no minimum fee. You can pay for RDS using On-Demand or Reserved Instances.

DynamoDB

Pricing

  1. On-demand capacity mode: pay per request.
  2. Provisioned capacity mode: pay for provisioned capacity in read/write capacity units (e.g., 1,000 reads/s and 100 writes/s).

SNS

  • Pricing depends on Publishes, Notification deliveries and Data transfer

Screen Shot 2019-09-06 at 7.15.31 AM.png

CloudWatch

Screen Shot 2019-09-06 at 8.51.30 AM.png

AWS List

https://www.parkmycloud.com/aws-services-list/

AWS Visual Diagram

https://www.lucidchart.com/documents/view/703f6119-4838-4bbb-bc7e-be2fb75e89e5/eNbqbEM6f5NI

infra-img

VPC

Screen Shot 2019-09-05 at 10.14.22 AM.png

EC2

Screen Shot 2019-09-05 at 8.36.57 PM.png

Route 53

Screen Shot 2019-09-06 at 1.19.48 PM.png

ELB

Screen Shot 2019-09-08 at 8.39.38 AM.png

Elastic Beanstalk

Screen Shot 2019-09-08 at 9.42.30 AM.png

Control and Data Plane

Screen Shot 2019-09-09 at 3.04.32 PM.png

Shared Responsibility Model

Screen Shot 2019-09-12 at 8.21.22 PM.png

Other Services

Analytics
  • Athena: SQL queries on S3.
  • CloudSearch: search in websites and apps.
  • Kafka: build real-time streaming applications; pub/sub model.
  • QuickSight: business analytics service; cost management.
  • Data Pipeline: move data between AWS resources.
  • Glue: ETL tool for data in AWS.
Application Integration
  • Step Functions: visual workflow/state diagram of a distributed application.
  • EventBridge: serverless event bus that connects outside SaaS and AWS.
  • MQ: managed message broker (ActiveMQ).
  • AppSync: sync distributed data across all platforms [GraphQL].
AR & VR
  • Sumerian: AR, VR, and 3D experiences.
Cost Management
  • Cost Explorer: analyze AWS cost and usage.
  • Budgets: set custom cost and usage budgets; alert when a threshold is exceeded.
  • Reserved Instance Reporting: reporting tool for reserved resources (EC2, RDS, ESS, ElastiCache, Redshift).
Blockchain
  • Managed Blockchain: create and manage blockchains.
  • QLDB: managed ledger DB.
Business Applications
  • Alexa for Business: for employees (day-to-day work).
  • Chime: meetings, video, and chat (like Zoom, Slack).
  • WorkDocs: online AWS docs (like Google Docs).
  • WorkMail: like Outlook.
Compute (https://aws.amazon.com/blogs/architecture/compute-abstractions-on-aws-a-visual-story/)
  • Elastic Container Registry: Docker registry.
  • EKS (Elastic Kubernetes Service): run managed Kubernetes (like auto scaling of EC2, but for containers).
  • Fargate: compute engine for ECS and EKS.
  • Lightsail: POD; virtual private server. All-in-one VM for application development (has server, DB, LB).
  • Outposts: on-premises AWS data center; two variants: 1. VMware Cloud on AWS Outposts, 2. AWS-native variant of AWS Outposts.
  • Serverless Application Repository: GitHub-style repository for serverless applications.
  • VMware Cloud: control plane for converting a vSphere-based environment to EC2 in the cloud.
  • Control Plane: responsible for exposing the API and interfaces to define, deploy, and lifecycle containers.
  • Data Plane: responsible for providing capacity (CPU/memory/network/storage) so those containers can actually run and connect to a network.
Customer Engagement
  • AWS Connect: customer service support (contact center).
  • Pinpoint: CRM tool; customer engagement platform.
Management and Governance
  • Control Tower: manage multiple AWS accounts for compliance.
  • Console Mobile Application: AWS management console for mobile.
  • License Manager: manage licenses.
  • Managed Services: automates common activities, such as change requests, monitoring, patch management, security, and backup services, and provides full-lifecycle services to provision, run, and support your infrastructure.
  • OpsWorks: configuration management service; Chef and Puppet are automation platforms that allow you to use code to automate server configuration. OpsWorks lets you use Chef and Puppet to automate how servers are configured, deployed, and managed across your EC2 instances or on-premises compute environments.
  • Systems Manager: operational insights dashboard.
  • Well-Architected Tool: input requirements -> get architecture design.
Networking
  • PrivateLink: private connection between resources, eliminating the internet; uses the Amazon network.
  • Direct Connect: connection between on-premises and AWS.
  • App Mesh: monitor and control microservices running on AWS; standardizes how your microservices communicate, giving you end-to-end visibility and helping to ensure high availability for your applications.
  • Cloud Map: service discovery for cloud resources.
  • Global Accelerator: the AWS global network (private AWS internet).
  • Transfer Acceleration: data arrives at an edge location and is routed to Amazon S3 over an optimized network path.
  • Transit Gateway: inter-VPC/on-premises connectivity; traffic routed through a central transit gateway.
Security
  • Artifact: access reports and details of security controls.
  • Certificate Manager: SSL/TLS certificate service for internally connected resources.
  • CloudHSM: hardware security module.
  • Resource Access Manager: resource sharing.
  • Secrets Manager: key vault.
  • Security Hub: hub for security alerts (GuardDuty, Macie, and Inspector send event alerts to the hub).
Storage
  • Amazon FSx for Lustre: mountable file system interface for high-performance processing of S3 data.
  • Amazon FSx for Windows File Server: NTFS file server.
  • AWS Backup: back up everything centrally from AWS resources.
  • Snowball: 50 TB to 80 TB; no compute.
  • Snowball Edge: Snowball with compute; 100 TB.
  • Snowmobile: up to 100 PB; no compute.
  • Storage Gateway: converts tape, file, and volume storage from on-premises to S3 storage classes and EBS.
Migration
  • AWS Migration Hub: track the progress of migrations.
  • Application Discovery Service: tool to discover/analyze on-premises resources and plan an AWS migration.
  • DataSync: on-premises SAN storage to AWS (S3 or EFS).
  • Server Migration Service: automatically replicate live server volumes to AWS and create Amazon Machine Images (AMIs) as needed.
  • Transfer for SFTP: transfer files to S3 using SFTP.
  • CloudEndure Migration: complete cloud migration (OS + server + DB).
Desktop
  • WorkSpaces: Virtual Desktop Infrastructure (VDI).
  • AppStream 2.0: application streaming service.
ML
  • Comprehend: relationships in text.
  • Lex: voice and text (conversational bots).
  • Polly: text to voice.
  • Rekognition: image analyzer.
  • Translate: translate languages.
  • Transcribe: speech to text (closed captions).
  • Elastic Inference: run deep learning inference.
  • Forecast: time-series forecasting.
  • Textract: extracts text and data from scanned documents.
  • Personalize: individualized recommendations.
Mobile
  • Amplify: build mobile applications.
  • AppSync: serverless back end for mobile, web, and enterprise applications.

AWS Concepts

What is Cloud Computing?

  • On-demand delivery of compute power, database storage, applications, and other IT resources through a cloud services platform via the Internet with pay-as-you-go pricing.
  • A cloud services platform provides simple and rapid access to flexible and low-cost IT resources.
  • No upfront investments and no heavy lifting of managing the hardware, you can provision exactly the right type and size of computing resources.
  • AWS owns and maintains the network-connected hardware, while organizations provision and use what they need via a web application.

Six Advantages of Cloud Computing

  1. Trade Capital Expense vs Variable Expense – Pay only when, and for how much, you consume computing resources, instead of heavy upfront capital expenditure.
  2. Massive economies of scale – Because a large number of customers is aggregated in the cloud, AWS can achieve cost-effective economies of scale.
  3. Stop guessing capacity – Eliminate guessing on your infrastructure capacity needs. You can scale up and down as required.
  4. Increase speed and agility – You reduce the time to make resources available to your developers from weeks to just minutes.
  5. Stop spending money running and maintaining data centers – Focus on projects that differentiate your business, not the infrastructure. 
  6. Go global in minutes – Deploy application in multiple regions around the world with lower latency and a better experience for customers.

Types of Cloud Computing

Cloud Computing Models

  1. Infrastructure as a Service (IaaS) contains the basic building blocks for cloud IT and typically provide access to networking features, computers (virtual or on dedicated hardware), and data storage space.
  2. Platform as a Service (PaaS) removes the need for organization to manage the underlying infrastructure (usually hardware and operating systems) and allows to focus on the deployment and management of applications. No resource procurement, capacity planning, software maintenance, patching, or any of the other undifferentiated heavy lifting involved in running application.
  3. Software as a Service (SaaS) provides with a completed product that is run and managed by the service provider. In most cases, people referring to Software as a Service are referring to end-user applications. With a SaaS offering you do not have to think about how the service is maintained or how the underlying infrastructure is managed; you only need to think about how you will use that particular piece of software.

Cloud Deployment Models

  1. Cloud – A cloud-based application is fully deployed in the cloud and all parts of the application run in the cloud.
  2. Hybrid – A hybrid deployment is a way to connect infrastructure and applications between cloud-based resources and existing on-premises resources.
  3. On-premises – The deployment of resources on-premises, using virtualization and resource management tools, is sometimes called the “private cloud.” On-premises deployment doesn’t provide many of the benefits of cloud computing but is sometimes sought for its ability to provide dedicated resources. In most cases this deployment model is the same as legacy IT infrastructure while using application management and virtualization technologies to try and increase resource utilization.

Global Infrastructure

  • Global infrastructure helps customers to achieve lower latency and higher throughput, and ensures their data resides only in the AWS Region they specify.
  • The AWS Cloud infrastructure is built around:
    • AWS Regions is a physical location in the world where we have multiple Availability Zones.
    • Availability Zones consist of one or more discrete data centers, each with redundant power, networking, and connectivity, housed in separate facilities.
  • Both Regions and Zones are designed to have important cloud principles such as high availability, fault-tolerant, scalability, and elasticity.
  • High availability and Fault-tolerant (reliable and durable) are met with redundancy.
  • Scalability and Elasticity are met with managed increase/ decrease capacity.
  • AWS Cloud operates 20 Regions and 60 Availability Zones.
  • Each Availability Zone is isolated, but the AZ in a Region are connected through low-latency links.
    •  Sync replication – backup thru read replica.
    • Async replication – backup for disaster recovery.
  • AWS provides flexibility to place instances and store data within multiple regions as well as across multiple AZ within each AWS Region.
  • Each Availability Zone is designed as an independent failure zone. This means that Availability Zones are physically separated within a typical metropolitan region and are located in lower risk flood plains (specific flood zone categorization varies by AWS Region). In addition to discrete un-interruptable power supply (UPS) and onsite backup generation facilities, they are each fed via different grids from independent utilities to further reduce single points of failure.
  • Availability Zones are all redundantly connected to multiple tier-1 transit providers. (for sync replication).

Security and Compliance

Security

  • Customers don’t manage physical servers or storage devices, but they are required to use software-based security tools to monitor and protect cloud resources.
  • As an AWS customer you inherit all the best practices of policies, architecture, and operational processes built to satisfy the requirements of customers.
  • The AWS Cloud enables a shared responsibility model: AWS manages security of the cloud, while the customer is responsible for security in the cloud.
  • AWS provides you with guidance and expertise through online resources, personnel, and partners. AWS provides you with advisories for current issues.
  • AWS provides security-specific tools and features across network security, configuration management, access control, and data encryption.
  • Finally, AWS environments are continuously audited, with certifications from accreditation bodies across geographies and verticals. Take advantage of automated tools for asset inventory and privileged access reporting.

Screen Shot 2019-09-08 at 2.11.33 PM.png

Benefits of AWS Security

  • Keep Your Data Safe: The AWS infrastructure puts strong safeguards in place to help protect your privacy. All data is stored in highly secure AWS data centers.
  • Meet Compliance Requirements: AWS manages dozens of compliance programs in its infrastructure. This means that segments of your compliance have already been completed.
  • Save Money: Cut costs by using AWS data centers. Maintain the highest standard of security without having to manage your own facility
  • Scale Quickly: Security scales with your AWS Cloud usage. No matter the size of your business, the AWS infrastructure is designed to keep your data safe.
  • 7 design principles: 
    • Implement a strong identity foundation:
      • Implement the principle of least privilege and separation of duties with appropriate authorization for each interaction with your AWS resources. Centralize privilege management and reduce or even eliminate reliance on long-term credentials.
    • Enable traceability:
      • Monitor, alert, and audit actions and changes to your environment in real time. Integrate logs and metrics with systems to automatically respond and take action.
    • Apply security at all layers:
      • Rather than just focusing on protection of a single outer layer, apply a defense-in-depth approach with other security controls. Apply to all layers (e.g., edge network, VPC, subnet, load balancer, every instance, operating system, and application).
    • Automate security best practices:
      • Automated software-based security mechanisms improve your ability to securely scale more rapidly and cost effectively. Create secure architectures, including the implementation of controls that are defined and managed as code in version-controlled templates.
    • Protect data in transit and at rest:
      • Classify your data into sensitivity levels and use mechanisms, such as encryption, tokenization, and access control where appropriate.
    • Keep people away from data:
      • Create mechanisms and tools to reduce or eliminate the need for direct access or manual processing of data. This reduces the risk of loss or modification and human error when handling sensitive data.
    • Prepare for security events:
      • Prepare for an incident by having an incident management process that aligns to your organizational requirements. Run incident response simulations and use tools with automation to increase your speed for detection, investigation, and recovery.

Compliance

  • Compliance responsibilities are shared: by tying together governance-focused, audit-friendly service features with applicable compliance or audit standards, AWS compliance enablers build on traditional programs.
  • The following is a partial list of assurance programs with which AWS complies:
    • SOC 1/ISAE 3402, SOC 2, SOC 3
    • FISMA, DIACAP, and FedRAMP
    • PCI DSS Level 1
    • ISO 9001, ISO 27001, ISO 27017, ISO 27018.
  • 3 Components:
    • Risk Management (identify, manage and control risks)
    • Control Environment (policy, process, steps to secure resources)
    • InfoSecurity (CIA of customer data)

AWS Well-Architected

  • Developed best-practices through lessons learned by working with customers.
  • Well-Architected Framework has been developed to help cloud architects build secure, high-performing, resilient, and efficient infrastructure for their apps.
  • Framework: 5 design pillars of cloud:
    • Operational Excellence: Focuses on running and monitoring systems to deliver business value, and continually improving processes and procedures.
      • Key topics: managing and automating changes, responding to events, and defining standards to successfully manage daily operations.
      • Automate
      • Respond to events
      • Define standards
    • Security: Focuses on protecting information & systems.
      • Key topics: CIA, privilege management, protecting systems, and establishing controls to detect security events.
      • IAM
      • Detective controls
      • Infrastructure protection
      • Data protection
      • Incident response
      • Design Principles
        • Strong ID foundations
        • Implement security at all layers
        • Enable traceability
        • Apply principle of least privilege
        • Focus on securing your system
        • Automate
    • Reliability: Focuses on the ability to prevent, and quickly recover from failures to meet business and customer demand.
      • Key topics: foundational elements around setup, cross project requirements, recovery planning, and how we handle change.
      • Recover from failures and meet demand
      • Apply best practices in: Foundations, Change and failure management
      • Anticipate, respond and prevent failures
      • Design Principles
        • Test recovery procedures
        • Automate recovery
        • Scale horizontally
        • Stop guessing capacity
        • Manage change in automation
    • Performance Efficiency: Focuses on using IT and computing resources efficiently.
      • Key topics: selecting the right resource types and sizes based on workload requirements, monitoring performance, and making informed decisions to maintain efficiency as business needs evolve.
      • Select customizable solutions
      • Continuous innovation
      • Monitor
      • Consider trade-offs
      • Design Principles
        • Democratize advanced technologies (build vs. buy services)
        • Go global
        • Use serverless architecture
        • Experiment often
        • Have mechanical sympathy
    • Cost Optimization: Focuses on avoiding un-needed costs.
      • Key topics: understanding and controlling where money is being spent, selecting the most appropriate and right number of resource types, analyzing spend over time, and scaling to meet business needs without overspending.
      • Use cost-effective resource
      • Match supply with demand
      • Increase expenditure awareness
      • Optimize over time
      • Design Principles
        • Adopt consumption model
        • Measure overall efficiency
        • Stop spending DC ops
        • Check expenditures
        • Use managed services

Design principles:

  1. Stop guessing your capacity needs: scale up and down automatically.
  2. Test systems at production scale: simulate prod-scale test env on demand.
  3. Automate to make architectural experimentation easier: replicate systems at low cost and avoid the expense of manual effort.
  4. Allow for evolutionary architectures: allows systems to evolve over time so that businesses can take advantage of innovations as a standard practice.
  5. Drive architectures using data: collect data on how your architectural choices affect the behavior of your workload. Helps fact-based decisions on data. 
  6. Improve through game days: simulate events in production. Helps you understand where improvements can be made and develops organizational experience in dealing with events.

Block Cipher Mode of Operation

  • In cryptography, a block cipher mode of operation is an algorithm that uses a block cipher to provide information security such as confidentiality or authenticity.
  • A block cipher by itself is only suitable for the secure cryptographic transformation (encryption or decryption) of one fixed-length group of bits called a block.
  • A mode of operation describes how to repeatedly apply a cipher’s single-block operation to securely transform amounts of data larger than a block.
  • Initialization Vector (IV)
    • Most modes require a unique binary sequence, often called an initialization vector (IV), for each encryption operation.
    • The IV has to be non-repeating and, for some modes, random as well.
    • The initialization vector is used to ensure distinct ciphertexts are produced even when the same plaintext is encrypted multiple times independently with the same key.
  • Block cipher modes operate on whole blocks and require that the last part of the data be padded to a full block if it is smaller than the block size (see the CBC sketch after this list).
  • Historically, encryption modes have been studied extensively in regard to their error propagation properties under various scenarios of data modification.
  • Later development regarded integrity protection as an entirely separate cryptographic goal.
  • Some modern modes of operation combine confidentiality and authenticity in an efficient way, and are known as authenticated encryption modes.
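A minimal sketch of the ideas above (fresh random IV, padding to the block size), using AES in CBC mode with the Python ‘cryptography’ package; the key, IV, and message are illustrative:

import os
from cryptography.hazmat.primitives import padding
from cryptography.hazmat.primitives.ciphers import Cipher, algorithms, modes

key = os.urandom(32)   # AES-256 key
iv = os.urandom(16)    # non-repeating IV per encryption; sent along with the ciphertext

# Pad the plaintext up to a whole number of 128-bit blocks.
padder = padding.PKCS7(128).padder()
padded = padder.update(b"attack at dawn") + padder.finalize()

enc = Cipher(algorithms.AES(key), modes.CBC(iv)).encryptor()
ciphertext = enc.update(padded) + enc.finalize()

# Decrypt and strip the padding.
dec = Cipher(algorithms.AES(key), modes.CBC(iv)).decryptor()
unpadder = padding.PKCS7(128).unpadder()
plaintext = unpadder.update(dec.update(ciphertext) + dec.finalize()) + unpadder.finalize()
assert plaintext == b"attack at dawn"

Note that CBC alone gives confidentiality only; integrity/authenticity needs a MAC or an authenticated mode (see Authenticated Encryption below).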

Modes of Operations

 

Electronic Codebook (ECB) – worst

ECB encryption.svg

ECB decryption.svg

  • Encryption parallelizable: Yes
  • Decryption parallelizable: Yes
  • Random read access: Yes

Cipher Block Chaining (CBC) – widely used

CBC encryption.svg

CBC decryption.svg

  • Encryption parallelizable: No
  • Decryption parallelizable: Yes
  • Random read access: Yes

Cipher Feedback (CFB)

CFB encryption.svg

CFB decryption.svg

  • Encryption parallelizable: No
  • Decryption parallelizable: Yes
  • Random read access: Yes

Output Feedback (OFB)

OFB encryption.svg

OFB decryption.svg

  • Encryption parallelizable: No
  • Decryption parallelizable: No
  • Random read access: No

Counter (CTR)

CTR encryption 2.svg

CTR decryption 2.svg

  • Encryption parallelizable: Yes
  • Decryption parallelizable: Yes
  • Random read access: Yes

Cryptographic Primitives

Integrity

  • Integrity is used to make sure that nobody in between the two partners has changed the message.
  • Integrity means that on the route from Alice to Bob, the message has not changed in between.
  • Hashing technique can be used for this purpose to calculate the hash of message and send along with the original message.
  • Hashing can be achieved through a Message Digest (a plain hash), an HMAC (using a symmetric key), or a Digital Signature (using an asymmetric key) over the message; the first two are contrasted below.
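A short stdlib sketch contrasting a plain digest with an HMAC; the message and key are illustrative:

import hashlib
import hmac

message = b"transfer $100 to Bob"

# Message Digest: anyone can recompute it, so it catches accidental corruption
# but not an attacker who replaces both the message and the hash.
digest = hashlib.sha256(message).hexdigest()

# HMAC: only holders of the shared symmetric key can produce a valid tag.
key = b"shared-secret"
tag = hmac.new(key, message, hashlib.sha256).hexdigest()

# Receiver recomputes the tag and compares in constant time.
assert hmac.compare_digest(tag, hmac.new(key, message, hashlib.sha256).hexdigest())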

Confidentiality

  • Confidentiality is used to make sure that nobody in between the two partners are able to read the message that has been sent.
  • Encryption technique can be used for this purpose to encrypt the message with a key (secret or private key).

Authenticity

  • Authenticity would mean that messages received by Alice are actually sent by Bob.
  • Authenticity is used to make sure that one partner is really communicating with the partner they intend to communicate with.
  • Pre-shared key technique can be used for this purpose that are configured between partners.

Non-repudiation

  • Non-repudiation means the recipient can pass the message and its proof to a third party, and the third party can be confident that the message originated from the sender.
  • More precisely, Non-repudiation is the assurance that someone cannot deny something. i.e., once sender sends the message to the receiver, the sender can’t say that I didn’t send this message to the receiver.
    • Non-repudiation is about Alice showing to Bob a proof that some data really comes from Alice, such that not only Bob is convinced, but Bob also gets the assurance that he could show the same proof to Charlie, and Charlie would be convinced, too, even if Charlie does not trust Bob.

Difference b/w Integrity, Authenticity & Authentication and Non-repudiation?

  • Authentication is NOT Authenticity.
    • Authentication is establishing that Alice talking to Bob, while authenticity is establishing that the message actually came from Bob.
  • Authenticity would imply integrity, but integrity wouldn’t imply authenticity.
    • For example, the message may retain its integrity, but it could have been sent by Charlie instead of Bob to Alice.
    • If Symmetric Key (pre-shared key) used then you’ll achieve Authenticity.
  • Non-repudiation would imply authenticity, but authenticity wouldn’t imply non-repudiation
    • For example, the message may show who sends it (authenticity), but the sender may deny it unless Public Key crypto used.
    • If Asymmetric Key used then you’ll achieve Non-repudiation.
  • The trust hierarchy is (highest to lowest; a signature sketch follows the list):
    • Non-repudiation (provided via Digital Signature – Asymmetric Key)
    • Authenticity (provided via MAC – Symmetric Key)
    • Integrity (provided via MD or MAC – Symmetric Key)
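A minimal signature sketch showing the non-repudiation step, using Ed25519 from the Python ‘cryptography’ package (the library choice is an assumption):

from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey

alice_private = Ed25519PrivateKey.generate()
alice_public = alice_private.public_key()

message = b"I owe Bob $10"
signature = alice_private.sign(message)

# Bob, Charlie, or anyone holding Alice's public key can verify;
# verify() raises InvalidSignature on any mismatch. Because only Alice's
# private key could have produced the signature, she cannot deny it later.
alice_public.verify(signature, message)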

Does SSL/TLS provide non-repudiation service?

  • There is no non-repudiation for regular TLS packets.
  • When Alice tells Bob “Let’s kill the president!” and Bob goes to tell the police, then Alice can always say: “I never said that.”
    • So Alice just repudiated the previous statement.
    • With regular TLS there is no way for Bob to prove by himself that Alice has in fact repudiated her earlier statement.
    • So no, regular TLS does NOT provide non-repudiation.
  • TLS packets are MACed, not digitally signed
    • Since Alice and Bob both know the secret-MAC-key either one of them can generate MAC-values for whatever message they like.
    • A MACed message says NOTHING about authorship of either Alice or Bob.
    • So if Bob goes and tells Charlie: Here’s something evil Alice said!, then Alice can rightfully say: You could have written that yourself!.

Authenticated Encryption

  • Authenticated Encryption (AE) and authenticated encryption with associated data (AEAD) are forms of encryption which simultaneously assure the confidentiality and authenticity of data.
  • AE mode implementation would provide the following functions:
    • Encryption
      • Input:
        • Plaintext
        • Key
        • Header (optional) in plaintext that will not be encrypted, but will be covered by authenticity protection.
      • Output:
        • Ciphertext
        • Authentication Tag (MAC).
    • Decryption
      • Input:
        • Ciphertext
        • Key
        • Authentication Tag (MAC)
        • Header (optional)
      • Output:
        • Plaintext, or an Error if the MAC does not match the supplied ciphertext.
    • The header part is intended to provide authenticity and integrity protection for networking or storage metadata for which confidentiality is unnecessary, but authenticity is desired.
  • Approaches to AE (an AEAD sketch follows):
    • Encrypt-then-MAC (EtM)
    • Encrypt-and-MAC (E&M)
    • MAC-then-Encrypt (MtE)
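A minimal AEAD sketch (AES-GCM, an encrypt-then-MAC-style mode) with an authenticated-but-unencrypted header, using the Python ‘cryptography’ package (an assumption):

import os
from cryptography.hazmat.primitives.ciphers.aead import AESGCM

key = AESGCM.generate_key(bit_length=256)
aesgcm = AESGCM(key)

nonce = os.urandom(12)        # must be unique per (key, message)
header = b"msg-id: 7"         # sent in clear, but covered by the tag

# encrypt() returns the ciphertext with the authentication tag appended.
ciphertext = aesgcm.encrypt(nonce, b"secret payload", header)

# decrypt() verifies the tag over both ciphertext and header, and raises
# InvalidTag if either was tampered with.
assert aesgcm.decrypt(nonce, ciphertext, header) == b"secret payload"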

Credits: https://crypto.stackexchange.com.

Docker

Containers Vs. VMs

Image result for docker vs vm

Docker Image

  • It’s a template file from which Docker containers can be created.
  • Docker images are created with the build command, and this produces a container that starts when it is run.

Docker Hub

  • It’s a centralized resource for Docker images where you can link to code repositories, build your images and test them, store manually pushed images, and link to Docker Cloud so you can deploy images to your hosts.
  • Docker images are stored in the Docker registry such as the public Docker registry (registry.hub.docker.com) as these are designed to be constituted with layers of other images, enabling just the minimal amount of data over the network.

Docker Container

  • A Docker container is an isolated process, which contains an application and its dependencies, running in its own user space on a shared kernel (OS).
  • Docker containers are created out of Docker images.
  • Docker containers are basically runtime instances of Docker images.

Docker Client

  • The Docker client is a CLI tool, which is essentially a program wrapping the Docker HTTP API.

Docker Daemon

  • When you use the docker run command to start up a container, your Docker client translates that command into an HTTP API call and sends it to the Docker daemon; the daemon then evaluates the request, talks to the underlying OS, and provisions your container.

Image result for Docker Daemon

Dockerfile 

  • Dockerfile is nothing but a set of build instructions that have to be passed on to Docker itself, so that it can build images automatically reading these instructions from that specified Dockerfile.
  • A Dockerfile is a text document that contains all the commands a user could call on the command line to assemble an image.
FROM ubuntu:latest

RUN apt-get update
RUN apt-get install -y python python-pip wget
RUN pip install Flask

ADD hello.py /home/hello.py

WORKDIR /home

Docker Build

  • Build an image from a Dockerfile.
$ docker build -t "simple_flask:dockerfile" .

Docker Run

  • Run a command in a new container by using the docker image.
$ docker run -p 5000:5000 simple_flask:dockerfile python hello.py

dockerfile

Docker Inspect

  • Return low-level information on Docker objects
  • Example: Get an instance’s IP address

$ docker inspect --format='{{range .NetworkSettings.Networks}}{{.IPAddress}}{{end}}' $INSTANCE_ID

Docker pull, push, restart, stop, start

  • docker pull
    • Pull an image or a repository from a registry.
  • docker push
    • Push an image or a repository to a registry.
  • docker restart
    • Restarts the container.
  • docker start
    • Starts the stopped container.
  • docker stop
    • Stops the running container.

Docker Compose

  • Docker Compose is used to run multiple containers via a single file called docker-compose.yml.
  • For example, suppose you had an application which required product-service and website, you could create one file which would start both the containers as a service without the need to start each one separately.
version: '3'
services:
  product-service:
    build: ./product
    ports:
     - "5001:80"
  website:
    image: php:apache
    ports:
     - "5000:80"
    depends_on:
     - product-service

Docker-compose start, up and run

All uses instructions from docker-compose.yml.

  • docker-compose start
    • Helps to restart the stopped containers.
  • docker-compose up
    • Helps to (re)-create containers and starts the new containers.
  • docker-compose run
    • Helps to build the images, (re)-create containers, starts and attaches to containers for a service.

docker_events

Docker Swarm

  • It’s an automatic orchestration tool (Docker cluster implementation) for Docker containers to scale your application running in clusters.
  • Docker Swarm turns a pool of Docker hosts into a single and virtual Docker host.
  • It serves the standard Docker API, so any tool that can already communicate with a Docker daemon can use Docker Swarm to scale transparently to multiple hosts.

 

PKI

Public Key Infrastructure is a framework for a secure (i.e., Confidentiality and Authentication/ Integrity) communication between parties.

  • Asymmetric Keys
    • For Confidentiality
  • Digital Certificates
    • For Authentication/ Integrity
  • Certificate Authority (CA)
    • Who vouches for the Digital Certificate.
  • Registration Authority (RA)
    • Who is responsible for Digital Certificate registration.
  • Certificate Revocation List (CRL)
    • Storage list contains inactive Digital Certificates.
    • Revoke types: Expired, Revocation, Suspended
  • Recovery Agent
    • Who is authorized to recover the lost private key.
  • Key Escrow
    • Key Archival System
    • Example: Government wants all Private key storage access in the event of any wrongdoing.

Credits: https://www.youtube.com/watch?v=t0F7fe5Alwg

RDBMS Quickie

Data Definition Language (DDL)

  • Create Table
  • Alter Table
    • Add a column
    • Modify a column
    • Drop a column
    • Rename a column
  • Drop Table
    • Whole table and records will be removed.
  • Truncate Table
    • Remove all records, but table structure still remains.

Constraints

  • NOT NULL
    • Adheres to not null value in the column.
    • Column-level constraint.
  • UNIQUE
    • Adheres to non-duplicate value in the column.
    • Both column and table-level constraint.
    • When it’s a table-level, it will check the combinations of values to be unique across all columns in the table.
  • PRIMARY KEY
    • Adheres to NOT NULL and UNIQUE constraints together.
    • A primary key associate to only a single column in a table.
    • Composite key: a combination of two or more columns defined as the primary key; the combination uniquely identifies each row, while the columns taken individually do not guarantee uniqueness.
  • CHECK
    • Adheres to check constraint.
    • For example: Gender column in a Citizen table has conditional values: M/F only, if you enter other value ‘A’, it will throw error.
  • FOREIGN KEY
    • Adheres to referential integrity constraint.
    • It’s a connector key which helps to fetch additional records from same or other table.
    • Makes a relationship between two tables based on primary keys: the primary key of one table becomes the foreign key of the other.

Data Manipulation Language (DML)

 

ORDER BY

The ORDER BY keyword is used to sort the result-set in ascending or descending order.

SELECT * FROM Customers
ORDER BY Country;

GROUP BY

The GROUP BY statement is often used with aggregate functions (COUNT, MAX, MIN, SUM, AVG) to group the result-set by one or more columns.

SELECT COUNT(CustomerID), Country
FROM Customers
GROUP BY Country;

Indexes

  • Indexes are special lookup tables that the database search engine can use to speed up data retrieval.
  • Simply put, an index is a pointer to data in a table.
  • An index in a database is very similar to an index in the back of a book (see the demo below).
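A quick sqlite3 demo of the planner switching from a full scan to an index search; the table and column names are made up:

import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, country TEXT)")
conn.executemany("INSERT INTO customers (country) VALUES (?)",
                 [("US",), ("DE",), ("IN",)] * 1000)

# Without an index: SCAN (every row is examined).
print(conn.execute("EXPLAIN QUERY PLAN SELECT * FROM customers "
                   "WHERE country = 'DE'").fetchall())

conn.execute("CREATE INDEX idx_country ON customers (country)")

# With the index: SEARCH using idx_country (a pointer lookup, as described above).
print(conn.execute("EXPLAIN QUERY PLAN SELECT * FROM customers "
                   "WHERE country = 'DE'").fetchall())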

SQL Joins

Image result for sql joins

 

Normalization

  • Normalization is the process of structuring a relational database into normal forms (NF) in order to:
    • Reduce data redundancy (repetition)
    • Ensure data dependencies (data is logically stored)
    • Maintain data integrity
  • 1st NF
    • Rule 1: Single atomic value in each column
      • Example: FirstName column must have first name only, mustn’t have DoB value in it.
    • Rule 2: Each row must have a unique identifier
      • Example: Each row must have unique id to manage the data.
  • 2nd NF
    • Rule 1: 1st NF must be adhered.
    • Rule 2: No partial dependency. i.e., Every non-prime attribute of the table is functionally dependent on the whole of every primary key/ candidate key.
      • Functional Dependency is a relationship that exists when one attribute uniquely determines another attribute. If R is a relation with attributes X and Y, a functional dependency between the attributes is represented as X->Y, which specifies Y is functionally dependent on X.
      • Partial dependency implies is a situation where a non-prime attribute (an attribute that does not form part of the primary key/ candidate key) is functionally dependent to a part of a primary key/ candidate key.
        • Example: Product (Id, Name, Price)
          • Candidate Key: Id, Name
          • Non prime attribute: Price
          • The Price attribute depends only on the Id attribute, which is a subset of the candidate key, not the whole candidate key (Id, Name). This is called a partial dependency.
          • So we can say that Id-> Price is a partial dependency.
        • Solution: To remove the partial dependency, split the table: move the non-prime attribute causing the partial dependency to a separate table and associate it with the part of the key it depends on.
          • Product (Id, Name)
          • Price (Id, Price)
  • 3rd NF
    • Rule 1: 2nd NF must be adhered.
    • Rule 2: No Transitive Dependency.
      • Transitive Dependency is a functional dependency which holds by virtue of transitivity. A transitive dependency can occur only in a relation that has three or more attributes. Let A, B, and C designate three distinct attributes (or distinct collections of attributes) in the relation. Suppose all three of the following conditions hold:
        • A → B
        • It is not the case that B → A
        • B → C
        • Then the functional dependency A → C is a transitive dependency.
        • Example: BookAuthor (BookName, Genre, AuthorName, AuthorNationality)
          • {BookName} → {AuthorName}
          • {AuthorName} does not → {BookName}
          • {AuthorName} → {AuthorNationality}
          • Therefore {BookName} → {AuthorNationality} is a transitive dependency.
          • Transitive dependency occurred because a non-key attribute (AuthorName) was determining another non-key attribute (AuthorNationality).
          • Solution: To remove the transitive dependency, split the table: move the non-prime attributes involved to a separate table and associate them with a primary key.
            • Book (BookName, Genre, AuthorName)
            • Author (AuthorName, AuthorNationality)

[Diagram: normal forms in SQL]

SQL Performance Tips

  1. Check Indexes
    • Have indexes on all fields used in the WHERE and JOIN portions of the SQL statement.
    • More indexes could slow down write operations (such as INSERT / UPDATE statements).
  2. Avoid wrapping indexed columns with functions
    • SELECT count(*) FROM us_hotdog_purchases WHERE YEAR(purchase_time) = 2018
    • This function call prevents the database from using an index for the purchase_time column, because the index stores the value of purchase_time, not the return value of YEAR(purchase_time).
    • Rewrite it as a range condition instead: SELECT count(*) FROM us_hotdog_purchases WHERE purchase_time >= '2018-01-01' AND purchase_time < '2019-01-01'
  3. Avoid OR conditions
    • SELECT count(*) FROM fb_posts WHERE username = 'Mark' OR post_time > '2018-01-01'
    • Having an index on both the username and post_time columns might sound helpful, but in most cases the database won't use them, at least not in full. The reason is the connection between the two conditions – the OR operator – which forces the database to fetch the results of each part of the condition separately.
    • An alternative way to look at this query is to 'split' the OR condition and 'combine' the parts using a UNION clause. This allows the database to use the index on each condition separately and then combine the results:
      SELECT …
      FROM …
      WHERE username = 'Mark'
      UNION
      SELECT …
      FROM …
      WHERE post_time > '2018-01-01'
  4. Avoid sorting with a mixed order
    • Consider this query, which selects all posts from Facebook and sorts them by username in ascending order, and then by post time in descending order:
    • SELECT username, post_time FROM fb_posts ORDER BY username ASC, post_time DESC
    • Mixing ASC and DESC in one ORDER BY can prevent the database from using a composite index to deliver the rows already sorted, forcing an extra sort step.
  5. Avoid LIKE searches with prefix wildcards
    • SELECT * FROM fb_posts WHERE username LIKE '%Mar%'
    • A wildcard '%' at the beginning of the pattern prevents the database from using an index for this column's search.
  6. Limit Size of Your Working Data Set
    • A classic example is when a query initially worked well when there were only a few thousand rows in the table. As the application grew the query slowed down.
    • The solution may be as simple as restricting the query to looking at the current month’s data.
  7. Only Select Fields You Need
    • Extra fields often increase the size of the retrieved data.
    • Larger data retrieval increases network I/O and disk I/O.
  8. Remove Unnecessary Tables
    • Same as the reasons for removing fields not needed in the select statement.
  9. Remove OUTER JOINS
    • One solution is to remove OUTER JOINS by placing placeholder rows in both tables.
    • Say you have the following tables with an OUTER JOIN defined to ensure all data is returned:
CUSTOMER_ID  CUSTOMER_NAME
1            John Doe
2            Mary Jane
3            Peter Pan
4            Joe Soap

CUSTOMER_ID  SALES_PERSON
NULL         Newbee Smith
2            Oldie Jones
1            Another Oldie
NULL         Greenhorn
  • The solution is to add a placeholder row in the customer table and update all NULL values in the sales table to the placeholder key.
CUSTOMER_ID  CUSTOMER_NAME
0            NO CUSTOMER
1            John Doe
2            Mary Jane
3            Peter Pan
4            Joe Soap

CUSTOMER_ID  SALES_PERSON
0            Newbee Smith
2            Oldie Jones
1            Another Oldie
0            Greenhorn
  • Not only have you removed the need for an OUTER JOIN, you have also standardized how salespeople with no customers are represented. Other developers will not have to write statements such as ISNULL(customer_id, 'No customer yet').

 

Java Quickie

JVM, JRE and JDK

  • JVM is the java virtual machine that runs the java byte code.
  • JVM is hardware platform specific (mac, windows, linux) and byte codes are the machine language of JVM.
  • JRE is java runtime environment that is sufficient to run the program.
  • JRE = JVM + library files.
  • JDK is java development kit, required to write, compile and run a program.
  • JDK = JRE + tools needed to develop a java program.

Memory Allocation

  • Each time an object is created in Java, it is stored in heap memory.
  • Primitive and local variables are stored on the stack; member variables (inside objects) live on the heap.
  • In multithreading, each thread has its own stack but all threads share the same heap.
  • A method's locals are pushed onto the stack when the method is invoked, and the stack frame is popped when the call completes.
  • A 32-bit JVM can't use more than 4 GB of RAM for a Java application; a 64-bit JVM uses more memory for the same object, almost twice in some cases.
  • A primitive int uses about four times less memory than an Integer object.

The below table gives an idea of various datatypes and range of values it can hold.

[Table: Java datatypes and their value ranges]

OOPS

  • Class: A class is a blueprint or prototype that defines the variables and the methods. For example:
Class: Car
Data members (fields): color, type, model, etc.
Methods: stop, accelerate, cruise.
  • Object: An object is a realization of a class. Like in the above example MyCar is an object of the class Car.
  • Variable: can be local, instance and static.
    • Local variables are declared inside the body of a method.
    • Instance variables are declared outside method. They are object specific.
    • Static variables are initialized only once and at the start of program execution. Static variables are initialized first.
  • Method: Methods are the various functionalities; a method is nothing but a set of code that is referred to by name and can be called (invoked) at any point in a program. You can pass multiple values to a method, and it can return a value.
  • Package: A Package is a collection of related classes. It helps organize classes into a folder structure and make it easy to locate and reuse them.
package com.example;
class Car {
    String color = "black"; //instance variable  
    static int i = 0; // static variable 
    void accelerate() { 
        int speed = 90; //local variable
    }
}

OOP Principles

1. Abstraction

  • Abstraction is hiding internal details and showing only the details relevant to the object's users.

2. Encapsulation

  • Encapsulation is wrapping data (variables) and functionality (methods) together as a single unit called a “class”, and restricting direct access to that data via access modifiers.

3. Polymorphism

  • Polymorphism is the ability to take more than one form.
  • Two types:
    • Compile-time polymorphism: aka method overloading (the binding is known at compile time).
    • Runtime polymorphism: aka method overriding (the binding is known at runtime). See the sketch below.
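
A minimal sketch of both kinds (class names are illustrative):

class Shape {
    double area() { return 0; }                           // overridden below
    static int add(int a, int b) { return a + b; }        // overloaded:
    static double add(double a, double b) { return a + b; }
}

class Circle extends Shape {
    double r = 2;
    @Override
    double area() { return Math.PI * r * r; }             // runtime polymorphism
}

public class PolymorphismDemo {
    public static void main(String[] args) {
        Shape s = new Circle();                           // static type is Shape
        System.out.println(s.area());                     // Circle.area() runs at runtime
        System.out.println(Shape.add(1, 2));              // int overload picked at compile time
        System.out.println(Shape.add(1.0, 2.0));          // double overload
    }
}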

4. Inheritance

Inheritance is a mechanism in which one class acquires the property of another class. For example, a child inherits the traits of their parents.

5. Composition

  • Based on an “OWNS” relationship.
  • For example: a Car “OWNS” its wheels; the owned part can't exist independently of the owner.
  • Composition is typically achieved by creating and holding the owned object as a field inside the owner.

6. Aggregation

  • Based on a “HAS-A” relationship.
  • For example: Company and Employees. If the company goes away, the employees can still exist. A Company “HAS” Employees.
  • Aggregation is typically achieved by holding references to objects that are created elsewhere and passed in.

7. Association

  • Relationship between objects.
  • Includes both Composition and Aggregation.
  • Both are “has-a” relationships implemented by holding object references; they differ in ownership and lifecycle coupling (see the sketch below).
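
A minimal sketch contrasting the two (class names are illustrative):

import java.util.List;

class Engine { }

class Car {
    // Composition: the Car creates and owns its Engine; their lifecycles are coupled.
    private final Engine engine = new Engine();
}

class Employee { }

class Company {
    // Aggregation: Employees are created elsewhere and can outlive the Company.
    private final List<Employee> employees;
    Company(List<Employee> employees) { this.employees = employees; }
}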

Class Loading

  • The class containing the static main() method is loaded first; subsequent classes are loaded as they are referenced.
  • Two types:
    • Static Class Loading:
      • Classes are statically loaded using the new operator.
      • NoClassDefFoundError is thrown if a class reference is not found during static class loading.
    • Dynamic Class Loading:
      • Dynamic class loading is programmatically loading a class at runtime.
      • E.g.: Class.forName(String className). ClassNotFoundException is thrown for dynamic class loading. See the sketch below.
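
A minimal sketch of dynamic class loading (the class name is just an example):

public class DynamicLoad {
    public static void main(String[] args) {
        try {
            // Load a class by name at runtime and instantiate it reflectively.
            Class<?> c = Class.forName("java.util.ArrayList");
            Object list = c.getDeclaredConstructor().newInstance();
            System.out.println(list.getClass().getName()); // java.util.ArrayList
        } catch (ClassNotFoundException e) {
            // Thrown when the named class is not on the classpath.
            e.printStackTrace();
        } catch (ReflectiveOperationException e) {
            e.printStackTrace();
        }
    }
}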

Abstract Class

  • An abstract class can have both executable (concrete) methods and abstract methods.
  • An abstract class cannot be instantiated; it can only be subclassed.
  • All abstract methods must be implemented in the subclass, otherwise the subclass must also be declared abstract.

Interface

  • An interface has no implementation code and all its methods are abstract, i.e. all methods are only declared and none are defined (before Java 8 introduced default and static methods).
  • An interface cannot be instantiated; it can only be implemented by classes or extended by other interfaces.
  • Interface variables are implicitly static and final; interface methods are public and abstract by default.
  • A class can implement any number of interfaces, but can extend only one abstract class (see the sketch below).
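
A minimal sketch combining both (names are illustrative):

abstract class Vehicle {
    abstract void start();                                  // abstract: no body
    void stop() { System.out.println("stopped"); }          // concrete method
}

interface Driveable {
    int MAX_SPEED = 120;                                    // implicitly public static final
    void drive();                                           // implicitly public abstract
}

class SportsCar extends Vehicle implements Driveable {
    @Override void start() { System.out.println("started"); }
    @Override public void drive() { System.out.println("driving"); }
}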

Constructor

  • The sole purpose of constructors is to create an instance of a class. They are invoked while creating an object of the class.
  • Java provides a default no-argument constructor only if no constructor is defined; once a constructor with arguments has been defined, you have to write the no-argument one yourself if you still need it.
  • A constructor has the same name as the class.
  • One constructor can be called from another using this(...), where “this” means the current object. See the sketch below.
  • Private constructor:
    • Prevent class from being explicitly instantiated.
    • Object can be constructed, but internally.
    • Used for singleton.
  • Question: Can constructors be synchronized in Java?
    • No. Java doesn't allow multi-threaded access to an object while it is being constructed, so synchronization is not needed.
  • Question: Are constructors inherited? Can a subclass call the parent class's constructor?
    • Constructors are not inherited.
    • With the “super” keyword we can call the parent class's constructor.
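
A minimal sketch of constructor chaining with this() and super() (names are illustrative):

class Person {
    String name;
    Person(String name) { this.name = name; }
}

class Student extends Person {
    int year;
    Student(String name) {
        this(name, 1);          // this(): chain to the other constructor below
    }
    Student(String name, int year) {
        super(name);            // super(): call the parent class's constructor
        this.year = year;
    }
}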

Static

  • Static variables or method belong to the class.
  • Only have one copy i.e. when we want to create variable or method that is shared by all objects of the class.
  • Static is used for variables, methods and block.
  • Static variables and methods are initialized once, before any instance variables.
  • Static method can access only static data.
  • Static method cannot refer “this” or “super”.
  • Static method can only call other static methods.
  • “main()” method is static, because it must be accessible for an application to run before any instantiation takes place.
  • A constructor cannot be static, because the compiler would treat it as an ordinary method.
  • A constructor initializes a new object, whereas static members belong to the class rather than to any object.
  • Static variables and static blocks are executed in the order they appear in the source; static methods run only when called. See the initialization-order sketch below.
  • Hierarchy is:
    • Static parent → Static child → Instance parent → Constructor parent → Instance child → Constructor child.
  • When you “override” a static method, the compiler doesn't give any error and the code runs fine, but it's not overriding; it is called hiding. You don't get the benefits of runtime polymorphism, because static members are associated with the class and not with objects.
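
A minimal sketch showing the initialization order described above:

class InitOrder {
    static { System.out.println("static block"); }   // once, when the class is loaded
    { System.out.println("instance block"); }        // per object, before the constructor
    InitOrder() { System.out.println("constructor"); }

    public static void main(String[] args) {
        new InitOrder(); // prints: static block, instance block, constructor
        new InitOrder(); // prints: instance block, constructor (static already ran)
    }
}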

Final, Finalize and Finally

  • Final:
    • The final keyword is used when we don't want a value to change.
    • Final class cannot be extended.
    • Final method cannot be overridden.
    • Final variables are equivalent to constants.
  • Finally:
    • The finally block is executed in all cases of a try-catch block and is used to release system resources like connections, statements, etc. We will discuss try, catch and finally blocks in detail.
  • Finalize:
    • The finalize() method helps garbage collection; it is invoked by the garbage collector before an object is discarded. Overriding finalize() doesn't guarantee that GC will run (and the method is deprecated in modern Java versions).

[Diagram: Java exception hierarchy]

Object Class

  • Every class has Object as super class.
  • It has the following non-final methods:
    • equals()
    • hashCode()
    • toString()
    • clone()
    • finalize()
  • It has the following final methods:
    • wait()
    • notify()
    • notifyAll()
    • getClass()

Equals and HashCode

  • equals() must check equality of objects with respect to the unique contents of the objects.
  • hashCode() must be computed from the same contents used by equals().
  • Their contract:
    • Two equal objects must have the same hash code (two objects with the same hash code need not be equal). See the sketch below.
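
A minimal sketch of a class honoring the contract (names are illustrative):

import java.util.Objects;

class Point {
    final int x, y;
    Point(int x, int y) { this.x = x; this.y = y; }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof Point)) return false;
        Point p = (Point) o;
        return x == p.x && y == p.y;     // compare significant contents
    }

    @Override
    public int hashCode() {
        return Objects.hash(x, y);       // derived from the same contents
    }
}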

Clone

  • Clone method is used to copy an object.
  • Clone method has protected access modifier.
  • To call the clone method, the object's class must implement the Cloneable interface, else it will throw CloneNotSupportedException.
  • Cloneable is a marker interface, i.e. an interface with no methods defined. It just signals that instances need to be treated differently.
  • The advantage of having Cloneable is that we can clone only those objects that allow us to clone.
  • Shallow Copy
    • Field values are copied as-is, so any referenced objects are shared between the original and the copy.
  • Deep Copy
    • A new object is created, and new memory is dynamically allocated for the referenced objects as well. See the sketch below.
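
A minimal sketch of shallow vs deep copies via clone() (class names are illustrative):

class Address implements Cloneable {
    String city;
    Address(String city) { this.city = city; }
    @Override
    protected Address clone() throws CloneNotSupportedException {
        return (Address) super.clone();
    }
}

class Person implements Cloneable {
    String name;
    Address address;
    Person(String name, Address address) { this.name = name; this.address = address; }

    // Shallow copy: the Address reference is shared with the original.
    @Override
    protected Person clone() throws CloneNotSupportedException {
        return (Person) super.clone();
    }

    // Deep copy: the Address is cloned too, so the two objects are independent.
    Person deepClone() throws CloneNotSupportedException {
        Person copy = (Person) super.clone();
        copy.address = address.clone();
        return copy;
    }
}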

Primitive and Wrapper Type

  • A variable of a primitive type directly contains the value of that type.
    • Java has eight primitive types: byte, short, int, long, char, boolean, float and double.
  • A wrapper class is a class whose object wraps (contains) a primitive data type.
    • When we create an object of a wrapper class, it contains a field in which we can store the primitive value, along with various supporting operational methods.
    • It is slower to use the object wrappers for primitives than to use the primitives directly.
    • Each of Java's eight primitive data types has a class dedicated to it: Byte, Short, Integer, Long, Character, Boolean, Float and Double.

Autoboxing and Unboxing

  • Since Java 1.5 the compiler provides automatic conversion of a primitive type to its wrapper type, known as autoboxing; the reverse is unboxing.
  • The compiler internally uses valueOf() and the corresponding xxxValue() methods (e.g. intValue()) for this.

Casting

  • Assigning a value of one primitive type to another primitive type is casting.
    • Widening order: byte → short → int → long → float → double
  • Upcasting (widening) is implicit:
    • int i = 5; long j = i;
  • Downcasting (narrowing) needs an explicit cast:
    • long j = 5;
    • int i = j; // WRONG: compile-time error (possible lossy conversion)
    • int i = (int) j; // explicit cast required
  • int to String casting is not possible; use a conversion such as String.valueOf(i).

Immutability

  • A class whose objects' state cannot be changed once instantiated is called immutable.
  • String, all the wrapper classes and enums are examples of immutable types.
  • Immutable classes are inherently thread-safe.
  • This is how we can make a class immutable (see the sketch below):
    • Ensure the class cannot be extended: make the class final.
    • Make all fields private (and final).
    • No setter methods: do not provide any method that can change the state of the object.
    • Use defensive copies (or clone) for mutable fields.
  • BigDecimal is technically not immutable, because it is not a final class and a subclass could break its immutability.
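
A minimal sketch following the rules above (names are illustrative):

import java.util.Date;

public final class Employee {                         // final: cannot be extended
    private final String name;                        // private final fields
    private final Date hireDate;

    public Employee(String name, Date hireDate) {
        this.name = name;
        this.hireDate = new Date(hireDate.getTime()); // defensive copy in
    }

    public String getName() { return name; }

    public Date getHireDate() {
        return new Date(hireDate.getTime());          // defensive copy out
    }
    // No setters: state cannot change after construction.
}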

String, StringBuffer and StringBuilder

  • String is immutable; we cannot modify a String object.
    • Every time we assign a new value, a new String object is created (on the heap) and the reference points to the new object.
  • String pool (String intern pool) is a special storage area in Java heap.
    • When a string is created and if the string already exists in the pool, the reference of the existing string will be returned, instead of creating a new object and returning its reference.
    • intern() returns the canonical (pooled) reference of a String.
      • So S1.intern() == S2.intern(), only if S1.equals(S2) is true.
  • Question: Why is String immutable in Java?
    • String Pool: When a string is created and the string already exists in the pool, the reference of the existing string is returned instead of creating a new object. If String were not immutable, changing the string through one reference would lead to the wrong value for the other references.
    • To cache its hashcode: If String were not immutable, its hashcode could change, making it unfit for caching.
    • Security: String is widely used as a parameter in many Java classes, e.g. for network connections, opening files, etc. Making it mutable would pose threats due to interception by other code segments.
  • String Comparison:
String a = "abcd";
String b = "abcd";
System.out.println(a == b); // True because of the String intern pool.
System.out.println(a.equals(b)); // True
String c = new String("abcd");
String d = new String("abcd");
System.out.println(c == d); // False
System.out.println(c.equals(d)); // True
  • == checks the memory locations, whereas equals() checks value.
  • Using String constructor creates an extra unnecessary String object in the memory. Therefore, double quotes should be used if you just need to create a String, which will be interned internally.
  • String provides several methods concat(), trim(), substring() and replace().
  • StringBuffer is mutable and synchronized.
  • StringBuilder is mutable and non-synchronized.
  • equals() and hashCode() contract doesn’t work for StringBuffer and StringBuilder because objects are mutable in nature.

Serialization

  • The process of saving the state of an object as a sequence of bytes and rebuilding it back from those bytes is called serialization.
  • It is done by implementing the Serializable interface, which is a marker interface.
  • Fields marked as transient are not serialized.
  • serialVersionUID adds a version number to make sure the serialized object's class has not changed by the time it is deserialized.
    • Customization hooks: writeObject() and readObject().
  • Externalization does custom serialization, where we can decide what to store in the stream. (Plain serialization is sketched below.)
    • Two methods: readExternal() and writeExternal().
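
A minimal round-trip sketch (the User class is illustrative):

import java.io.*;

class User implements Serializable {
    private static final long serialVersionUID = 1L;  // versions the class
    String name;
    transient String password;                        // not serialized
    User(String name, String password) { this.name = name; this.password = password; }
}

public class SerializationDemo {
    public static void main(String[] args) throws Exception {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(new User("alice", "secret"));    // serialize
        }
        try (ObjectInputStream in = new ObjectInputStream(
                new ByteArrayInputStream(bytes.toByteArray()))) {
            User u = (User) in.readObject();                 // deserialize
            System.out.println(u.name + " / " + u.password); // alice / null
        }
    }
}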

Comparator and Comparable

  • The Comparable interface allows an object to compare itself with another object; it uses the compareTo() method.
  • The Comparator interface is used to compare two different objects; it uses the compare() method.
  • Method syntax:
compareTo(T obj)
compare(T obj1, T obj2)
  • Comparator gives more control.
  • Collections.sort(list) for Comparable, and Collections.sort(list, comparator) for Comparator.
  • Comparable is implemented by a class so that its objects can be compared with other objects of the same type:
class HDTV implements Comparable<HDTV> {
    private int size;
    private String brand;

    public HDTV(int size, String brand) {
        this.size = size;
        this.brand = brand;
    }

    public int getSize() { return size; }
    public String getBrand() { return brand; }

    @Override
    public int compareTo(HDTV tv) {
        if (this.getSize() > tv.getSize())
            return 1;
        else if (this.getSize() < tv.getSize())
            return -1;
        else
            return 0;
    }
}

public class Main {
    public static void main(String[] args) {
        HDTV tv1 = new HDTV(55, "Samsung");
        HDTV tv2 = new HDTV(60, "Sony");
        if (tv1.compareTo(tv2) > 0) {
            System.out.println(tv1.getBrand() + " is better.");
        } else {
            System.out.println(tv2.getBrand() + " is better.");
        }
    }
}
  • In some situations, you may not want to change a class and make it comparable.
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

class SizeComparator implements Comparator<HDTV> {
    @Override
    public int compare(HDTV tv1, HDTV tv2) {
        int tv1Size = tv1.getSize();
        int tv2Size = tv2.getSize();
        if (tv1Size > tv2Size) {
            return 1;
        } else if (tv1Size < tv2Size) {
            return -1;
        } else {
            return 0;
        }
    }
}

public class Main {
    public static void main(String[] args) {
        HDTV tv1 = new HDTV(55, "Samsung");
        HDTV tv2 = new HDTV(60, "Sony");
        HDTV tv3 = new HDTV(42, "Panasonic");
        List<HDTV> al = new ArrayList<>();
        al.add(tv1);
        al.add(tv2);
        al.add(tv3);
        Collections.sort(al, new SizeComparator());
        for (HDTV a : al) {
            System.out.println(a.getBrand());
        }
    }
}

 

Collections

Collection is any data structure where objects are stored and iterated over. Data Structure in itself is a huge topic and I will try to cover it in a separate series. Here, we will try to cover some basic data structures in java collection libraries.

  • Collection is an interface from which set, list and queue extends.
  • Collections Class holds static utility methods to use with collections.
  • Arrays are the fastest compared to ArrayList or Vector and are preferable when we know the size upfront, since arrays cannot grow the way lists do.
  • ArrayList and Vector are specialized data structures that use an array internally plus convenience methods like add(), remove(), etc., so they can grow and shrink whenever required.
  • ArrayList supports index-based search with the indexOf() and lastIndexOf() methods.
  • Vector is synchronized and thread-safe, but it's usually better to use ArrayList plus the code below for synchronization:
List mylist = Collections.synchronizedList(mylist); 
// Single lock for the entire list
  • Iterator interface is used to cycle through a collection in forward direction only.
  • ListIterator extends Iterator and allow bidirectional traversing.

Collection classes are fail-fast, i.e. if one thread changes a collection while another is traversing it, a ConcurrentModificationException is thrown. Even if we use synchronizedList/synchronizedMap, this exception is still thrown, because those wrappers are only conditionally thread-safe: individual operations are thread-safe but compound operations are not. So either a synchronized block can be used, or ConcurrentHashMap or CopyOnWriteArrayList. ConcurrentHashMap, CopyOnWriteArrayList and CopyOnWriteArraySet are thread-safe.

  • HashMap works on principle of Hashing.
  • Hashing, in its simplest form, is a way of assigning a unique code to any variable/object after applying a formula/algorithm to its properties.
  • HashMap has an inner class Entry to store key and value mapping.

A hash value is calculated from the key's hash code by calling its hashCode() method. This hash value is used to calculate the index in the array for storing the Entry object. The JDK designers assumed that there might be some poorly written hashCode() implementations that return very high or low hash code values, so there is another hash() function that takes the object's hash code and brings the hash value into the range of the array's index size.
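
A simplified, illustrative sketch of this bucket-index computation (not the exact JDK code):

public class BucketIndexDemo {
    // Spread high bits into low bits so poor hashCode() implementations
    // still distribute across buckets (similar in spirit to the JDK's hash()).
    static int spread(int h) {
        return h ^ (h >>> 16);
    }

    public static void main(String[] args) {
        int capacity = 16;                                    // table size: a power of two
        String key = "hello";
        int index = spread(key.hashCode()) & (capacity - 1);  // bucket index
        System.out.println("bucket = " + index);
    }
}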

  • MapReduce has two key components. Map and Reduce. A map is a function which is used on a set of input values and calculates a set of key/value pairs. Reduce is a function which takes these results and applies another function to the result of the map function.
  • It's a good practice to use Collections.EMPTY_LIST/EMPTY_SET instead of null. E.g.
List testList = Collections.EMPTY_LIST;
  • Best practice to use immutable object as keys to Map.

Arrays.asList(T… a) returns the java.util.Arrays.ArrayList and not java.util.ArrayList. It's just a fixed-size view of the original array.

  • Comparing ArrayLists:
Collection<String> listOne = new ArrayList<>(Arrays.asList("a", "b", "c", "g"));
Collection<String> listTwo = new ArrayList<>(Arrays.asList("a", "b", "d", "e"));
List<String> sourceList = new ArrayList<>(listOne);
List<String> destinationList = new ArrayList<>(listTwo);
sourceList.removeAll(listTwo); // Result: [c, g]
destinationList.removeAll(listOne); // Result: [d, e]
  • Difference between ITERATOR and ENUMERATION Interface: Iterator actually adds one method that Enumeration doesn’t have: remove(). Iterators allow the caller to remove elements from the underlying collection during the iteration with well-defined semantics.

Question: How can we reverse the order in the TreeMap?

Using Collections.reverseOrder()

Map tree = new TreeMap(Collections.reverseOrder());

Question: Can we add heterogeneous elements into TreeMap?

No. Sorted collections don't allow the addition of heterogeneous elements, as they are not mutually comparable. It's okay if their classes implement a common Comparable.

Question: Difference between int[] x; and int x[]

No Difference. Both are the acceptable ways to declare an array.

  • Memory overhead hierarchy:

ArrayList < LinkedList < HashTable < HashMap < HashSet

  • contains() on lists uses linear search.
  • HashMap allows one null key and multiple null values; Hashtable doesn't allow null keys or values.
  • LinkedHashMap maintains insertion (or access) order and can be used to build an LRU cache (see the sketch below).
  • ConcurrentHashMap is better than Hashtable because it locks only a portion of the map per operation instead of the whole map.
  • A HashMap can be sorted by keys using a TreeMap, or by values by sorting its entry set with a Comparator.
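
A minimal LRU-cache sketch using LinkedHashMap's access-order mode:

import java.util.LinkedHashMap;
import java.util.Map;

class LruCache<K, V> extends LinkedHashMap<K, V> {
    private final int capacity;

    LruCache(int capacity) {
        super(16, 0.75f, true);       // accessOrder = true: get() moves entries to the end
        this.capacity = capacity;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > capacity;     // evict the least-recently-used entry
    }
}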

REST API Design

  • REST (REpresentational State Transfer) is basically an architectural style of development having some principles.
    • It should be stateless
    • It should access all the resources from the server using only URI
    • It does not have inbuilt encryption
    • It does not have session
    • It uses one and only one protocol that is HTTP
    • For performing CRUD operations, it should use HTTP verbs such as get, post, put and delete
    • It should return the result only in lightweight data formats such as JSON, XML, Atom, OData, etc.
  • REST based services follow some of the above principles and not all.
  • RESTFUL services means it follows all the above principles.
    • It is similar to the distinction between:
      • Object-oriented languages, which support all OOP concepts.
        • Examples: Java, C++
      • Object-based languages, which support only some OOP concepts.
        • Examples: JavaScript, VB

APIs are interfaces that allow services to expose their endpoints for other services to interoperate with. There are no backend calls to a service's resources (such as accessing the DB directly).

Design Considerations

  • Determine data exchange format (JSON or XML)
  • Use noun (eg: /products) for endpoint and verb (see HTTP methods) for doing operations.
  • Use HTTP methods
    • GET
    • POST
    • PUT is idempotent (you provide the complete input entity in the request, and multiple identical requests have the same effect)
    • PATCH is non-idempotent in general (you provide only the particular part of the input entity to add, update or delete, and multiple identical requests may have different effects)
    • DELETE
  • Use appropriate HTTP status codes
    • 1xx – Informational response
    • 2xx – Success
    • 3xx – Redirection
    • 4xx – Client error
    • 5xx – Server error
  • Provide versioning with backward compatibility
  • Use query and filter
    • Eg: Use GET query-params to query (like /products?name=ABC) and filter (/products?category=clothing&sex=male) the endpoint.
  • Use pagination (limit and offset query-params)
  • Provide appropriate error message with reason for failure.

Idempotency

  • Idempotent means repeatable: you repeat the request and it produces the same result.
    • GET, PUT, DELETE (every call of the same request returns the same response).
    • GET, PUT, DELETE ops provide the same result irrespective of the number of requests, until something changes.
  • Non-idempotent means non-repeatable: you repeat the request and it produces a different result.
    • POST (every call of the same request returns a different response).
    • Every POST request actually creates a new entry in the backend.
  • PATCH is neither inherently idempotent nor non-idempotent:
    • Based on the operation, it becomes idempotent or non-idempotent.
    • Idempotent scenario: a PATCH request with appId (to identify a resource) and partnerProviderID (to change the value) results in a change of partnerProviderID. Doing this op multiple times produces the same result (the same partnerProviderID value).
    • Non-idempotent scenario: a PATCH request with appId (to identify a resource and also to change its value) changes the appId the first time, but consecutive requests fail (because they can no longer identify the app). Doing this op multiple times produces different results (the first request changes the appId and the second request fails). An idempotent PUT is sketched below.
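
A minimal sketch of issuing an idempotent PUT with Java's built-in HttpClient (the endpoint URL and JSON body are hypothetical):

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class PutDemo {
    public static void main(String[] args) throws Exception {
        HttpClient client = HttpClient.newHttpClient();
        // PUT replaces the whole resource; repeating the call leaves the same state.
        HttpRequest put = HttpRequest.newBuilder()
                .uri(URI.create("https://api.example.com/products/42"))
                .header("Content-Type", "application/json")
                .PUT(HttpRequest.BodyPublishers.ofString("{\"name\":\"ABC\",\"price\":10}"))
                .build();
        HttpResponse<String> resp = client.send(put, HttpResponse.BodyHandlers.ofString());
        System.out.println(resp.statusCode());
    }
}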

CASB

Introduction

  • A Cloud Access Security Broker (CASB) is an on-premises or cloud-based security policy enforcement point (PEP) placed between users and cloud service providers (CSPs) to combine and interject enterprise security policies as cloud-based resources are accessed.

Four Pillars of CASB

Visibility

  • Visibility allows the enterprise to ask the following questions and provides a consolidated view of the answers:
    • Who is accessing what?
    • How often, when and where are these services being accessed by employees?
    • What data is flowing where?
  • Answers to these questions give the enterprise insight into the nature of cloud-service usage by its employees.
  • Example: CASB should be able to tell you that “Steve” is simultaneously attempting to log into Salesforce from San Francisco and into Box from New York – an indicator of a potential credential compromise.

Compliance

  • Compliance allows the enterprise to know whether it follows internal and external regulations/policies at all times. Focuses on regulations such as HIPAA, PCI, etc.
  • Example: CASB can provide logs for audit purposes, can encrypt sensitive data-at-rest to protect against breach, and can enforce data leakage prevention policies to control access to regulated data.

Data Security/ Data Loss Prevention

  • Data Security allows the enterprise to enforce data-centric security policies to prevent unwanted access to data. Also covers Data Loss Prevention (DLP).
  • Example: a policy stating that all data flowing between the user and the cloud service is always encrypted in transit.

Threat Prevention/ Access Control

  • Access Control allows the enterprise to allow/deny user access to cloud services based on various criteria (user behavior, device type, location, risk score, etc.). Also covers malware detection and prevention.
  • Example: A sales rep that normally logs into Salesforce and updates some data in his accounts, but then one day logs in and attempts to download the entire company contact database to his BYOD device – a CASB should be able to thwart such risky activity in real-time.

[Diagram: CASB overview]

 

Integration Modes

Stage 1 – Passive/ Non-Intrusive

[Diagram: CASB passive/non-intrusive integration]

Stage 2 – Active/ In-Line

[Diagram: CASB active/in-line integration]

 

Credits: https://www.youtube.com/channel/UC3jldIfC834kazgMlKX4xEA/videos

OWASP

What is OWASP?

  • The Open Web Application Security Project (OWASP) is a non-profit organization dedicated to providing unbiased, practical information about application security.

Some definitions to know

  • Risks: the “risk” of you being involved in an accident on the road.
  • Threats: the “threats” are what could cause you to be involved in the accident.
  • Vulnerabilities: you are the one “vulnerable” to being involved in the accident.

How to avoid these?

  1. Tools and processes
    • These enable developers to find and fix vulnerabilities while they are coding.
  2. Software composition analysis (SCA) 
    • SCA is the process of automating visibility into open source software (OSS) use for the purposes of risk management, security and license compliance.
    • Because the majority of software includes OSS, manual tracking is difficult, requiring automation to scan source code, binaries and dependencies.
  3. Dynamic analysis 
    • Dynamic analysis is the testing and evaluation of a program by executing it in real time.
    • The objective is to find errors in a program while it is running, rather than by repeatedly examining the code offline. A daily build and smoke test is one type of dynamic analysis.
    • This kind of testing is called Dynamic Application Security Testing (DAST).
  4. Static analysis
    • Static analysis is a method of computer program debugging done by examining the code without executing the program.
    • This kind of testing is called Static Application Security Testing (SAST).

What is OWASP Top 10?

  • The OWASP Top 10 Web Application Security Risks (updated periodically) provide guidance to developers and security professionals on the most critical vulnerabilities commonly found in web applications, which are also easy to exploit.
  • These 10 application risks are dangerous because they may allow attackers to plant malware, steal data, or completely take over your computers or web servers.

The following identifies each of the OWASP Top 10 Web Application Security Risks.

1. Injection

Injection flaws, such as SQL injection, LDAP injection, and CRLF injection, occur when an attacker sends untrusted data to an interpreter, where it is executed as part of a command or query without proper validation.

  • Example: SQL injection
    • Java Code:
      • String userId = request.getParameter("UserId");
      • String query = "SELECT * FROM Users WHERE UserId = " + userId;
      • Statement st = conn.createStatement();
      • ResultSet res = st.executeQuery(query);
    • Input:
      • UserId = 105 OR 1 = 1
    • SQL to-be executed:
      • SELECT * FROM Users WHERE UserId = 105 OR 1 = 1
    • Output:
      • The above SQL is valid and will return ALL rows from the “Users” table, since “OR 1 = 1” is always TRUE.
  • Prevention
    • Use parameterized SQL queries (i.e., PreparedStatement) instead of concatenated SQL statements. Modify the query to have a question mark (?) instead of concatenating the userId directly, then bind the supplied userId to that parameter.
    • Java Code:
      • String query = "SELECT * FROM Users WHERE UserId = ?";
      • PreparedStatement st = conn.prepareStatement(query);
      • st.setString(1, userId);
      • ResultSet res = st.executeQuery();
    • Input:
      • UserId = 105 OR 1 = 1
    • SQL to-be executed:
      • SELECT * FROM Users WHERE UserId = '105 OR 1 = 1' (the whole input is bound as a single value)
    • Output:
      • Hacking attempts such as “105 OR 1 = 1” now return no results, since there is no record for UserId = “105 OR 1 = 1”.

2. Broken Authentication

Attackers have access to hundreds of millions of valid username and password combinations for credential stuffing, default administrative account lists, automated brute force, and dictionary attack tools. Session management attacks are well understood, particularly in relation to unexpired session tokens.

  • Example
    • Credential stuffing, the use of lists of known passwords, is a common attack. If an application does not implement automated-threat or credential-stuffing protections, it will let an attacker in with stolen credentials.
  • Prevention
    • Multi-factor authentication, such as FIDO or dedicated apps, reduces the risk of compromised accounts.
    • Enable a strong password policy and enforce periodic password changes.
    • Change default administrative accounts.
    • Enforce session time-outs and idle time-outs properly.
    • Implement proper session management for accounts logged in to your web server.

3. Sensitive Data Exposure

Applications and APIs that don’t properly protect sensitive data such as financial data, usernames and passwords, or health information, could enable attackers to access such information to commit fraud or steal identities.

  • Example
    • An application encrypts credit card numbers in a database using automatic database encryption. However, this data is automatically decrypted when retrieved, allowing an SQL injection flaw to retrieve credit card numbers in clear text.
    • Storing passwords without salting, i.e. keeping them as-is in the backend.
  • Prevention
    • Encryption of data at rest and in transit.
    • Employ strong encryption/ hashing in your crypto-system.
    • Classify the data and apply controls.

4. XML External Entity (XXE)

Attacker can exploit vulnerable XML processors if they can upload XML or include hostile content in an XML document, exploiting vulnerable code, dependencies or integration. Poorly configured XML processors evaluate external entity references within XML documents. Attackers can use external entities for attacks including remote code execution, and to disclose internal files and SMB file shares.

  • Example
    • Disclosing /etc/passwd or other targeted files.
 <?xml version="1.0" encoding="ISO-8859-1"?>
 <!DOCTYPE foo [  
   <!ELEMENT foo ANY >
   <!ENTITY xxe SYSTEM "file:///etc/passwd" >]><foo>&xxe;</foo>

 <?xml version="1.0" encoding="ISO-8859-1"?>
 <!DOCTYPE foo [  
   <!ELEMENT foo ANY >
   <!ENTITY xxe SYSTEM "file:///c:/boot.ini" >]><foo>&xxe;</foo>
  • Prevention
    • Disable XML external entity and DTD processing in all XML parsers.
    • Sanitize the XML input.
    • Use less complex data formats such as JSON and avoid serialization of sensitive data.

5. Broken Access/ Authorization Control

Improperly configured or missing restrictions on authenticated users allow them to access unauthorized functionality or data, such as accessing other users’ accounts, viewing sensitive documents, and modifying data and access rights.

  • Example
    • Accessing some data the user is not supposed to access:
    • https://bank/balance?acc=123
      • Changing the value from 123 to 345: will that allow access to see another user's account (345)?
  • Prevention
    • Deny everyone by default and grant specific user roles access to each function as needed. It is recommended to log failed attempts to access features to make sure everything is configured correctly.

6. Security Misconfiguration

This risk refers to improper implementation of controls intended to keep application data safe, such as misconfiguration of security headers, error messages containing sensitive information (information leakage), and not patching or upgrading systems, frameworks, and components.

  • Example
    • The application server comes with sample applications that are not removed from the production server where attacker can play around with default credentials.
    • The application server’s configuration allows detailed error messages, e.g. stack traces, to be returned to users. This potentially exposes sensitive information.
  • Prevention
    • Lock down all defaults and unused features.
    • Patch the environments as required.

7. Cross-Site Scripting

Cross-site scripting (XSS) flaws give attackers the capability to inject client-side scripts into the application, for example, to redirect users to malicious websites. XSS enables attackers to inject client-side scripts into web pages viewed by other users. There are 3 types:

  1. Persistent/ Stored XSS, where the malicious input originates from the website’s database.
  2. Reflected XSS, where the malicious input originates from the victim’s request.
  3. DOM-based XSS, where the vulnerability is in the client-side code rather than the server-side code.
  • Example
    • Persistent/ Stored XSS
      • The attacker injects a payload in the website’s database by submitting a vulnerable form with some malicious JavaScript.
      • The victim requests the web page from the website
      • The website serves the victim’s browser the page with the attacker’s payload as part of the HTML body.
      • The victim’s browser will execute the malicious script inside the HTML body. In this case it would send the victim’s cookie to the attacker’s server. The attacker now simply needs to extract the victim’s cookie when the HTTP request arrives to the server, after which the attacker can use the victim’s stolen cookie for impersonation.

[Diagram: persistent XSS attack]

  • Type 2: Reflected XSS
    • In a reflected XSS attack, the malicious string is part of the victim’s request to the website. The website then includes this malicious string in the response sent back to the user.
    • The attacker crafts a URL containing a malicious string and sends it to the victim.
    • The victim is tricked by the attacker into requesting the URL from the website.
    • The website includes the malicious string from the URL in the response.
    • The victim’s browser executes the malicious script inside the response, sending the victim’s cookies to the attacker’s server.

[Diagram: reflected XSS attack]

  • Type 3: DOM-based XSS
    • DOM-based XSS is a variant of both persistent and reflected XSS. In a DOM-based XSS attack, the malicious string is not actually parsed by the victim’s browser until the website’s legitimate JavaScript is executed.
    • The attacker crafts a URL containing a malicious string and sends it to the victim.
    • The victim is tricked by the attacker into requesting the URL from the website.
    • The website receives the request, but does not include the malicious string in the response.
    • The victim’s browser executes the legitimate script inside the response, causing the malicious script to be inserted into the page.
    • The victim’s browser executes the malicious script inserted into the page, sending the victim’s cookies to the attacker’s server.

[Diagram: DOM-based XSS attack]

  • Prevention
    • Encoding, which escapes the user input so that the browser interprets it only as data, not as code. The most recognizable type of encoding in web development is HTML escaping, which converts characters like < and > into &lt; and &gt;, respectively.
    • Validation, which filters the user input so that the browser interprets it as code without malicious commands.

8. Insecure deserialization

Serialization may be used in applications for:

  • Remote- and inter-process communication (RPC/IPC)
  • Wire protocols, web services, message brokers
  • Caching/Persistence
  • Databases, cache servers, file systems
  • HTTP cookies, HTML form parameters, API authentication tokens

Applications and APIs will be vulnerable if they deserialize hostile or tampered objects supplied by an attacker. This can result in two primary types of attacks:

  • Object and data structure related attacks where the attacker modifies application logic or achieves arbitrary remote code execution if there are classes available to the application that can change behavior during or after deserialization.
  • Typical data tampering attacks such as access-control-related attacks where existing data structures are used but the content is changed.

Example

  • A PHP forum uses PHP object serialization to save a "super" cookie, containing the user's user ID, role, password hash, and other state:
    • a:4:{i:0;i:132;i:1;s:7:"Mallory";i:2;s:4:"user";i:3;s:32:"b6a8b3bea87fe0e05022f8f3c88bc960";}
    • An attacker changes the serialized object to give themselves admin privileges:
    • a:4:{i:0;i:1;i:1;s:5:"Alice";i:2;s:5:"admin";i:3;s:32:"b6a8b3bea90s0e0e05022f8f3c88bc895";}

Prevention

  • Have a safe architecture which doesn't accept serialized objects from untrusted sources.
  • If that is not possible, implement integrity checks (signatures) for all objects from untrusted sources.

9. Using Components With Known Vulnerabilities

Developers frequently don’t know which open source and third-party components are in their applications, making it difficult to update components when new vulnerabilities are discovered. Attackers can exploit an insecure component to take over the server or steal sensitive data.

  • Example
    • Libraries typically run with the same privileges as the application itself, so a flaw in any library can have a serious impact. For example, a Struts 2 vulnerability exposed remote code execution, enabling execution of arbitrary code on the app server.
  • Prevention
    • Software composition analysis conducted at the same time as static analysis can identify insecure versions of components.
    • Remove unused dependencies, unnecessary features, components, files, and documentation.

10. Insufficient Logging and Monitoring

The time to detect a breach is frequently measured in weeks or months. Insufficient logging and ineffective integration with security incident response systems allow attackers to pivot to other systems and maintain persistent threats.

  • Example
    • A major US retailer reportedly had an internal malware analysis sandbox analyzing attachments. The sandbox software had detected potentially unwanted software, but no one responded to this detection. The sandbox had been producing warnings for some time before the breach was detected due to fraudulent card transactions by an external bank.
  • Prevention
    • Ensure all logins, access control failures, and server-side input validation failures are logged with sufficient user context to identify suspicious or malicious accounts, and held for sufficient time to allow delayed forensic analysis.
    • Monitor and audit the logs often.
    • Think like an attacker and use pen testing to find out if you have sufficient monitoring; examine your logs after pen testing.

Other must to know attacks

  1. CSRF/ XSRF

Anti-CSRF Tokens

The most popular method to prevent Cross-site Request Forgery is to use a challenge token that is associated with a particular user and that is sent as a hidden value in every state-changing form in the web app. This token, called an anti-CSRF token or a synchronizer token, works as follows:

  • The web server generates a token and stores it
  • The token is statically set as a hidden field of the form
  • The form is submitted by the user
  • The token is included in the POST request data
  • The application compares the token generated and stored by the application with the token sent in the request
  • If these tokens match, the request is valid
  • If these tokens do not match, the request is invalid and is rejected

This CSRF protection method is called the synchronizer token pattern. It protects the form against Cross-site Request Forgery attacks because an attacker would also need to guess the token to successfully trick a victim into sending a valid request. The token should also be invalidated after some time and after the user logs out. Anti-CSRF tokens are often exposed via AJAX: sent as headers or request parameters with AJAX requests.

For an anti-CSRF mechanism to be effective, it needs to be cryptographically secure. The token cannot be easily guessed, so it cannot be generated based on a predictable pattern. We also recommend using the anti-CSRF options in popular frameworks such as AngularJS and refraining from creating your own mechanisms, if possible. This lets you avoid errors and makes the implementation quicker and easier.
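
If you do have to roll your own, a minimal framework-agnostic sketch (method names and session handling are illustrative, not from any specific library):

import java.security.MessageDigest;
import java.security.SecureRandom;
import java.util.Base64;

public class CsrfTokens {
    private static final SecureRandom RANDOM = new SecureRandom();

    // Generate an unguessable token to store in the session and embed
    // as a hidden form field.
    static String newToken() {
        byte[] bytes = new byte[32];
        RANDOM.nextBytes(bytes);
        return Base64.getUrlEncoder().withoutPadding().encodeToString(bytes);
    }

    // Constant-time comparison of the session's stored token against the
    // one submitted with the request.
    static boolean isValid(String sessionToken, String requestToken) {
        if (sessionToken == null || requestToken == null) return false;
        return MessageDigest.isEqual(sessionToken.getBytes(), requestToken.getBytes());
    }
}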

Same-Site Cookies

CSRF attacks are only possible because cookies are always sent with any requests that are sent to a particular origin related to that cookie (see the definition of the same-origin policy). You can set a flag for a cookie that turns it into a same-site cookie. A same-site cookie is a cookie that can only be sent if the request is being made from the origin related to the cookie (not cross-domain). The cookie and the request source are considered to have the same origin if the protocol, port (if applicable) and host (but not the IP address) are the same for both.

A current limitation of same-site cookies is that, unlike for example Chrome or Firefox, not all browsers support them, and older browsers do not work with web apps that use same-site cookies. At the moment, same-site cookies are better suited as an additional defense layer due to this limitation. Therefore, you should only use them along with other CSRF protection mechanisms.

  2. Session hijacking attack
  3. Content Spoofing
  4. Buffer overflow attack
  5. Cache Poisoning
  6. Repudiation Attack

 

Cross-Origin Resource Sharing (CORS)

CORS is not about cookies; it is about resource sharing: a page in one domain's (the 1st) context accessing a resource in another domain's (the 2nd) context (cross-domain). Whether the 2nd domain allows the 1st domain to access the resource is determined by the CORS policy configured at the 2nd domain.


[Diagram: process flow for CORS requests]

Cookie Vs Session Vs Token Vs Claims Authentications

  1. Cookie-Based Authentication
    • Web-client (eg: web-browser) stores cookie sent by the web-server after successful authentication.
    • Cookie contains info about the user, client, authN timestamp and other useful data with unique-id to determine the cookie.
    • Typically, cookie is encrypted by the web-server with domain attribute set (eg: google.com) and send it across to the web-client.
    • Whenever the web-client wants to access the domain's resource (eg: mail.google.com), it sends all cookies for that domain (eg: google.com) to the web-server, which validates/verifies them and grants/denies access based on the state and timestamp of the cookie.
  2. Session-Based Authentication
    • If, along with the web-client cookie, the web-server also stores the user's authN data in its back-end, it is called session-based authentication.
    • This is very useful in the event of a breach: if a web-client gained access to a system it shouldn't have access to, the web-client's session can be revoked by the admin from the back-end.
  3. Token-Based Authentication (client + user based)
    • Generally this is used in non web-client scenarios, where there is no way to store cookie in the client side.
    • Hence, the web-server sends a signed token (containing info about the user, client, authN timestamp and other useful data, with a unique id) to the client after successful authentication.
    • Whenever a client wants to access a resource, it needs to send this token, and the web-server validates/verifies the token before allowing access to the resource.
  4. Claims-Based Authentication (user attribute based)
    • This is the same as token-based authentication, plus authorization data in the token about the user's attributes/roles/groups.
    • These data pertain to authorization, i.e. what the user can do within the resource (eg: mail.read, mail.delete, admin).

WS-Trust|WS-Fed |SAML-P|OAuth|OIDC

Introduction

  • Goal of these web-security standards is to enable federation between partners.
  • All are used to authenticate user at IdP/ STS and get a security token (assertion) for user to access the SP (resource) across the security domains.
  • All these protocols allow building trust relationships (using metadata exchange) between IdP and SP for a security token (assertion) exchange.
  • All these protocols describe how to transport tokens and the mechanisms to issue, sign, encrypt, convert, validate and renew security tokens (using XML Security).
  • All these protocols use SAML1.x or SAML2.0 as the token to exchange identity information between IdP and SP.

WS-Trust

  • WS-Trust is a standard which provides extensions to the WS-Security specification, defining protocols for issuing, signing, encrypting, converting, validating and renewing security tokens.
  • The WS-Trust standard specifies how a web-client, IdP and SP exchange messages plus security tokens between them.
  • In this standard, both IdP and SP are called as Secure Token Service (STS).
  • On IdP side, the STS issues a SAML security token containing the user’s identity to web-client, which in turn forward to SP to access its resource.
  • On the SP side, the STS validates incoming SAML security tokens from web-client and can generate a new local token for web-client to consume and access its resource.
  • This protocol uses SOAP messages for communication.

WS-Federation

  • WS-Federation is an extension to the functionality of WS-Trust; it defines the transport mechanism of security tokens and is used mainly for web-browser authentication.
  • WS-Federation has two profiles defined:
    • Active Profile Authentication
      • Uses the WS-Trust protocol to authenticate the user against the STS/IdP and provide the SAML security token to the web-client, which in turn submits it to the STS/SP (which validates the token) in exchange for a local security token between the web-client and the STS/SP. Typically used for thick desktop clients.
      • For example: the MS-Outlook desktop client is a web-client, Google is an STS/IdP and Office365 is an STS/SP.
      • Username tokens (embedded in the SAML token; the whole token is SOAP-based) are widely used in active profile authentication: the user submits username/password to the web-client, which hands them over to the STS/IdP for authentication in exchange for a SAML token.
    • Passive Profile Authentication
      • Uses WS-Federation protocol to authenticate user against STS/IdP and provide the SAML security token to the STS/SP (via web-browser), which validates the token in exchange for a local security token between web-browser and STS/SP.
      • Typically used for web-client such as web-browser authentication.
      • It relies on browser redirects, HTTP GET, and POST to request and pass around tokens.
      • For example: Web-browser is a web-client accessing Office365-Sharepoint App, Google as an STS/IdP and Office365 as an STS/SP.
      • Sometimes this profile is also called Claims-Based Authentication, because the SP asserts the claims/attributes present in the security token, which is sent by a trusted issuer or IdP.
      • The communication between STS/IdP and STS/SP is based on HTTP query-param key-values:
        • wa: wsignin1.0, wsignout1.0, wsignoutcleanup1.0
        • wtrealm: partnerProviderID of SP or IdP
        • wctx: request context
        • wresult: SAML security token.
  • The communication between STS/IdP and STS/SP is based on Request Security Token (RST) and Request Security Token Response (RSTR) messages.
    • In case of Active Profile, the RST and RSTR are SOAP based messages.
    • In case of Passive Profile, the RST is query-param (wa=wsignin1.0) and RSTR is (wresult=SAML security token).

SAML-P

  • More about SAML-P, please refer this link.
  • We can see that WS-Fed (Passive Profile) and SAML-P are used in web-browser authentication.
  • SAML-P uses specific SAML AuthnRequest/Response protocol message flows, along with other protocols such as the Assertion Query Protocol, Artifact Protocol, Name Identifier Management Protocol, Single Logout Protocol and Name Identifier Mapping Protocol, plus various Bindings and Profiles. There is also a lot of flexibility in handling the encryption/signing of messages in SAML-P.
  • Whereas, WS-Fed uses simple HTTP query-param request/ response style of communication between IdP and SP.

Conclusion

All these protocols look similar, but each comes with specific use cases:

  • WS-Trust: Covers use-case such as SOAP based SAML security token (RSTR) exchange for a given SOAP request (RST).
  • WS-Federation: Covers use-case such as web-browser authentication with HTTP query-param request (RST) and response in query-param (RSTR).
  • SAML-P: Covers use-cases such as web-browser SSO with HTTP-POST, Redirect and Artifact SAML AuthnRequest/Response message exchanges, with added flexibility and security at all levels of the model. Thereby, SAML-P covers more use-cases than what WS-Fed offers. Please refer to this link for a detailed note on SAML-P.

Additional Protocols

OAuth2.0

  • Authorization protocol.

OpenID Connect

  • Authentication protocol.

Original OpenID 2.0 vs SAML

They are two different protocols of authentication and they differ at the technical level.

From a distance, differences start when users initiate the authentication. With OpenID, a user login is usually an HTTP address of the resource which is responsible for the authentication. On the other hand, SAML is based on an explicit trust between your site and the identity provider so it’s rather uncommon to accept credentials from an unknown site.

OpenID identities are easy to get around the net. As a developer you could then just accept users coming from very different OpenID providers. On the other hand, a SAML provider usually has to be coded in advance and you federate your application with only selected identity providers. It is possible to narrow the list of accepted OpenID identity providers but I think this would be against the general OpenID concept.

With OpenID you accept identities coming from arbitrary servers. Someone claims to be http://someopenid.provider.com/john.smith. How are you going to match this with a user in your database? Somehow, for example by storing this information with a new account and recognizing it when the user visits your site again. Note that any other information about the user (including his name or email) cannot be trusted!

On the other hand, if there’s an explicit trust between your application and the SAML Id Provider, you can get full information about the user including the name and email and this information can be trusted, just because of the trust relation. It means that you tend to believe that the Id Provider somehow validated all the information and you can trust it at the application level. If users come with SAML tokens issued by an unknown provider, your application just refuses the authentication.

OpenID Connect vs SAML

This answer dates from 2011, and at that time OpenID stood for OpenID 2.0. Later, around 2012, OAuth2.0 was published, and in 2014, OpenID Connect followed (a more detailed timeline here).

To anyone reading this nowadays – OpenID Connect is not the same OpenID the original answer refers to, rather it’s a set of extensions to OAuth2.0.

While this answer can shed some light from the conceptual viewpoint, a very concise version for someone coming from an OAuth2.0 background is that OpenID Connect is in fact OAuth2.0, but it adds a standard way of querying the user info after the access token is available.

Referring to the original question – what is the main difference between OpenID Connect (OAuth2.0) and SAML is how the trust relation is built between the application and the identity provider:

  • SAML builds the trust relation on a digital signature, SAML tokens issued by the identity provider are signed XMLs, the application validates the signature itself and the certificate it presents. The user information is included in a SAML token, among other information.
  • OAuth2 builds the trust relation on a direct HTTPS call from the application to the identity provider. The request contains the access token (obtained by the application during the protocol flow) and the response contains the information about the user.
  • OpenID Connect further expands this to make it possible to obtain the identity without the extra step involving the call from the application to the identity provider. The idea is based on the fact that OpenID Connect providers in fact issue two tokens: the access_token, the very same one OAuth2.0 issues, and the new id_token, which is a JWT signed by the identity provider. The application can use the id_token to establish a local session, based on the claims included in the JWT, but the id_token cannot be used to further query other services; such calls to third-party services should still use the access token.
  • You can think of it this way: a SAML2 token works for both authn/ authz purposes, an OAuth2 access token works for authz only, and an OpenID Connect ID token works for authn only.

Java Concurrency Concepts

Java Memory Model

  • Describes how threads in Java programming language interact through memory.
  • The JVM divides RAM between thread stacks and the heap.
    • A thread stack is a per-thread container that holds information about a single thread's execution of your Java application.
    • The heap contains all objects created in your Java application.

Java memory

  • Stack holds:
    • Local variables, each of which may be:
      • a primitive type, in which case it is kept entirely on the thread stack, (or)
      • a reference to an object, in which case the reference (the local variable) is stored on the thread stack, but the object itself is stored on the heap.
  • Heap holds:
    • Static class variables, which are stored on the heap along with the class definition.
    • Objects on the heap, which can be accessed by all threads that have a reference to the object. (See the sketch below.)
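
To make the stack/ heap split concrete, here is a minimal, runnable Java sketch (the class and variable names are illustrative only):

public class MemoryDemo {
    // Static class variable: stored on the heap along with the class definition,
    // visible to every thread.
    static int sharedCounter = 0;

    public static void main(String[] args) throws InterruptedException {
        // Local primitive: lives entirely on the main thread's stack.
        int localValue = 42;

        // Local reference on the stack; the StringBuilder object itself lives on the heap.
        StringBuilder message = new StringBuilder("hello");

        // Another thread can reach the same heap object, because it holds a reference to it.
        Thread t = new Thread(() -> message.append(" from another thread"));
        t.start();
        t.join();
        System.out.println(message + ", localValue=" + localValue);
    }
}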

Volatile Keyword (to avoid invalid object state change among threads)

  • On a multi-core machine, if two or more threads share an object without the proper use of volatile declarations (or synchronization), updates to the shared object made by one thread may not be visible to other threads.
  • Imagine that the shared object is initially stored in main memory. A thread running on Core-1 then reads the shared object into its CPU cache and makes a change to it there. As long as the CPU cache has not been flushed back to main memory, the changed version of the shared object is not visible to threads running on other CPUs. This way each thread may end up with its own copy of the shared object, each copy sitting in a different CPU cache.
  • To solve this problem you can use Java's volatile keyword. The volatile keyword makes sure that a given variable is read directly from main memory, and always written back to main memory when updated, as sketched below.
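
A minimal sketch of the visibility problem and its volatile fix; the flag name and the 1-second sleep are illustrative assumptions (without the volatile keyword, the loop below may never terminate on a multi-core machine):

public class VolatileFlag {
    // volatile: every read comes from main memory, every write goes back to it.
    private static volatile boolean running = true;

    public static void main(String[] args) throws InterruptedException {
        Thread worker = new Thread(() -> {
            while (running) {
                // busy work; re-reads 'running' from main memory on each iteration
            }
            System.out.println("Worker observed running=false and stopped.");
        });
        worker.start();

        Thread.sleep(1000);
        running = false; // this write is immediately visible to the worker thread
        worker.join();
    }
}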

Synchronized (to avoid race conditions among threads)

  • On a multi-core machine, if two or more threads share an object, and more than one thread updates variables in that shared object, race conditions may occur.
  • Imagine that thread A running on Core-1 reads the variable count of a shared object into its CPU cache. Imagine too that thread B running on Core-2 does the same, but into a different CPU cache. Now thread A adds one to count, and thread B does the same. Now count has been incremented twice, once in each CPU cache.
  • If these increments had been carried out sequentially, the variable count would have been incremented twice and the original value + 2 written back to main memory. Here, however, the two increments were carried out concurrently without proper synchronization, so one update is lost.
  • To solve this problem you can use a Java synchronized block. A synchronized block guarantees that only one thread can enter a given critical section of the code at any given time (see the sketch below).
  • Synchronized blocks also guarantee that all variables accessed inside the synchronized block will be read in from main memory, and when the thread exits the synchronized block, all updated variables will be flushed back to main memory again, regardless of whether the variable is declared volatile or not.
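
A minimal sketch of a synchronized counter: with the synchronized keyword the result is always 20000; remove it, and the two threads' concurrent increments can be lost, exactly as described above:

public class SynchronizedCounter {
    private int count = 0;

    // Only one thread at a time can enter this critical section; on entry the thread
    // re-reads shared state from main memory, and on exit it flushes updates back.
    public synchronized void increment() {
        count++;
    }

    public synchronized int get() {
        return count;
    }

    public static void main(String[] args) throws InterruptedException {
        SynchronizedCounter counter = new SynchronizedCounter();
        Runnable task = () -> {
            for (int i = 0; i < 10_000; i++) {
                counter.increment();
            }
        };
        Thread a = new Thread(task);
        Thread b = new Thread(task);
        a.start();
        b.start();
        a.join();
        b.join();
        System.out.println(counter.get()); // always 20000 with synchronized
    }
}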

Credits: http://tutorials.jenkov.com/java-concurrency/java-memory-model.html

Volatile vs Static

  • Static variables are at the class/ object level, and their values may still be cached by individual threads.
    • Static variables are stored in the heap along with the class definition.
    • If one thread modifies its cached copy, the change may not be reflected in other threads.
  • Volatile variables are at the global-thread level: there is one main copy shared among all threads.
    • The volatile keyword makes sure that a given variable is read directly from main memory, and always written back to main memory when updated.

https://i.stack.imgur.com/zyhpA.png

ThreadLocal Variable

  • If you have a datum that can vary per-thread, there are two choices:
    • Pass that datum around to every method call that needs it.
      • This will clutter up your method signatures with an additional parameter.
    • Associate the datum with the thread.
      • This is where ThreadLocal is used.
  • Many frameworks use ThreadLocals to maintain some context related to the current thread. For example:
    • When the current ExecutionContextID (eCID) of a web-request is stored in a ThreadLocal, you don't need to pass it as a parameter through every method call, in case someone down the stack needs access to it.
  • In a non-threaded world, you could solve the problem with a global variable.
  • In a threaded world, the equivalent of a global variable is a thread-local variable.
  • ThreadLocal variables should always be cleared when they are no longer needed (e.g., before a pooled thread is reused, or when the thread dies) to avoid unwanted side-effects and memory leaks in the application. See the sketch below.
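
A minimal sketch of the eCID example above; handleRequest and the ecid values are hypothetical names used only for illustration:

public class ThreadLocalDemo {
    // Each thread sees its own copy of the "execution context ID" (eCID).
    private static final ThreadLocal<String> CONTEXT_ID = new ThreadLocal<>();

    static void handleRequest(String ecid) {
        CONTEXT_ID.set(ecid);
        try {
            deepInTheCallStack(); // no need to pass ecid down as a parameter
        } finally {
            CONTEXT_ID.remove(); // always clear, to avoid leaks on pooled threads
        }
    }

    static void deepInTheCallStack() {
        System.out.println(Thread.currentThread().getName()
                + " sees eCID=" + CONTEXT_ID.get());
    }

    public static void main(String[] args) {
        new Thread(() -> handleRequest("ecid-1"), "worker-1").start();
        new Thread(() -> handleRequest("ecid-2"), "worker-2").start();
    }
}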

OAuth 2.0 & OpenID Connect

OpenID Connect (OIDC) for Authentication

  1. Logging the user in.
  2. Making your accounts available in other systems.

OAuth for Authorization

  1. Granting the access to your APIs.
  2. Getting the access to user data in other systems.

Supported Use-cases

  1. OpenID Connect
    • Web App login/ Web-SSO
    • Mobile App Login
  2. OAuth
    • Delegated Authorization

OAuth Terminology

  1. Authorization Server
    • Supports Authorize and Token Endpoints
    • During registration of clients:
      • Requires Redirect URI/ Callback URI.
      • Defines what Resources the client can request.
      • Allows to define Scope of Resources (eg: contacts.R, email.W, openid)
  2. Resource Server
    • Hosts the actual resources that the client requests on-behalf of the user.
  3. Authorization Grants
    • Authorization Code
      • Hit /authorize endpoint to get authorization_code and then /token endpoint to get AT.
        • /oauth/authorize?
          client_id=<client_id>&
          response_type=code&
          redirect_uri=<redirect_uri>&
          scope=<scope>
          -> <authorization_code>
        • /oauth/token?
          grant_type=authorization_code&
          code=<authorization_code>
          -> <AT>
    • Client Credentials 
      • Directly hit the /token endpoint to get AT.
        • /oauth/token?
          client_id=<Base64(client_id:secret)>&
          grant_type=client_credentials&
          scope=<scope>
          -> <AT>
    • Resource Owner Password Credentials
      • Directly hit the /token endpoint to get AT.
        • /oauth/token?
          grant_type=password&
          scope=<scope>&
          username=<username>&
          password=<password>
          -> <AT>
    • Implicit
      • Response Type: token
    • Assertion
      • Response Type:
  4. User Consent
    • The scopes the user has chosen and consented to, out of those requested by the client from the Authorization Server on-behalf of the user.
  5. Access Token
    • Token generated based only on the scopes the user consented to.
    • Use the token endpoint to send the Authorization Code in exchange for an AT.
    • Response Type: token
  6. Refresh Token
    • Scope: offline_access
  7. ID Token
    • Scope: openid
  8. Bearer Token
    • The Resource Server accepts the token from whoever is bearing and submitting it.
  9. JWT – JSON Web Tokens
    • OAuth tokens are commonly in JWT format, which has 3 components:
      1. Header
        • Algorithm (alg)
          • REQUIRED. Algorithm header parameter identifies the cryptographic algorithm used to secure the JWT. Example: RS256/ RS512/ HS256/ HS512.
        • x.509 certificate thumbprint (x5t/ x5t#S256)
          • A digest of X.509 certificate used to sign the JWT.
          • OPTIONAL. x.509 certificate thumbprint header parameter provides a base64url encoded SHA-256 thumbprint (a.k.a. digest) of the DER encoding of an X.509 certificate that can be used to match a certificate.
        • Key ID (kid)
          • OPTIONAL. Key ID header parameter is a hint indicating which specific key owned by the signer should be used to validate the signature. This is particularly useful when you have multiple keys to sign the tokens and you need to look up the right one to verify the signature. Example: SIGNING_KEY/ ENCRYPTION_KEY.
      2. Payload
        • Audience (aud)
        • Subject (sub)
        • Scope (scope)
        • Issuer (iss)
        • IssuedAt (iat)
        • Expiry (exp)
        • Auth_Time (auth_time)
        • Token Type (tok_type)
        • Client ID (client_id)
        • User ID (user_id)
        • Roles (roles)
      3. Signature
        • RS256 signature = RSASHA256(base64UrlEncode(header) + "." + base64UrlEncode(payload), privateKey) – see the decoding sketch below.
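
To see the 3-component structure in practice, here is a minimal JDK-only sketch that builds and then decodes a toy (unsigned) token. The claim values are illustrative assumptions; real tokens come from the authorization server and must be signature-verified with the issuer's key:

import java.nio.charset.StandardCharsets;
import java.util.Base64;

public class JwtStructureDemo {
    public static void main(String[] args) {
        Base64.Encoder enc = Base64.getUrlEncoder().withoutPadding();

        // header.payload.signature – each part is base64url-encoded
        String header  = enc.encodeToString(
                "{\"alg\":\"RS256\",\"kid\":\"SIGNING_KEY\"}".getBytes(StandardCharsets.UTF_8));
        String payload = enc.encodeToString(
                "{\"sub\":\"jdoe\",\"iss\":\"https://issuer.example.com\"}".getBytes(StandardCharsets.UTF_8));
        String jwt = header + "." + payload + ".signature-goes-here";

        String[] parts = jwt.split("\\.");
        Base64.Decoder dec = Base64.getUrlDecoder();
        System.out.println("Header : " + new String(dec.decode(parts[0]), StandardCharsets.UTF_8));
        System.out.println("Payload: " + new String(dec.decode(parts[1]), StandardCharsets.UTF_8));
        // parts[2] is the signature: for RS256 it is RSA-SHA256 over "header.payload",
        // produced with the issuer's private key and verified with its public key.
    }
}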

Type of Communication with AuthZ Server

  1. Front Channel Communication
    • A JS client directly asks for the AT, instead of exchanging an AuthZ code for the AT. (goes through the browser, so less trusted)
  2. Back Channel Communication
    • The client provides its client_id/ secret, which are used to exchange the AuthZ code for an AT. (trust established between client and AuthZ server)

OAuth flows (grant type flows)

  1. Authorization Code (front and back channels)
    • Web App with server back end.
    • Native mobile app – AuthZ Code flow with PKCE (Proof Key for Code Exchange).
  2. Client Credentials (back channel only)
    • Micro-services and APIs.
  3. Implicit (front channel only)
    • JS App (SPA) with API back end
  4. Resource Owner Password Credentials (back channel only)
    • Desktop clients such as GitHub client

oauth-flow1.png

Problems using OAuth for authentication

  1. No standard way to get the user’s info
  2. No common scope to get user’s info
  3. Implementers end up building custom ways to get the user's info, which are not consistent.

Open ID Connect (OIDC)

  • Response Type: code + ID token.
  • UserInfo endpoint for getting more user info.
  • Standard set of scopes.
  • Standardized implementation.

openid flows1.png

OAuth Proof Key for Code Exchange (PKCE)

  • The Proof Key for Code Exchange (PKCE) extension describes a technique for public clients to mitigate the threat of having the authorization code intercepted.
  • An attacker can intercept the authz_code returned from the authorization endpoint when the communication path is not protected by Transport Layer Security (TLS).

Image result for OAuth Proof Key for Code Exchange (PKCE)

  • The technique involves the client first creating a secret, and then using that secret again when exchanging the authorization code for an access token.
  • This way if the code is intercepted, it will not be useful since the token request relies on the initial secret.
  • code_verifier – A cryptographically random string (43-128 characters long) that is used to correlate the Authorization Request to the Access Token Request.
  • code_challenge – A hash derived from the code verifier that is sent in the Authorization Request, to be verified against later. [i.e., t(code_verifier)]
  • code_challenge_method – The method that was used to derive the code challenge; preferably 'S256', or plain (see the sketch below):
    • S256: code_challenge = BASE64URL-ENCODE(S256(ASCII(code_verifier)))
    • plain: code_challenge = code_verifier
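
A minimal JDK-only sketch of generating the code_verifier and its S256 code_challenge exactly as defined above:

import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.SecureRandom;
import java.util.Base64;

public class PkceDemo {
    public static void main(String[] args) throws Exception {
        Base64.Encoder enc = Base64.getUrlEncoder().withoutPadding();

        // code_verifier: cryptographically random; 32 bytes -> 43 base64url characters.
        byte[] random = new byte[32];
        new SecureRandom().nextBytes(random);
        String codeVerifier = enc.encodeToString(random);

        // code_challenge (S256) = BASE64URL-ENCODE(SHA256(ASCII(code_verifier)))
        byte[] digest = MessageDigest.getInstance("SHA-256")
                .digest(codeVerifier.getBytes(StandardCharsets.US_ASCII));
        String codeChallenge = enc.encodeToString(digest);

        // The client sends code_challenge (+ code_challenge_method=S256) in the
        // authorization request, and code_verifier later in the token request, so an
        // intercepted authorization code alone is useless to an attacker.
        System.out.println("code_verifier : " + codeVerifier);
        System.out.println("code_challenge: " + codeChallenge);
    }
}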

Image result for pkce oauth

Credits: Okta-Youtube link.

Introduction to Deadlocks

What is Deadlock?

  • A system is said to be in a deadlock state when its processes are waiting for a particular "event" to occur that never occurs.
  • The "event" is nothing other than a process in the system releasing a resource that it otherwise holds forever.
  • Resources are allocated based on the following sequence, hence called the Resource Allocation Sequence:
    • Request: A process requests a resource.
    • Use: A process is using the resource.
    • Release: A process releases the resource.

Deadlock Conditions

  • Mutual Exclusion (Mutex)

    • A resource can be allocated to only a single process at a time (i.e., a non-shareable resource), which increases the probability of deadlock.
    • For example: Process P1 exclusively holds the resource R1 for a long time (not forever) without allowing another process P2 to use it.
    • Real world example: On a single-lane road (R1), one car (P1) occupies the whole road, not allowing the opposite car (P2) to pass through.
  • Hold and Wait

    • If a process holds some resources and waits on another resource, the probability of this process going into a deadlock is increased.
    • For example: P1 holds R1 and R2 together and waits for R3, which is held by another process P3 for a long time (not forever); P1 is then said to be headed into deadlock.
    • Real world example: A restaurant server (P1) holds the prepared food for table1 (R1) and table2 (R2) to be served, while waiting until table3's food (R3) is prepared by the chef (P3).
  • No-Preemption

    • A process won't release its resources until it completes its whole work on all of them, which can take a long time. This increases the probability that other processes (those waiting on some of the resources held by the first process) go into deadlock.
    • For example: Process P1 occupies resources R1, R2, R3 and completes its work on R1, but its work on R2 and R3 is still pending. P1 won't release R1 until it completes its work on R2 and R3.
    • Real world example: A kitchen chef (P1) doesn't want to release the prepared food for table1 (R1) until he finishes preparing the food for table2 (R2) and table3 (R3).
  • Circular-Wait

    • When processes are waiting for each other to complete in a circular fashion, the processes are said to be in circular wait. (Here there is no mere probability; deadlock is actually occurring for these processes.)
    • For example: When there are 3 processes and each waits for the other in a circle to complete (P1 -> P2 -> P3 -> P1), they are in a circular-wait deadlock state.
    • Real world example: Processes arranged like a circular linked-list, each competing for the resource held by the next, so nobody ever gets the resource it needs. All four conditions come together in the sketch below.
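
A minimal Java sketch that reliably produces such a deadlock; R1/ R2 and the 100 ms sleeps are illustrative:

public class DeadlockDemo {
    private static final Object R1 = new Object();
    private static final Object R2 = new Object();

    public static void main(String[] args) {
        // P1 holds R1 and waits for R2; P2 holds R2 and waits for R1:
        // mutual exclusion + hold-and-wait + no-preemption + circular wait.
        Thread p1 = new Thread(() -> {
            synchronized (R1) {
                sleep(100); // give P2 time to lock R2
                synchronized (R2) {
                    System.out.println("P1 acquired both"); // never reached
                }
            }
        });
        Thread p2 = new Thread(() -> {
            synchronized (R2) {
                sleep(100); // give P1 time to lock R1
                synchronized (R1) {
                    System.out.println("P2 acquired both"); // never reached
                }
            }
        });
        p1.start();
        p2.start();
    }

    private static void sleep(long millis) {
        try { Thread.sleep(millis); } catch (InterruptedException ignored) { }
    }
}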

Conclusion

  • Whenever a Circular-Wait condition occurs, deadlock is bound to take place, and it implies the other 3 conditions (i.e., Mutex, Hold & Wait and No-Preemption) are present in the system.
  • If conditions such as Mutex (or) Hold & Wait (or) No-Preemption take place, one cannot say a deadlock is bound to occur, because they are only probable conditions of deadlock; hence they are called Indirect causes of deadlock.
  • On the other hand, if the Circular-Wait condition occurs, there is a deadlock in the system for sure; hence it is called a Direct cause of deadlock.

Other useful definitions

Starvation

  • Occurs when a low priority process does not get access to the resources it needs, because a high priority process is accessing the resources.
  • The entire system of processes hasn't come to a halt in this case; only a few processes are starved.
  • If all processes in the system are starving, the system is said to be in deadlock.
  • Deadlock is an extreme case of starvation, with the criterion of extremeness being the total count of processes unable to access the resources in the system.

Race Condition

  • A race occurs when two or more threads try to access the same shared data and at least one of the accesses is a write operation.
  • A race condition occurs when two or more threads can access shared data and they try to change it at the same time.
  • Problems often occur when one thread does a “check-then-act” (e.g. “check” if the value is X, then “act” to do something that depends on the value being X) and another thread does something to the value of X in between the “check” and the “act”.
  • Real world example: You are planning to go to a movie at 5 pm. You inquire about the availability of the tickets at 4 pm. The representative says that they are available. You relax and reach the ticket window 5 minutes before the show. I'm sure you can guess what happens: it's a full house. The problem here was in the duration between the check and the action. You inquired at 4 and acted at 5. In the meantime, someone else grabbed the tickets. That's a race condition – specifically a "check-then-act" scenario (see the sketch below).
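
A minimal sketch of the ticket-window scenario as a check-then-act bug, together with one possible fix; the seat/ bookUnsafe/ bookSafe names are hypothetical:

import java.util.HashMap;
import java.util.Map;

public class CheckThenActDemo {
    private static final Map<String, String> tickets = new HashMap<>();

    // Broken: between the containsKey "check" and the put "act", another thread
    // may grab the ticket, so two callers can both believe they got the same seat.
    static boolean bookUnsafe(String seat, String who) {
        if (!tickets.containsKey(seat)) { // check
            tickets.put(seat, who);       // act
            return true;
        }
        return false;
    }

    // Fixed: the check and the act form one atomic step inside a critical section.
    static synchronized boolean bookSafe(String seat, String who) {
        return tickets.putIfAbsent(seat, who) == null;
    }

    public static void main(String[] args) {
        new Thread(() -> System.out.println("A booked: " + bookSafe("5pm-show", "A"))).start();
        new Thread(() -> System.out.println("B booked: " + bookSafe("5pm-show", "B"))).start();
    }
}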

Critical Section

  • Parts of the program where the shared resource is accessed are protected. This protected section is the critical section or critical region.
  • It cannot be executed by more than one process at a time.

SAML: Web-Browser SSO Profile

Refer to SAML: Introduction before continuing with this blog.

  • Web-Browser SSO Profile is a use-case where a user achieves single sign-on in a web-browser: after successful authentication at the IdP, he/ she can access resources at different SPs without additional authentication.
  • This is an important use case of SAML and widely used in the enterprise deployments.
  • Web Browser SSO profile supports two SSO methods:
    • SP Initiated SSO
      • SSO flow initiated at the SP side (Pull method), where the SP pulls the assertion from the IdP.
      • Flow starts at the SP, goes to the IdP, and finally reaches the SP again (i.e., SP -> IdP -> SP).
    • IdP Initiated SSO
      • SSO flow initiated at the IdP side (Push method).
      • Flow starts at the IdP, which then pushes the assertion to the SP (i.e., IdP -> SP).
  • Both SSO methods use a Request/ Response style and have specific protocols to communicate between IdP and SP.
  • Due to the two SSO methods and the various SAML bindings, there are several options available for the Web-Browser SSO Profile (Request/ Response style). They are:
    1. SP Initiated SSO:
      1. POST->POST binding
      2. Redirect->POST binding
      3. Artifact->POST binding
      4. POST->Artifact binding
      5. Redirect->Artifact binding
      6. Artifact->Artifact binding
    2. IdP Initiated SSO:
      1. POST binding
      2. Artifact binding
    3. ECP Profile
      1. PAOS binding

1. SP Initiated SSO

The following are the SP-initiated SSO flows which are based on binding (transport of SAML Request/ Response).

1.1. POST->POST binding

In this use case the user attempts to access a resource on http://www.abc.com. However, they do not have a current logon session on this site and their identity is managed by http://www.xyz.com. A SAML <AuthnRequest> is sent to their Identity Provider so that the Identity Provider can provide back a SAML assertion concerning the user. HTTP POST messages are used to deliver the SAML <AuthnRequest> to the Identity Provider as well as to receive back the SAML response.

sp-init-1

  1. The user attempts to access a resource on http://www.abc.com. The user does not have any current logon session (i.e. security context) on this site, and is unknown to it.
  2. The SP sends a HTML form back to the browser. The HTML FORM contains a SAML <AuthnRequest> defining the user for which authentication and authorization information is required. Typically the HTML FORM will contain an input or submit action that will result in a HTTP POST.
  3. The browser, either due to a user action or via an “auto-submit”, issues a HTTP POST containing the SAML <AuthnRequest> to the IdP’s Single Sign-On service.
  4. If the user does not have any current security context on the IdP, or the policy defines that authentication is required, the user will be challenged to provide valid credentials.
  5. The user provides valid credentials and a security context is created for the user.
  6. The Single Sign-On Service sends a HTML form back to the browser. The HTML FORM contains a SAML <Response>, within which is a SAML assertion. The SAML specifications mandate that the response must be digitally signed. Typically the HTML FORM will contain an input or submit action that will result in a HTTP POST.
  7. The browser, either due to a user action or via an “auto-submit”, issues a HTTP POST containing the SAML response to be sent to the Service Provider’s Assertion Consumer service.
  8. The Service Provider’s Assertion Consumer validates the digital signature on the SAML Response. If this validates correctly, it sends a HTTP redirect to the browser causing it to access the TARGET resource, with a cookie that identifies the local session. An access check is then made to establish whether the user has the correct authorization to access the http://www.abc.com web site and the TARGET resource. The TARGET resource is then returned to the browser.

1.2. Redirect->POST binding

A SAML <AuthnRequest> is sent to their IdP so that the IdP can provide back a SAML assertion concerning the user. A HTTP redirect message is used to deliver the SAML <AuthnRequest> to the Identity Provider and a HTTP POST is used to return the SAML response.

sp-init-2

  1. The user attempts to access a resource on http://www.abc.com. The user does not have any current logon session (i.e. security context) on this site, and is unknown to it.
  2. The SP sends a redirect message to the browser with HTTP status code of either 302 or 303. The Location HTTP header contains the destination URI of the Sign-On Service of the Identity Provider together with the <AuthnRequest> as a query variable named SAMLRequest. The query string is encoded using the DEFLATE encoding. The browser processes the redirect message and issues a GET to the Sign-on Service with the SAMLRequest query parameter.
  3. The Sign-on Service determines whether the user has any current security context on the Identity Provider, or whether the policy defines that authentication is required. If the user needs to be authenticated, they will be challenged to provide valid credentials.
  4. The user provides valid credentials and a security context is created for the user.
  5. The Single Sign-On Service sends a HTML form back to the browser. The HTML FORM contains a SAML response, within which is a SAML assertion. The SAML specifications mandate that the response must be digitally signed. Typically the HTML FORM will contain an input or submit action that will result in a HTTP POST.
  6. The browser, either due to a user action or via an “auto-submit”, issues a HTTP POST containing the SAML response to be sent to the Service Provider’s Assertion Consumer service.
  7. The Service Provider’s Assertion Consumer validates the digital signature on the SAML Response. If this validates correctly, it sends a HTTP redirect to the browser causing it to access the TARGET resource, with a cookie that identifies the local session. An access check is then made to establish whether the user has the correct authorization to access the http://www.abc.com web site and the TARGET resource. The TARGET resource is then returned to the browser.

1.3. Artifact->POST binding

A SAML artifact is sent to the Identity Provider (using a HTTP redirect), which it uses to obtain a SAML <AuthnRequest> from the Service Provider's SAML Responder. When the Identity Provider obtains the SAML <AuthnRequest>, it provides back to the Service Provider the SAML response using the POST binding mechanism.

sp-init-3.png

  1. The user attempts to access a resource on http://www.abc.com. The user does not have any current logon session (i.e. security context) on this site, and is unknown to it.
  2. The SP generates the <AuthnRequest> while also creating an artifact. The artifact contains the source ID of the http://www.abc.com SAML responder together with a reference to the assertion (the AssertionHandle). The HTTP Artifact binding allows the choice of either HTTP redirection or a HTML form as the delivery mechanism to the Service Provider. The figure shows the use of the HTML form mechanism. The Inter-site Transfer Service sends a HTML form back to the browser. The HTML FORM contains the SAML artifact, the control name being SAMLart. Typically the HTML FORM will contain an input or submit action that will result in a HTTP POST.
  3. On receiving the HTTP message, the Single Sign-On Service extracts the source-ID from the SAML artifact. A mapping between source IDs and remote Responders will already have been established administratively. The Single Sign-On Service will therefore know that it has to contact the http://www.abc.com SAML responder at the prescribed URL. It sends a SAML <ArtifactResolve> message to the Service Provider's SAML responder, containing the artifact supplied by the Inter-site Transfer Service.
  4. The SAML responder supplies back a SAML <ArtifactResponse> message containing the <AuthnRequest> previously generated.
  5. The Sign-on Service determines whether the user to which the <AuthnRequest> pertains has any current security context on the Identity Provider, or whether the policy defines that authentication is required. If the user needs to be authenticated, they will be challenged to provide valid credentials.
  6. The user provides valid credentials and a security context is created for the user.
  7. The Single Sign-On Service sends a HTML form back to the browser. The HTML FORM contains a SAML response, within which is a SAML assertion. The SAML specifications mandate that the response must be digitally signed. Typically the HTML FORM will contain an input or submit action that will result in a HTTP POST.
  8. The browser, either due to a user action or via an “auto-submit”, issues a HTTP POST containing the SAML response to be sent to the Service Provider’s Assertion Consumer service.
  9. The Service Provider’s Assertion Consumer validates the digital signature on the SAML Response. If this validates correctly, it sends a HTTP redirect to the browser causing it to access the TARGET resource, with a cookie that identifies the local session. An access check is then made to establish whether the user has the correct authorization to access the http://www.abc.com web site and the TARGET resource. The TARGET resource is then returned to the browser.

1.4. POST->Artifact binding

A SAML <AuthnRequest> is sent to their Identity Provider so that the Identity Provider can provide back a SAML assertion concerning the user. A HTTP POST message is used to deliver the SAML <AuthnRequest> to the Identity Provider. The response is in the form of a SAML Artifact. In this example the SAML Artifact is provided back within a HTTP POST message. The Service Provider uses the SAML artifact to obtain the SAML response (containing the SAML assertion) from the Identity Provider's SAML Responder.

sp-init-4

  1. The user attempts to access a resource on http://www.abc.com. The user does not have any current logon session (i.e. security context) on this site, and is unknown to it.
  2. The SP sends a HTML form back to the browser. The HTML FORM contains a SAML <AuthnRequest> defining the user for which authentication and authorization information is required. Typically the HTML FORM will contain an input or submit action that will result in a HTTP POST.
  3. The browser, either due to a user action or via an “auto-submit”, issues a HTTP POST containing the SAML <AuthnRequest> to the Identity Provider’s Single Sign-On service.
  4. If the user does not have any current security context on the Identity Provider, or the policy defines that authentication is required, the user will be challenged to provide valid credentials.
  5. The user provides valid credentials and a security context is created for the user.
  6. The Single Sign-On Service generates an assertion for the user while also creating an artifact. The artifact contains the source ID of the http://www.xyz.com SAML responder together with a reference to the assertion (the AssertionHandle). The HTTP Artifact binding allows the choice of either HTTP redirection or a HTML form as the delivery mechanism to the Service Provider. The figure shows the use of the HTML form mechanism. The Single Sign-On Service sends a HTML form back to the browser. The HTML FORM contains the SAML artifact, the control name being SAMLart. Typically the HTML FORM will contain an input or submit action that will result in a HTTP POST.
  7. On receiving the HTTP message, the Assertion Consumer Service extracts the source-ID from the SAML artifact. A mapping between source IDs and remote Responders will already have been established administratively. The Assertion Consumer will therefore know that it has to contact the http://www.xyz.com SAML responder at the prescribed URL.
  8. The http://www.abc.com Assertion Consumer will send a SAML <ArtifactResolve> message to the Identity Provider’s SAML responder containing the artifact supplied by the Identity Provider.
  9. The SAML responder supplies back a SAML <ArtifactResponse> message containing the assertion previously generated. In most implementations, if a valid assertion is received back, then a session on http://www.abc.com is established for the user (the relying party) at this point.
  10. Typically the Assertion Consumer then sends a redirection message containing a cookie back to the browser. The cookie identifies the session. The browser then processes the redirect message and issues a HTTP GET to the TARGET resource on http://www.abc.com. The GET message contains the cookie supplied back by the Assertion Consumer. An access check is then made to establish whether the user has the correct authorization to access the http://www.abc.com web site and the index.asp resource.

1.5. Redirect->Artifact binding

Similarly there is a Redirect-> Artifact binding flow.

sp-init-5

1.6. Artifact->Artifact binding

Similarly there is an Artifact->Artifact binding flow.

sp-init-6.png
2. IdP Initiated SSO

The following are the IdP-initiated SSO flows which are based on binding (transport of SAML Request/ Response).

2.1. POST binding

In this use case the user has a security context on the Identity Provider and wishes to access a resource on a remote server (www.abc.com). The SAML assertion is transported to the Service Provider using the POST binding.

idp-init-1.png

  1. At some point the user will have been challenged to supply their credentials to the site http://www.xyz.com.
  2. The user successfully provides their credentials and has a security context with the Identity Provider.
  3. The user selects a menu option (or function) on the displayed screen indicating that the user wants to access a resource or application on another web site http://www.abc.com.
  4. The IdP's Single Sign-On Service sends a HTML form back to the browser. The HTML FORM contains a SAML response, within which is a SAML assertion. The SAML specifications mandate that the response must be digitally signed. Typically the HTML FORM will contain an input or submit action that will result in a HTTP POST.
  5. The browser, either due to a user action or via an "auto-submit", issues a HTTP POST containing the SAML response to be sent to the Service Provider's Assertion Consumer service.
  6. The Service Provider's Assertion Consumer validates the digital signature on the SAML Response. If this validates correctly, it sends a HTTP redirect to the browser causing it to access the TARGET resource, along with a cookie that identifies the local session. An access check is then made to establish whether the user has the correct authorization to access the http://www.abc.com web site and the TARGET resource. The TARGET resource is then returned to the browser.

2.2. Artifact binding

In this use case the user has a security context on the Identity Provider and wishes to access a resource on a remote server (www.abc.com). An artifact is provided to the Service Provider, which it can use (that is, "de-reference") to obtain the associated SAML response from the Identity Provider.

idp-init-2.png

  1. At some point the user will have been challenged to supply their credentials to the site http://www.xyz.com.
  2. The user successfully provides their credentials and has a security context with the Identity Provider.
  3. The user selects a menu option (or function) on the displayed screen that means the user wants to access a resource or application on a destination web site http://www.abc.com .
  4. The IdP generates an assertion for the user while also creating an artifact. The artifact contains the source ID of the http://www.xyz.com SAML responder together with a reference to the assertion (the AssertionHandle). The HTTP Artifact binding allows the choice of either HTTP redirection or a HTML form as the delivery mechanism to the Service Provider. The figure shows the use of the HTML form mechanism. The Inter-site Transfer Service sends a HTML form back to the browser. The HTML FORM contains the SAML artifact, the control name being SAMLart. Typically the HTML FORM will contain an input or submit action that will result in a HTTP POST.
  5. On receiving the HTTP message, the Assertion Consumer Service extracts the source-ID from the SAML artifact. A mapping between source IDs and remote Responders will already have been established administratively. The Assertion Consumer will therefore know that it has to contact the http://www.xyz.com SAML responder at the prescribed URL.
  6. The http://www.abc.com Assertion Consumer will send a SAML <ArtifactResolve> message to the Identity Provider’s SAML responder containing the artifact supplied by its Inter-site Transfer Service.
  7. The SAML responder supplies back a SAML <ArtifactResponse> message containing the assertion previously generated. In most implementations, if a valid assertion is received back, then a session on http://www.abc.com is established for the user (the relying party) at this point.
  8. Typically the Assertion Consumer then sends a redirection message containing a cookie back to the browser. The cookie identifies the session. The browser then processes the redirect message and issues a HTTP GET to the TARGET resource on http://www.abc.com. The GET message contains the cookie supplied back by the Assertion Consumer. An access check is then made to establish whether the user has the correct authorization to access the http://www.abc.com web site and the index.jsp resource.

3. ECP Profile

The Enhanced Client and Proxy (ECP) Profile supports several use cases, in particular:

  • Use of a proxy server, for example a WAP gateway in front of a mobile device which has limited functionality.
  • Clients where it is impossible to use redirects.
  • It is impossible for the Identity Provider and Service Provider to communicate directly (and hence the HTTP Artifact binding cannot be used).

3.1. PAOS binding

The ECP profile defines a single binding – PAOS (Reverse SOAP). The Profile uses both HTTP and SOAP headers & bodies to transport SAML <AuthnRequest> and SAML <Response> messages between the Service Provider and the Identity Provider.

ecp

  1. The ECP wishes to gain access to a resource on the Service Provider (www.abc.com). The ECP will issue a HTTP request for the resource. The HTTP request contains a PAOS HTTP header defining that the ECP service is to be used.
  2. Accessing the resource requires that the principal has a valid security context, and hence a SAML assertion needs to be supplied to the Service Provider. In the HTTP response to the ECP, an <AuthnRequest> is carried within a SOAP body. Additional information, using the PAOS binding, is provided back to the ECP.
  3. After some processing in the ECP the <AuthnRequest> is sent to the appropriate Identity Provider using the SAML SOAP binding.
  4. The Identity Provider validates the <AuthnRequest> and sends back to the ECP a SAML <Response>, again using the SAML SOAP binding.
  5. The ECP extracts the <Response> and forwards it to the Service Provider as a PAOS response.
  6. The Service Provider sends to the ECP a HTTP response containing the resource originally requested.

SSO

Single Sign-On (SSO) is a solution which lets users authenticate at one application and then use that same user session at many completely different applications without re-authenticating over and over again.

HTTP Cookies (RFC 6265) are a vital component in achieving SSO, as they track the user's session across different applications. Before going into detail on SSO, let's look at HTTP Cookies.

HTTP Cookies

  • An HTTP cookie is a name-value pair sent by a web-server (party-1) to a web-browser (party-2) to maintain state between those two parties.
  • Because of this state maintenance mechanism, cookies are used for:
    • Tracking a user's information and browsing patterns. For example: preserving cart items in a shopping website and displaying advertisements based on browsing history.
    • User authentication: to know whether the user is logged in or not, and which account they are logged in with.
  • HTTP Cookies (RFC 6265) defines some attributes associated with cookies. They are:

  • Name/ Value: Any US-ASCII characters, with some exceptions. For example: SessionID=123
  • Domain:
    • If specified, the domain and subdomains where this cookie can be sent to. For eg: if Domain=example.com, then the browser can send the cookie to example.com and host1.example.com as well.
    • If not specified, the domain will be derived from the request-domain (the actual hostname from the URL) and the browser only sends the cookie to that exact domain, not to its subdomains. For eg: from the earlier case, the browser can send the cookie only to the host that set it, i.e., to host1.example.com only.
  • Path: A given URL path where the cookie applies. For eg: / (i.e., root path)
  • Expires: GMT timestamp that says when the cookie expires. For eg: 2018-08-24T22:30:17.000Z. If not specified, the cookie is valid until the browser is closed.
  • Max-Age: The amount of time (in seconds) the cookie should be valid. For eg: 3600. Both Expires and Max-Age are used for session timeout and the context remains the same. If both are set, Max-Age takes precedence.
  • Secure: When used, the cookie can only be transferred through HTTPS; regular HTTP requests won't include this cookie.
  • HttpOnly: When used, the cookie isn't accessible via JavaScript, giving some protection against cross-site scripting (XSS) attacks.
  • SameSite: Prevents the browser from sending the cookie along with cross-site requests. Provides some protection against cross-site request forgery (CSRF) attacks. Possible values are:
    • Strict: Cookie won't be sent to the target site in any cross-domain browsing context.
    • Lax: Cookie will be sent along with GET requests initiated by a third-party website, but not with POST requests or requests originating from iframe, img, and script tags.

CSRF attacks are only possible because Cookies are always sent with any request made to the origin related to that Cookie. Due to the nature of a CSRF attack, a flag can be set against a Cookie, turning it into a same-site Cookie. A same-site Cookie is a Cookie which can only be sent if the request is being made from the same origin that is related to the Cookie. The Cookie and the page from where the request is being made are considered to have the same origin if the protocol, port (if applicable) and host are the same for both.

  • Creating a cookie with the same name (the value can be anything) and setting the Expires property to a date in the past will make that cookie be removed or cleared out from the browser, and the user will lose his/ her session. The sketch below shows these attributes in code.
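
A minimal sketch using the JDK's built-in java.net.HttpCookie to parse a hypothetical Set-Cookie header carrying the attributes above (note: HttpCookie does not model the newer SameSite attribute):

import java.net.HttpCookie;
import java.util.List;

public class CookieDemo {
    public static void main(String[] args) {
        // A hypothetical Set-Cookie header, as a web-server would send it.
        String header = "Set-Cookie: SessionID=123; Domain=example.com; Path=/; "
                      + "Max-Age=3600; Secure; HttpOnly";

        List<HttpCookie> cookies = HttpCookie.parse(header);
        for (HttpCookie c : cookies) {
            System.out.println(c.getName() + "=" + c.getValue()
                    + " domain=" + c.getDomain()
                    + " path=" + c.getPath()
                    + " maxAge=" + c.getMaxAge()
                    + " secure=" + c.getSecure()
                    + " httpOnly=" + c.isHttpOnly());
        }
    }
}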

How Cookies are related to SSO?

Cookies are what control and track the user session in the web-browser, and that is what makes SSO possible across all applications within the same domain. Refer to the sequence diagram below and its detailed explanation.

SSO.png
SSO Sequence Diagram

Note: All servers are hosted in the same domain example.com.

  1. User requests page1.html from host1.example.com (Application-1 & Web Agent-1).
  2. host1.example.com sees there is no valid Session Cookie.
  3. Hence it redirects the user to sso.example.com (SSO & Authentication Services) for authentication, with the return URL page1.html.
  4. Browser sends an authentication request to the Authentication service.
  5. Authentication service displays a login page for the user to enter their username/ password.
  6. User submits username/ password.
  7. Authentication service authenticates the submitted credentials.
  8. Upon successful authentication, the Authentication service sets the redirect URL to page1.html (from #3) and the SSO service sets the Session Cookie with the following attributes:

    Name/ Value: SessionID=123 – This Session Cookie is named SessionID with the value 123.
    Domain: example.com – This Session Cookie is valid for the domain example.com and all its subdomains, like sso.example.com, host1.example.com and host2.example.com.
    Path: / – This Session Cookie applies to the root path.
    Expires/ Max-Age: 2018-08-24T22:30:17.000Z – This Session Cookie will expire on August 24, 2018 at 10:30 PM, GMT.
    HttpOnly: HttpOnly – This Session Cookie is not accessible from JavaScript.
    Secure: Secure – This Session Cookie can be sent only over HTTPS connections.
    SameSite: Strict – This Session Cookie is not valid in any cross-domain browsing context.
  9. Browser redirects to page1.html at host1.example.com with the Session Cookie.
  10. host1.example.com asks the SSO service to validate the received Session Cookie.
  11. The SSO service validates it and returns a response to host1.example.com.
  12. Finally host1.example.com allows the user to access page1.html.
  13. After some time, in the same browser session, the same user with a valid Session Cookie requests access to page2.html at host2.example.com (Application-2 & Web Agent-2).
  14. host2.example.com requests the SSO service to validate the received Session Cookie.
  15. The SSO service validates it and returns a response to host2.example.com.
  16. Finally host2.example.com allows the user to access page2.html without asking them to re-login at sso.example.com.
    • This is because the Session Cookie is bound and valid for the domain example.com, and hence the SSO & Authentication Services did not authenticate the user once again.

Conclusion

  • So far we saw that SSO works fine across applications when their domain name is the same (i.e., sso.example.com, host1.example.com and host2.example.com are all based on the same domain example.com).
  • But what happens if applications run on different domains? For example: sso.example.com, host1.example1.com, host2.example2.com. How do we achieve SSO for these applications?
    • The answer is Cross Domain-Single Sign-On (CD-SSO).
  • We'll discuss CD-SSO in the next blog post. This post gives the basic concepts of SSO; the next post looks at the more complex CD-SSO use-cases.

LDAP

Before going into the Lightweight Directory Access Protocol (LDAP), we first have to look at what Directory Services and the X500 Model are.

Directory Service

  • Directory Service is a concept characterized as a hierarchical tree-based naming system that offers numerous distributed-system advantages:
    • Unique names can be determined by concatenating hierarchical naming components starting at a root node (e.g., "com").
    • The object-oriented schema and data model support very fast search operations (using key-based querying).
    • Due to named sub-tree partitions, features like Distribution, Replication and Delegated or Autonomous Administration are supported.
    • Supports Authentication and Fine-Grained Access Control mechanisms.
  • A few examples of directories are DNS, NOS, Novell's NDS, and Microsoft's Active Directory.
  • Directories are designed and optimized for READ/ SEARCH operations, rather than WRITE/ UPDATE operations.
  • As a real world example, a telephone directory is a directory system that contains a list of subscribers with an address and a phone number, in a dictionary format rather than in a tree-based fashion.

phonebook.jpeg
Telephone Directory

X500 Model

  • X.500 is a series of computer networking naming standards and models for Directory Service.
  • The primary concept of X.500 is that there is a single Directory Information Tree (DIT), a hierarchical organization of Entries and these can be called using:
    • Distinguished Name (DN): the unique name of an entry. Much like a "path" to a filename in a file system.
      • For eg: cn=alice, ou=People, dc=example.com
    • Relative Distinguished Name (RDN): a component of the distinguished name. Much like a "filename" in a file system.
      • For eg: cn=alice, ou=People is an RDN relative to the root RDN dc=example.com
    • dit.png
      Directory Information Tree (DIT)

      • DNs are also called naming contexts, because they describe the namespace where the entry lives.
  • The X500 Model suggests how a directory service can be designed in a DIT, providing a database from which to fetch information for a given entry.
  • Directory Access Protocol (DAP) is one of the protocols defined in X.500 that specifies how to access the entries in the DIT (originally over the OSI protocol stack rather than TCP/IP).

LDAP

  • The Lightweight Directory Access Protocol (LDAP) is an open, vendor-neutral, industry standard Application Layer protocol interface for accessing and maintaining distributed directory information services over an Internet Protocol (IP) network.
  • The latest specification is Version 3, published as RFC 4511.
  • LDAP is based on:
    • A simpler subset of the standards within the X.500 model; hence LDAP is sometimes called X.500-lite.
    • DAP, for accessing an X500-model directory; hence it is called Lightweight DAP.

fromX500toLDAP.png

Naming Model

  • An Attribute is defined as a piece of information that directory entries contain.
  • Attribute is similar to DB column name.
  • The most widely used naming attributes are defined in the following table:
Attr. Abbreviations Explanations Examples
dc domainComponent An element of a DNS domain name dc=acme,dc=com
uid userid A person’s account name uid=jdoe
cn commonName Full name of a person, group, device, etc. cn=John Doe
l localityName Name of a geographic region l=Bay Area
st stateOrProvinceName State or Province st=CA
o organizationName Organization name o=Acme
ou organizationalUnitName Organization unit name ou=Sales
c countryName Two letter country code c=US

Schema Model

  • An LDAP schema is a set of rules that define what can be stored as entries in an LDAP directory.
  • We have a similar concept in RDBMS, where the DB schema contains information about the database structure, tables, columns, data types and constraints.
  • The schema model consists of two types of elements:

1. Object Classes:

  • Defined as a placeholder for attributes.
  • Similar to DB tables.
  • Object classes come in one of three kinds:
    • Abstract:
      • Defines an attribute or set of attributes that all object classes in an object class structure inherit.
      • Every object class structure must have an abstract object class as the top-level object class.
      • There are only two abstract classes:
        1. The top class is present in every entry, and it requires the objectClass attribute be present.
        2. The alias class requires the aliasedObjectName attribute be present.
    • Structural:
      • Defines an object entry type.
      • Every entry must contain at least one structural object class.
      • A structural object class inherits either from top or from another structural object class.
      • For example: object classes such as organization, person, organizationalPerson or device.
    • Auxiliary:
      • An auxiliary object class adds attributes to another object class.
      • Useful to define a set of attributes used by multiple object classes.
      • For example: the object class strongAuthenticationUser allows the attribute userCertificate;binary to be present, but this class could be used in an entry with object class person, device or some other structural class.

ldap
Object Classes in a Class Diagram

2. Attribute Types:

  • Defines the data types of attribute values.
  • Similar to a DB column's data types.
  • An Attribute Type Definition specifies:
    • Syntax: Defines the data format in which an attribute value is stored. For example: Directory String, Integer, and JPEG are examples of standard LDAP syntaxes.
    • Matching Rules: A matching rule encapsulates a set of logic that may be used to perform some kind of matching operation against two LDAP values. For example: Comparison, Sort and Order.

Protocol Overview

  • A client starts an LDAP session by connecting to an LDAP server, called a Directory System Agent (DSA), by default on TCP and UDP port 389, or on port 636 for LDAPS.
  • The client may request the following LDAP operations (a usage sketch in Java follows the table):
Operations Explanations
StartTLS Use the LDAPv3-TLS connection
Bind Authenticate and specify LDAP protocol version
Search Search for and/or retrieve directory entries
Compare Test if a named entry contains a given attribute value
Add Add a new entry
Delete Delete an entry
Modify Modify an entry
Modify Distinguished Name (DN) Move or rename an entry
Abandon Abort a previous request
Extended Operation Generic operation used to define other operations
Unbind Close and exit the connection
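
A minimal sketch of the Bind, Search and Unbind operations using the JDK's built-in JNDI LDAP provider; the host, bind DN and password are assumptions for illustration:

import java.util.Hashtable;
import javax.naming.Context;
import javax.naming.NamingEnumeration;
import javax.naming.directory.DirContext;
import javax.naming.directory.InitialDirContext;
import javax.naming.directory.SearchControls;
import javax.naming.directory.SearchResult;

public class LdapSearchDemo {
    public static void main(String[] args) throws Exception {
        Hashtable<String, String> env = new Hashtable<>();
        env.put(Context.INITIAL_CONTEXT_FACTORY, "com.sun.jndi.ldap.LdapCtxFactory");
        env.put(Context.PROVIDER_URL, "ldap://ldap.example.com:389"); // assumed host/port
        env.put(Context.SECURITY_AUTHENTICATION, "simple");
        env.put(Context.SECURITY_PRINCIPAL, "cn=admin,dc=example,dc=com"); // assumed bind DN
        env.put(Context.SECURITY_CREDENTIALS, "secret");                   // assumed password

        DirContext ctx = new InitialDirContext(env); // connect + Bind
        try {
            SearchControls controls = new SearchControls();
            controls.setSearchScope(SearchControls.SUBTREE_SCOPE);

            // Search operation: find John Doe's entry under the People subtree.
            NamingEnumeration<SearchResult> results =
                    ctx.search("ou=People,dc=example,dc=com", "(cn=John Doe)", controls);
            while (results.hasMore()) {
                System.out.println(results.next().getNameInNamespace());
            }
        } finally {
            ctx.close(); // Unbind
        }
    }
}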

LDIF

  • LDAP Data Interchange Format (LDIF) is a standard text-based representation for LDAP data.
  • LDIF conveys directory content as a set of records, one record for each entry.
  • It also represents update requests, such as Add, Modify, Delete, and Rename, as a set of records, one record for each update request.
  • This is an example of a simple directory entry with several attributes, represented as a record in LDIF:
dn: cn=John Doe,dc=example,dc=com
cn: John Doe
givenName: John
sn: Doe
telephoneNumber: +1 800 800 8888
telephoneNumber: +1 800 800 9999
mail: jdoe@example.com
manager: cn=Eliza Beth,dc=example,dc=com
objectClass: inetOrgPerson
objectClass: organizationalPerson
objectClass: person
objectClass: top

Other Useful Notes

  • In LDAP 3.0, there is a special root entry called the rootDSE (where DSE stands for "DSA-specific entry"), defined at the root of the DIT on the server.
    • The purpose of the rootDSE is to provide data about the directory server itself.
    • i.e., the rootDSE is not part of any namespace, but it contains namingContexts, which provides a list of the naming-context DNs held in the DIT.
  • Directory servers publish their internal schema as an entry in the directory.
    • It can be retrieved by LDAP clients performing a baseObject search on a special entry that is defined by the directory server to publish schema information (e.g., cn=schema), with the attributes attributeTypes and objectClasses specified as part of the search criteria.
  • Each LDAP directory has some default schema, which developers can customize, or "extend," by adding elements to it.

Difference between DOM and SAX

  • Both are XML parsers that process an XML document, breaking the text up into small identifiable pieces (elements) which are finally mapped to objects for the application to process.
  • Document Object Model (DOM) processes the entire document and stores the objects in a tree structure that can be manipulated.
  • Simple API for XML (SAX) processes the document as it is being read (like a stream), generating events based on tags; the events are handled by an event handler.

DOM: A tree-based processing

DOM processes the XML document and loads the data into memory in a tree-like structure. Consider the following XML code snippet:

<?xml version="1.0"?>
<users>
  <user ID="1">
    <fname>John</fname>
    <lname>Doe</lname>
    <email>jdoe@force.com</email>
  </user>
</users>

A DOM processor analyzing this code snippet would generate the following tree structure in the memory.

dom.png
In-memory DOM Tree Structure
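
A minimal sketch of parsing the snippet above with the JDK's built-in DOM parser; once parsed, the in-memory tree can be walked (and modified) in any order:

import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

public class DomDemo {
    public static void main(String[] args) throws Exception {
        String xml = "<?xml version=\"1.0\"?><users><user ID=\"1\">"
                   + "<fname>John</fname><lname>Doe</lname>"
                   + "<email>jdoe@force.com</email></user></users>";

        // Parse the whole document into an in-memory tree.
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));

        // Random-access navigation of the tree.
        Element user = (Element) doc.getElementsByTagName("user").item(0);
        System.out.println("ID    = " + user.getAttribute("ID"));
        System.out.println("fname = " + user.getElementsByTagName("fname").item(0).getTextContent());
        System.out.println("email = " + user.getElementsByTagName("email").item(0).getTextContent());
    }
}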

SAX: An event-based processing

SAX analyzes an XML stream as it goes by. The above example document generates the following events:

  1. Start document
  2. Start element (users)
  3. Characters (white space)
  4. Start element (user) with Attribute (ID=”1″)
  5. Characters (white space)
  6. Start element (fname)
  7. Characters (John)
  8. End element (fname)
  9. Characters (white space)
  10. Start element (lname)
  11. Characters (Doe)
  12. End element (lname)
  13. Characters (white space)
  14. Start element (email)
  15. Characters (jdoe@force.com)
  16. End element (email)
  17. Characters (white space)
  18. End element (user)
  19. End element (users)

The SAX API allows a developer to capture these events and act on them.
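
A minimal sketch using the JDK's built-in SAX parser on the same snippet; the handler callbacks fire in the same order as the event list above:

import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

public class SaxDemo {
    public static void main(String[] args) throws Exception {
        String xml = "<?xml version=\"1.0\"?><users><user ID=\"1\">"
                   + "<fname>John</fname><lname>Doe</lname>"
                   + "<email>jdoe@force.com</email></user></users>";

        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes attrs) {
                System.out.println("Start element (" + qName + ")"
                        + (attrs.getLength() > 0
                           ? " with Attribute (" + attrs.getQName(0)
                             + "=\"" + attrs.getValue(0) + "\")"
                           : ""));
            }
            @Override
            public void endElement(String uri, String localName, String qName) {
                System.out.println("End element (" + qName + ")");
            }
            @Override
            public void characters(char[] ch, int start, int length) {
                String text = new String(ch, start, length).trim();
                if (!text.isEmpty()) System.out.println("Characters (" + text + ")");
            }
        };

        SAXParserFactory.newInstance().newSAXParser()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)), handler);
    }
}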

Pros and Cons

DOM

  • The tree is persistent in memory; it can be modified so an application can make changes to the data and the structure. It can also work its way up and down the tree at any time.
  • DOM can also be much simpler to use.
  • On the other hand, a lot of overhead is involved in building these trees in memory. It’s not unusual for large files to completely overrun a system’s capacity.
  • In addition, creating a DOM tree can be a very slow process.

SAX

  • Analysis can get started immediately, rather than waiting for all of the data to be processed, hence fast processing.
  • Application is simply examining the data as it goes by, it doesn’t need to store it in memory, hence cost less resource.
  • Application doesn’t even have to parse the entire document; it can stop when certain criteria have been satisfied, hence efficient processing.
  • On the other hand, the application is not persisting the data in any way, it is impossible to make changes to it using SAX, or to move backwards in the data stream.
DOM: slow processing; costs more resources; inefficient for large documents; persistent (the in-memory tree can be revisited and modified).
SAX: fast processing; costs fewer resources; efficient; non-persistent (events are not stored anywhere).

When to choose DOM and SAX?

Depending on following factors, we can choose DOM or SAX,

  1. Application purpose
    • If the application needs to refer back to the processed data, make changes to the data, and output it as XML, then DOM is the choice. SAX can still be used, but the process is complex, as the application has to make changes to a copy of the data rather than to the original data itself.
  2. XML data size
    • For large files, SAX is a better choice, since it processes the XML data as a stream.
  3. Need for speed
    • SAX implementations are normally faster than DOM implementations.

Also note that SAX and DOM are two implementations of XML parsing, so we can use DOM to create a stream of SAX events, and SAX to create a DOM tree.

 
