CDOSS Certificate

BIG DATA ADMIN MASTERY

Profiles suited to preparing for this certification: Data Engineer, Big Data Consultant.

Knowledge required to pass this certification:

  1. Kerberos Fundamentals
  • KDC Setup: krb5-kdc, kdb5_util create -s, realm configuration (/etc/krb5.conf)
  • Admin Tools: kadmin.local, addprinc, ACL management (kadm5.acl)
  • HA KDC: Database replication (kdb5_util dump/load), multi-KDC config
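
A minimal sketch of the realm configuration step, assuming a hypothetical realm EXAMPLE.COM and KDC host kdc1.example.com (neither name comes from this outline). The function only writes a krb5.conf-style file; the database and admin commands from the bullets above appear as comments because they need a real KDC host.

```shell
#!/bin/sh
# Write a minimal realm configuration to the given file.
# realm and kdc_host are placeholders for site-specific values.
write_krb5_conf() {
    realm="$1"; kdc_host="$2"; out="$3"
    cat > "$out" <<EOF
[libdefaults]
    default_realm = $realm

[realms]
    $realm = {
        kdc = $kdc_host
        admin_server = $kdc_host
    }
EOF
}

# Demo: write to a temporary file instead of /etc/krb5.conf.
conf_file=$(mktemp)
write_krb5_conf EXAMPLE.COM kdc1.example.com "$conf_file"
cat "$conf_file"

# On a real KDC host (as root), the remaining steps would be roughly:
#   kdb5_util create -s                      # create the database with a stash file
#   kadmin.local -q "addprinc admin/admin"   # create an admin principal
# and a kadm5.acl line such as
#   */admin@EXAMPLE.COM *
# would give every */admin principal full administrative rights.
```

Writing to a temporary file first makes it possible to review the result before copying it over /etc/krb5.conf.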
  2. Time Synchronization
  • Chrony/NTP: Server/client configurations (allow subnet directives), iburst for rapid sync
  • Time Drift Impact: clock skew beyond the Kerberos tolerance (5 minutes by default) causes authentication failures
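
The two chrony directives and the skew limit can be sketched as small helpers. The host name ntp1.example.com and the subnet 192.168.1.0/24 are hypothetical; the 300-second limit is the MIT Kerberos default clock skew, and a site may configure a different value.

```shell
#!/bin/sh
# Print a chrony server fragment that lets a client subnet sync from this host.
chrony_server_conf() {
    printf 'allow %s\n' "$1"
}

# Print a chrony client fragment; iburst speeds up the initial synchronization.
chrony_client_conf() {
    printf 'server %s iburst\n' "$1"
}

# Succeed if a clock offset (whole seconds, may be negative) is within tolerance.
# The second argument overrides the assumed 300-second default.
skew_ok() {
    offset="$1"; max="${2:-300}"
    if [ "$offset" -lt 0 ]; then
        offset=$(( 0 - offset ))
    fi
    [ "$offset" -le "$max" ]
}

chrony_server_conf 192.168.1.0/24
chrony_client_conf ntp1.example.com
if skew_ok 120; then echo "120s offset: within tolerance"; fi
if ! skew_ok -400; then echo "-400s offset: Kerberos authentication would likely fail"; fi
```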
  3. Cloudera Manager Security
  • Kerberos Enablement:
    • Wizard steps (realm, KDC server, encryption types: aes128-cts, arcfour-hmac)
    • KDC admin principal credentials (e.g., admin/admin)
  • Service Restarts: Cluster relaunch post-Kerberization
  4. HDFS Security & HA
  • NameNode HA:
    • Failover testing (kill active NN → verify standby takeover in the NameNode web UI at master:9870)
  • Kerberized Operations:
    • Directory creation: sudo -u hdfs kinit -kt hdfs.keytab <hdfs principal> → hdfs dfs -mkdir /path
    • Ownership: hdfs dfs -chown user:group /path
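
The Kerberized directory workflow above could be wrapped as follows. This is a sketch under assumptions: the principal hdfs/master@EXAMPLE.COM, the path /data/raw and the owner alice:analysts are hypothetical, and the run helper is an illustration device that prints commands when DRY_RUN=1 so the script can be reviewed on a machine without Hadoop.

```shell
#!/bin/sh
# Echo the command when DRY_RUN=1, otherwise execute it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Authenticate as the hdfs superuser, create a directory, and hand it to owner:group.
# keytab and principal are placeholders for site-specific values.
hdfs_mkdir_owned() {
    keytab="$1"; principal="$2"; path="$3"; owner="$4"
    run sudo -u hdfs kinit -kt "$keytab" "$principal" &&
    run sudo -u hdfs hdfs dfs -mkdir -p "$path" &&
    run sudo -u hdfs hdfs dfs -chown "$owner" "$path"
}

# Dry run: only prints the three commands. Set DRY_RUN=0 on a cluster node to execute.
DRY_RUN=1
hdfs_mkdir_owned hdfs.keytab hdfs/master@EXAMPLE.COM /data/raw alice:analysts

# For the failover test, the state of a NameNode can also be checked from the CLI:
#   hdfs haadmin -getServiceState nn1    # nn1 is a placeholder NameNode ID
```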
  5. Hive Authorization
  • Impersonation: hive.server2.enable.doAs=true (queries run as the connecting user rather than as the hive service user)
  • RBAC:
    • Enable: hive.security.authorization.enabled=true
    • Admin role: hive.users.in.admin.role=hive
    • Privileges: GRANT/REVOKE SELECT ON TABLE …
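
As an illustration of the RBAC settings and privilege statements above, the helpers below print hive-site.xml property elements and GRANT/REVOKE statements. The table name sales and the user alice are hypothetical, and on a Cloudera Manager cluster the properties would normally be set through the CM configuration pages rather than by editing the file by hand.

```shell
#!/bin/sh
# Print one Hadoop-style <property> element for hive-site.xml.
hive_property() {
    printf '<property>\n  <name>%s</name>\n  <value>%s</value>\n</property>\n' "$1" "$2"
}

# Print a GRANT or REVOKE statement for a single user on a table.
hive_privilege_sql() {
    action="$1"; privilege="$2"; table="$3"; user="$4"
    case "$action" in
        GRANT)
            printf 'GRANT %s ON TABLE %s TO USER %s;\n' "$privilege" "$table" "$user" ;;
        REVOKE)
            printf 'REVOKE %s ON TABLE %s FROM USER %s;\n' "$privilege" "$table" "$user" ;;
        *)
            echo "unknown action: $action" >&2
            return 1 ;;
    esac
}

hive_property hive.security.authorization.enabled true
hive_property hive.users.in.admin.role hive
hive_privilege_sql GRANT SELECT sales alice
hive_privilege_sql REVOKE SELECT sales alice
```

The printed statements can be pasted into a Beeline session opened by a user who holds the admin role.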
  6. HBase ACLs
  • Coprocessors: AccessController for master/regionserver
  • Permission Letters:
    • R (Read), W (Write), X (Execute), C (Create), A (Admin)
  • Commands: grant, revoke, user_permission
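
A small sketch of how a grant command could be assembled and checked before it reaches the HBase shell. The user alice and the table sales are hypothetical. The validator accepts the letters R, W, C and A listed above plus X (Execute), which HBase also recognizes.

```shell
#!/bin/sh
# Print an HBase shell grant command after validating the permission letters.
# Returns 1 if the permission string is empty or contains a letter outside RWXCA.
hbase_grant_cmd() {
    user="$1"; perms="$2"; table="$3"
    case "$perms" in
        ''|*[!RWXCA]*)
            echo "invalid permission string: $perms" >&2
            return 1 ;;
    esac
    printf "grant '%s', '%s', '%s'\n" "$user" "$perms" "$table"
}

hbase_grant_cmd alice RW sales

# On a cluster node the output could be piped into a non-interactive shell, for example:
#   hbase_grant_cmd alice RW sales | hbase shell -n
# and checked afterwards with:  user_permission 'sales'
```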
  7. YARN Queue Management
  • Dynamic Queues:
    • Placement rules (e.g., route user → root.system.user)
    • Auto-queue creation
  • Legacy vs. Modern:
    • capacity-scheduler.xml (old) vs. CM Queue Manager UI + ZK (new)
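
The idea behind a user-based placement rule can be illustrated with a tiny function that derives a queue path from a user name. This is only a model of the naming scheme, not how the scheduler is implemented; the parent queue root.system, the fallback root.default and the user alice are assumptions made for the example.

```shell
#!/bin/sh
# Map a user name to a leaf queue under a parent queue, mimicking a "user" placement rule.
# Falls back to a default queue when the user name is empty.
resolve_queue() {
    parent="$1"; user="$2"; fallback="${3:-root.default}"
    if [ -z "$user" ]; then
        echo "$fallback"
    else
        echo "$parent.$user"
    fi
}

resolve_queue root.system alice
resolve_queue root.system ""
```

With auto-queue creation enabled, a leaf such as root.system.alice would be created on first submission instead of having to exist in advance.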
  8. User Provisioning
  • Linux: adduser, usermod -aG group
  • Kerberos: kadmin.local addprinc user@REALM
  • HDFS Home Dirs: hdfs dfs -mkdir /user/<name>, then hdfs dfs -chown <name>:<group> /user/<name>
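
The three provisioning layers can be chained in one script. The sketch below assumes a hypothetical user alice, group analysts and realm EXAMPLE.COM, assumes the hdfs user already holds a valid Kerberos ticket, and uses an illustrative run helper that prints commands when DRY_RUN=1 instead of executing them.

```shell
#!/bin/sh
# Echo the command when DRY_RUN=1, otherwise execute it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Create the Linux account, the Kerberos principal, and the HDFS home directory.
provision_user() {
    user="$1"; group="$2"; realm="$3"
    run adduser "$user" &&
    run usermod -aG "$group" "$user" &&
    run kadmin.local -q "addprinc $user@$realm" &&
    run sudo -u hdfs hdfs dfs -mkdir -p "/user/$user" &&
    run sudo -u hdfs hdfs dfs -chown "$user:$group" "/user/$user"
}

# Dry run: only prints the commands. Set DRY_RUN=0 on a cluster node (as root) to execute.
# Note that the dry-run output drops the quotes around the kadmin.local query.
DRY_RUN=1
provision_user alice analysts EXAMPLE.COM
```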