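Concepts

A Custom Resource Definition (CRD) is a Kubernetes extension mechanism that lets users define their own resource types. With a CRD, the Kubernetes API can serve new resource types without any change to the Kubernetes core code.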

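Core concepts

  1. Custom Resource: an object of a user-defined resource type, stored and served by the Kubernetes API
  2. Custom Resource Definition: declares a custom resource type, including its names, scope, versions, and schema
  3. Custom Controller: a controller that watches custom resources and acts on their changes
  4. Operator pattern: combines CRDs with custom controllers to automate application-specific operations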

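How a CRD works

  1. Define: describe the new resource type in a CRD
  2. Register: the Kubernetes API Server registers the type and starts serving its API endpoints
  3. Use: users work with custom resources the same way as with built-in resources
  4. Control: a custom controller watches the resources and runs the business logic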

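Key features

  1. API extension: extends the Kubernetes API with new resource types
  2. Declarative management: custom resources are configured declaratively
  3. Versioning: a CRD can serve several versions of a resource and migrate between them
  4. Validation: custom resources are validated against an OpenAPI v3 schema
  5. Subresources: supports the /status and /scale subresources
  6. Conversion: objects are converted between versions, either by rewriting only apiVersion (strategy None) or through a conversion webhook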
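
As a rough orientation, the sketch below shows where each of these features lives in a CRD manifest. The group example.com, the Widget kind, and the field names are placeholders chosen for illustration, not part of the examples that follow.

```yaml
# Hypothetical CRD: group, kind, and field names are placeholders.
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  name: widgets.example.com          # must be <plural>.<group>
spec:
  group: example.com                 # API extension: served under /apis/example.com
  scope: Namespaced
  names:
    plural: widgets
    singular: widget
    kind: Widget
  conversion:
    strategy: None                   # conversion: only apiVersion is rewritten
  versions:                          # versioning: one entry per version
  - name: v1
    served: true
    storage: true                    # exactly one version is the storage version
    schema:
      openAPIV3Schema:               # validation: OpenAPI v3 schema
        type: object
        properties:
          spec:
            type: object
            properties:
              replicas:
                type: integer
                minimum: 0
          status:
            type: object
            properties:
              replicas:
                type: integer
    subresources:                    # subresources: /status and /scale
      status: {}
      scale:
        specReplicasPath: .spec.replicas
        statusReplicasPath: .status.replicas
```

Declarative management needs no dedicated field: once the CRD is registered, Widget objects are applied and edited like any other manifest.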

Hands-on tutorial

Creating a simple CRD

# Define a simple CRD
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  name: crontabs.stable.example.com
spec:
  group: stable.example.com
  versions:
  - name: v1
    served: true
    storage: true
    schema:
      openAPIV3Schema:
        type: object
        properties:
          spec:
            type: object
            properties:
              cronSpec:
                type: string
              image:
                type: string
              replicas:
                type: integer
  scope: Namespaced
  names:
    plural: crontabs
    singular: crontab
    kind: CronTab
    shortNames:
    - ct

Apply the CRD:

kubectl apply -f crontab-crd.yaml

Creating a custom resource instance

# Create a custom resource instance
apiVersion: stable.example.com/v1
kind: CronTab
metadata:
  name: my-new-cron-object
spec:
  cronSpec: "* * * * */5"
  image: my-awesome-cron-image
  replicas: 3

Apply the custom resource:

kubectl apply -f my-crontab.yaml
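
To see the schema at work, it can help to apply objects that do not match it. The two variants below are a sketch against the CronTab CRD defined earlier; the object names are hypothetical.

```yaml
# Rejected: replicas is declared as an integer in the CRD schema,
# so a string value fails validation.
apiVersion: stable.example.com/v1
kind: CronTab
metadata:
  name: invalid-cron-object      # hypothetical name
spec:
  cronSpec: "* * * * */5"
  image: my-awesome-cron-image
  replicas: "three"
---
# someRandomField is not declared in the schema.
apiVersion: stable.example.com/v1
kind: CronTab
metadata:
  name: pruned-cron-object       # hypothetical name
spec:
  cronSpec: "* * * * */5"
  image: my-awesome-cron-image
  someRandomField: 19
```

How the second object is handled depends on field validation. With validation turned off (kubectl apply --validate=false), the API server stores the object and silently prunes the undeclared field. Recent kubectl versions validate strictly by default and are expected to reject it instead.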

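Managing CRDs and custom resources

# List CRDs
kubectl get crd

# List custom resources
kubectl get crontabs

# Show details of a custom resource
kubectl describe crontab my-new-cron-object

# Edit a custom resource
kubectl edit crontab my-new-cron-object

# Delete a custom resource
kubectl delete crontab my-new-cron-object

# Delete the CRD (this also deletes every custom resource of that type)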
kubectl delete crd crontabs.stable.example.com

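Real-world example

Example: a Database-as-a-Service Operator

A cloud provider wants to offer Database-as-a-Service to its users and automates database management with a CRD and a custom controller:

# Database CRD definition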
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  name: databases.database.example.com
spec:
  group: database.example.com
  versions:
  - name: v1
    served: true
    storage: true
    schema:
      openAPIV3Schema:
        type: object
        properties:
          spec:
            type: object
            properties:
              engine:
                type: string
                enum:
                - mysql
                - postgresql
                - mongodb
              version:
                type: string
              storageSize:
                type: string
                pattern: '^[0-9]+(Gi|Mi)$'
              replicas:
                type: integer
                minimum: 1
                maximum: 10
              backupEnabled:
                type: boolean
              backupSchedule:
                type: string
              resources:
                type: object
                properties:
                  requests:
                    type: object
                    properties:
                      cpu:
                        type: string
                      memory:
                        type: string
                  limits:
                    type: object
                    properties:
                      cpu:
                        type: string
                      memory:
                        type: string
            required:
            - engine
            - version
            - storageSize
          status:
            type: object
            properties:
              phase:
                type: string
              conditions:
                type: array
                items:
                  type: object
                  properties:
                    type:
                      type: string
                    status:
                      type: string
                    reason:
                      type: string
                    message:
                      type: string
              backupStatus:
                type: object
                properties:
                  lastBackupTime:
                    type: string
                    format: date-time
                  nextBackupTime:
                    type: string
                    format: date-time
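    # Enable /status so the operator can update status separately from spec
    subresources:
      status: {}
    # In apiextensions.k8s.io/v1, printer columns are set per version
    additionalPrinterColumns:
    - name: Engine
      type: string
      jsonPath: .spec.engine
    - name: Version
      type: string
      jsonPath: .spec.version
    - name: Status
      type: string
      jsonPath: .status.phase
    - name: Age
      type: date
      jsonPath: .metadata.creationTimestamp
  scope: Namespaced
  names:
    plural: databases
    singular: database
    kind: Database
    shortNames:
    - db
    - dbs
---
# Example Database instance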
apiVersion: database.example.com/v1
kind: Database
metadata:
  name: my-production-db
  namespace: production
spec:
  engine: postgresql
  version: "13.3"
  storageSize: 100Gi
  replicas: 3
  backupEnabled: true
  backupSchedule: "0 2 * * *"
  resources:
    requests:
      cpu: "1"
      memory: 2Gi
    limits:
      cpu: "2"
      memory: 4Gi
---
# Database controller Deployment
apiVersion: apps/v1
kind: Deployment
metadata:
  name: database-operator
  namespace: database-system
spec:
  replicas: 1
  selector:
    matchLabels:
      app: database-operator
  template:
    metadata:
      labels:
        app: database-operator
    spec:
      serviceAccountName: database-operator
      containers:
      - name: operator
        image: database/operator:v1.0.0
        env:
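        # Set to the operator's own namespace (database-system). The Role below is
        # also limited to that namespace, so managing the Database in "production"
        # above needs a watch scope and RBAC that cover production as well.
        - name: WATCH_NAMESPACE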
          valueFrom:
            fieldRef:
              fieldPath: metadata.namespace
        - name: POD_NAME
          valueFrom:
            fieldRef:
              fieldPath: metadata.name
        - name: OPERATOR_NAME
          value: "database-operator"
        resources:
          limits:
            cpu: 200m
            memory: 100Mi
          requests:
            cpu: 100m
            memory: 50Mi
---
# Operator RBAC permissions
apiVersion: v1
kind: ServiceAccount
metadata:
  name: database-operator
  namespace: database-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: database-operator
  namespace: database-system
rules:
- apiGroups:
  - database.example.com
  resources:
  - databases
  - databases/status
  verbs:
  - get
  - list
  - watch
  - create
  - update
  - patch
  - delete
- apiGroups:
  - ""
  resources:
  - pods
  - services
  - persistentvolumeclaims
  - configmaps
  - secrets
  verbs:
  - get
  - list
  - watch
  - create
  - update
  - patch
  - delete
- apiGroups:
  - apps
  resources:
  - deployments
  - statefulsets
  verbs:
  - get
  - list
  - watch
  - create
  - update
  - patch
  - delete
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: database-operator
  namespace: database-system
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: database-operator
subjects:
- kind: ServiceAccount
  name: database-operator
  namespace: database-system
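
The Role above only applies inside database-system, while the example Database lives in production. One way to close that gap is sketched below: a Role and RoleBinding in production that grant the operator's ServiceAccount access there. The name database-operator-production is hypothetical, and the sketch assumes the operator is also configured to watch that namespace.

```yaml
# Sketch: lets the operator's ServiceAccount work with Database objects
# in the "production" namespace. The -production names are hypothetical.
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: database-operator-production
  namespace: production
rules:
- apiGroups:
  - database.example.com
  resources:
  - databases
  verbs:
  - get
  - list
  - watch
- apiGroups:
  - database.example.com
  resources:
  - databases/status
  verbs:
  - get
  - update
  - patch
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: database-operator-production
  namespace: production
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: database-operator-production
subjects:
- kind: ServiceAccount
  name: database-operator
  namespace: database-system
```

A real operator would also need rules in production for the workloads it creates there (StatefulSets, Services, PersistentVolumeClaims, and so on), mirroring the rules in the Role above.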

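Advantages of this Database-as-a-Service Operator:

  • Self-service: users request a database with a short YAML file
  • Automated management: the Operator handles creation, configuration, backups, and similar tasks
  • Standardized configuration: the CRD defines a fixed set of database options
  • Status monitoring: the Operator tracks database state and reports it through the status field
  • Extensibility: supports multiple database engines and versions
  • Backup and recovery: the Operator takes care of backup and restore operations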
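
To make the status feedback concrete, here is a sketch of what the Database object might look like once an operator has reconciled it. The field names come from the CRD schema above. Every value under status is invented for illustration, including the condition type and reason.

```yaml
# Illustrative only: all values under status are made up for this example.
apiVersion: database.example.com/v1
kind: Database
metadata:
  name: my-production-db
  namespace: production
spec:
  engine: postgresql
  version: "13.3"
  storageSize: 100Gi
status:
  phase: Running
  conditions:
  - type: Ready
    status: "True"
    reason: AllReplicasAvailable
    message: 3 of 3 replicas are available
  backupStatus:
    lastBackupTime: "2024-01-15T02:00:00Z"
    nextBackupTime: "2024-01-16T02:00:00Z"
```

Users own spec and the operator owns status. With the printer columns from the CRD, kubectl get databases would show the phase value in its Status column.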

Configuration details

Advanced CRD configuration

apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  name: applications.app.example.com
spec:
  group: app.example.com
  versions:
  - name: v1
    served: true
    storage: true
    # OpenAPI v3 schema validation
    schema:
      openAPIV3Schema:
        type: object
        properties:
          spec:
            type: object
            properties:
              appName:
                type: string
                minLength: 3
                maxLength: 50
              version:
                type: string
                pattern: '^v[0-9]+\.[0-9]+\.[0-9]+$'
              environment:
                type: string
                enum:
                - development
                - testing
                - staging
                - production
              config:
                type: object
                additionalProperties:
                  type: string
              components:
                type: array
                items:
                  type: object
                  properties:
                    name:
                      type: string
                    image:
                      type: string
                    ports:
                      type: array
                      items:
                        type: integer
            required:
            - appName
            - version
            - environment
          status:
            type: object
            properties:
              phase:
                type: string
                enum:
                - Pending
                - Running
                - Failed
                - Completed
              conditions:
                type: array
                items:
                  type: object
                  properties:
                    type:
                      type: string
                    status:
                      type: string
                      enum:
                      - "True"
                      - "False"
                      - "Unknown"
                    lastTransitionTime:
                      type: string
                      format: date-time
                    reason:
                      type: string
                    message:
                      type: string
    # Subresources (the scale paths must refer to fields declared in the schema)
    subresources:
      status: {}
      scale:
        specReplicasPath: .spec.replicas
        statusReplicasPath: .status.replicas
        labelSelectorPath: .status.labelSelector
    # Printer columns
    additionalPrinterColumns:
    - name: Version
      type: string
      jsonPath: .spec.version
    - name: Environment
      type: string
      jsonPath: .spec.environment
    - name: Status
      type: string
      jsonPath: .status.phase
    - name: Age
      type: date
      jsonPath: .metadata.creationTimestamp
  # Multiple versions
  - name: v2
    served: true
    storage: false
    schema:
      openAPIV3Schema:
        type: object
        properties:
          spec:
            type: object
            properties:
              appName:
                type: string
              version:
                type: string
              environment:
                type: string
              config:
                type: object
              components:
                type: array
                items:
                  type: object
  # Version conversion (a spec-level field in apiextensions.k8s.io/v1)
  conversion:
    strategy: Webhook
    webhook:
      conversionReviewVersions: ["v1", "v1beta1"]
      clientConfig:
        service:
          namespace: system
          name: webhook-service
          path: /convert
  scope: Namespaced
  names:
    plural: applications
    singular: application
    kind: Application
    shortNames:
    - app
    - apps
  # Prune fields not declared in the schema (the default in apiextensions.k8s.io/v1)
  preserveUnknownFields: false
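
The scale subresource in this CRD points at .spec.replicas, .status.replicas, and .status.labelSelector, but the v1 schema does not declare those fields, so they would be pruned. A minimal sketch of the extra schema properties follows; the minimum of 0 is an assumption.

```yaml
# Fragment of spec.versions[0].schema.openAPIV3Schema.properties, showing only
# the fields that the scale subresource paths point at.
spec:
  type: object
  properties:
    replicas:              # matches specReplicasPath: .spec.replicas
      type: integer
      minimum: 0
status:
  type: object
  properties:
    replicas:              # matches statusReplicasPath: .status.replicas
      type: integer
    labelSelector:         # matches labelSelectorPath: .status.labelSelector
      type: string
```

These properties are merged into the existing spec and status blocks rather than replacing them. With them in place, kubectl scale and autoscalers can read and write the replica count through /scale.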

CRD validation rules

# Example of complex validation rules
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  name: validations.example.com
spec:
  group: example.com
  versions:
  - name: v1
    served: true
    storage: true
    schema:
      openAPIV3Schema:
        type: object
        properties:
          spec:
            type: object
            properties:
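              # Default value, declared on the field itself
              replicas:
                type: integer
                default: 1
              # Conditional validation
              enableFeature:
                type: boolean
              featureConfig:
                type: object
                properties:
                  param1:
                    type: string
                  param2:
                    type: integer
                # Makes param1 mandatory whenever featureConfig is present. Requiring
                # featureConfig only when enableFeature is true cannot be expressed
                # with required alone.
                required:
                - param1
              # Fields constrained by oneOf are declared here with their types;
              # a structural schema does not allow type inside oneOf branches
              type:
                type: string
              webConfig:
                type: object
                properties:
                  port:
                    type: integer
                    minimum: 1
                    maximum: 65535
                required:
                - port
              workerConfig:
                type: object
                properties:
                  threads:
                    type: integer
                    minimum: 1
            # Complex validation with allOf and oneOf
            allOf:
            - oneOf:
              - properties:
                  type:
                    enum: [web]
                  workerConfig:
                    not: {}
              - properties:
                  type:
                    enum: [worker]
                  webConfig:
                    not: {}
              required:
              - type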
  scope: Namespaced
  names:
    plural: validations
    singular: validation
    kind: Validation
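
A rule such as "featureConfig is required only when enableFeature is true" relates two fields, which plain required lists cannot express. One option is a CEL validation rule, sketched below. It assumes a cluster where x-kubernetes-validations is available (beta since Kubernetes 1.25, stable since 1.29), and the message text is only an example.

```yaml
# Fragment of the spec schema. The rule is a sketch and assumes CEL validation
# rules (x-kubernetes-validations) are available in the cluster.
spec:
  type: object
  x-kubernetes-validations:
  - rule: "!(has(self.enableFeature) && self.enableFeature) || has(self.featureConfig)"
    message: featureConfig is required when enableFeature is true
  properties:
    enableFeature:
      type: boolean
    featureConfig:
      type: object
      properties:
        param1:
          type: string
        param2:
          type: integer
      required:
      - param1
```

The has() guard matters because enableFeature is optional: reading an absent field in CEL is an error, not false. On clusters without CEL rules, a validating admission webhook can enforce the same condition.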

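Troubleshooting

Common problems and solutions

  1. CRD creation fails

    # Check the CRD manifest (a server-side dry run also validates the schema)
    kubectl apply -f crd.yaml --dry-run=server -o yaml

    # Show CRD details, including its status conditions
    kubectl describe crd <crd-name>

    # Check the API Server logs (works where kube-apiserver runs as a pod, e.g. kubeadm clusters)
    kubectl logs -n kube-system -l component=kube-apiserver

    # Inspect the stored OpenAPI schema
    kubectl get crd <crd-name> -o yaml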
    
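  2. Custom resource validation fails

    # Validate the resource against the CRD schema without persisting it
    kubectl apply -f resource.yaml --dry-run=server -o yaml

    # Show the validation error details
    kubectl apply -f resource.yaml 2>&1 | grep -E -A 10 "ValidationError|is invalid"

    # Print the schema rules
    kubectl get crd <crd-name> -o jsonpath='{.spec.versions[*].schema.openAPIV3Schema}'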
    
  3. The controller cannot watch custom resources

    # Check the controller's RBAC permissions
    kubectl auth can-i watch <crd-plural>.<crd-group> --as=system:serviceaccount:<namespace>:<sa-name>

    # View the controller logs
    kubectl logs <controller-pod> -n <namespace>

    # Check the ServiceAccount bindings (-o wide lists the bound subjects)
    kubectl get rolebinding,clusterrolebinding -n <namespace> -o wide | grep <sa-name>
    
  4. Version conversion problems

    # Check the conversion webhook configured on the CRD
    kubectl get crd <crd-name> -o jsonpath='{.spec.conversion}'

    # Check the webhook Service
    kubectl get svc <webhook-service> -n <namespace>

    # Check the webhook certificate
    kubectl get secret <webhook-cert-secret> -n <namespace> -o yaml
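
When debugging conversion it helps to know what the webhook is expected to exchange with the API server. The sketch below shows a ConversionReview request and a matching response for the Application CRD. The wire format is JSON over HTTPS and is shown as YAML here for readability. The uid and the demo-app object are placeholders.

```yaml
# Request sent by the API server to the webhook (fields abbreviated)
apiVersion: apiextensions.k8s.io/v1
kind: ConversionReview
request:
  uid: 00000000-0000-0000-0000-000000000001   # placeholder
  desiredAPIVersion: app.example.com/v2
  objects:
  - apiVersion: app.example.com/v1
    kind: Application
    metadata:
      name: demo-app                           # hypothetical object
      namespace: default
    spec:
      appName: demo-app
      version: v1.0.0
      environment: testing
---
# Response expected from the webhook
apiVersion: apiextensions.k8s.io/v1
kind: ConversionReview
response:
  uid: 00000000-0000-0000-0000-000000000001   # must echo request.uid
  result:
    status: Success
  convertedObjects:
  - apiVersion: app.example.com/v2
    kind: Application
    metadata:
      name: demo-app
      namespace: default
    spec:
      appName: demo-app
      version: v1.0.0
      environment: testing
```

The response must return the objects in the same order as the request and echo the request's uid. If the webhook cannot convert an object, it reports result.status: Failed with a message instead.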
    

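Best practices

  1. CRD design

    • Follow the Kubernetes API conventions
    • Use meaningful names and structure
    • Provide a complete OpenAPI v3 validation schema
    • Enable the status subresource to report state
    • Define sensible printer columns to improve the user experience
  2. Version management

    • Name API versions by Kubernetes convention (v1alpha1, v1beta1, v1) rather than with semantic version strings
    • Maintain backward compatibility
    • Provide a clear migration path between versions
    • Implement a conversion webhook when needed
    • Document version changes
  3. Security

    • Limit controller permissions to the minimum required
    • Use a dedicated ServiceAccount
    • Validate all user input
    • Encrypt sensitive configuration data
    • Review and update RBAC permissions regularly
  4. Monitoring and logging

    • Implement controller health checks
    • Log key operations and errors
    • Emit detailed events
    • Integrate with monitoring and alerting systems
    • Handle errors gracefully
  5. Testing

    • Write unit tests for the controller logic
    • Run end-to-end integration tests
    • Test version conversion
    • Verify the RBAC configuration
    • Run stress and performance tests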

Security considerations

Controller RBAC configuration

apiVersion: v1
kind: ServiceAccount
metadata:
  name: app-operator
  namespace: app-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: app-operator-role
rules:
# Restrict access to the custom resources
- apiGroups:
  - app.example.com
  resources:
  - applications
  - applications/status
  verbs:
  - get
  - list
  - watch
  - create
  - update
  - patch
  - delete
# Restrict access to core resources
- apiGroups:
  - ""
  resources:
  - pods
  - services
  - configmaps
  - secrets
  verbs:
  - get
  - list
  - watch
  - create
  - update
  - patch
  - delete
# Restrict access to workload resources
- apiGroups:
  - apps
  resources:
  - deployments
  - statefulsets
  verbs:
  - get
  - list
  - watch
  - create
  - update
  - patch
  - delete
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: app-operator-rolebinding
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: app-operator-role
subjects:
- kind: ServiceAccount
  name: app-operator
  namespace: app-system
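
A ClusterRoleBinding grants these rules in every namespace, including read and write access to all Secrets. If the operator only manages a few namespaces, the same ClusterRole can be bound per namespace with a RoleBinding instead. The sketch below assumes a namespace called team-a, which is hypothetical.

```yaml
# Sketch: reuse the ClusterRole, but grant it only inside one namespace.
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: app-operator-rolebinding
  namespace: team-a          # hypothetical namespace managed by the operator
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: app-operator-role
subjects:
- kind: ServiceAccount
  name: app-operator
  namespace: app-system
```

One such RoleBinding is needed per managed namespace. This only narrows anything if it replaces the ClusterRoleBinding; keeping both leaves the cluster-wide grant in force.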

Securing webhook configuration

apiVersion: admissionregistration.k8s.io/v1
kind: MutatingWebhookConfiguration
metadata:
  name: app-mutating-webhook
webhooks:
- name: mutate.app.example.com
  clientConfig:
    service:
      name: app-webhook-service
      namespace: app-system
      path: "/mutate"
    caBundle: <base64-encoded-ca-bundle>
  rules:
  - operations: ["CREATE", "UPDATE"]
    apiGroups: ["app.example.com"]
    apiVersions: ["v1"]
    resources: ["applications"]
  admissionReviewVersions: ["v1", "v1beta1"]
  sideEffects: None
  timeoutSeconds: 5
---
apiVersion: admissionregistration.k8s.io/v1
kind: ValidatingWebhookConfiguration
metadata:
  name: app-validating-webhook
webhooks:
- name: validate.app.example.com
  clientConfig:
    service:
      name: app-webhook-service
      namespace: app-system
      path: "/validate"
    caBundle: <base64-encoded-ca-bundle>
  rules:
  - operations: ["CREATE", "UPDATE"]
    apiGroups: ["app.example.com"]
    apiVersions: ["v1"]
    resources: ["applications"]
  admissionReviewVersions: ["v1", "v1beta1"]
  sideEffects: None
  timeoutSeconds: 5
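
A few optional fields make the behaviour of these webhooks explicit. The fragment below is a sketch of one webhooks entry with such fields added; the chosen values are illustrative, and the selector assumes namespaces carry the kubernetes.io/metadata.name label that current Kubernetes releases set automatically.

```yaml
# Fragment of one entry under webhooks:. The values are illustrative choices.
- name: validate.app.example.com
  failurePolicy: Fail            # reject requests when the webhook is unreachable
  matchPolicy: Equivalent        # also match requests made through other API versions
  namespaceSelector:             # skip the namespace that hosts the webhook itself
    matchExpressions:
    - key: kubernetes.io/metadata.name
      operator: NotIn
      values: ["app-system"]
  timeoutSeconds: 5
```

Excluding the webhook's own namespace is a trade-off: it keeps a broken webhook from blocking changes in app-system, but Application objects created there are no longer checked.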

Command reference

Command                                      Description
kubectl get crd                              List all CRDs
kubectl describe crd <name>                  Show CRD details
kubectl get <custom-resource>                List custom resource instances
kubectl describe <custom-resource> <name>    Show details of a custom resource
kubectl apply -f <crd-definition>            Create or update a CRD
kubectl apply -f <custom-resource>           Create or update a custom resource
kubectl delete crd <name>                    Delete a CRD
kubectl delete <custom-resource> <name>      Delete a custom resource instance
kubectl get crd <name> -o yaml               Export the CRD definition
kubectl api-resources --api-group=<group>    List the resources in an API group

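Summary

Custom Resource Definitions (CRDs) are a powerful extension mechanism in the Kubernetes ecosystem. They let users define their own resource types and manage them the same way as native Kubernetes resources. After working through this document, you should be able to:

  • Understand the core concepts and mechanics of CRDs
  • Create and manage Custom Resource Definitions
  • Design custom resources that follow the Kubernetes API conventions
  • Implement custom controllers to manage custom resources
  • Configure complex validation rules and version management
  • Troubleshoot common CRD problems
  • Follow best practices and security considerations for CRD development

CRDs are the foundation of the Operator pattern and an important building block for cloud-native applications. Knowing how to use them makes it much easier to build and manage complex applications on Kubernetes.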