Monday, July 22, 2019

Clean up AWS resources with aws-nuke

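Where there is birth there is death, and where there is creation there is destruction. So today's topic is how to blow up your AWS account. XD
As an AWS administrator, keeping an AWS nuke on hand is perfectly reasonable, right!?
aws-nuke is a tool written in Go that calls the AWS APIs through the AWS SDK to scan for AWS resources and trigger their removal. Its coverage is more complete than AWSweeper, the tool I had found earlier, but AWS moves so fast that some services do not yet have API/SDK support, so full coverage cannot be guaranteed.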
aws-nuke is stable, but it is likely that not all AWS resources are covered by it.

Install

There are three ways:
  • Download and unpack the latest release binaries
  • Build from source (requires Golang, Glide, golint, and GNU Make)
  • Docker

Command-line flags

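  • -c, --config: required; path to the config file
  • --profile: AWS profile name, which supplies the credentials used to call the AWS API
  • --access-key-id, --secret-access-key: AWS access key ID and secret access key, which supply the credentials used to call the AWS API; use either these or --profile, not both
  • --no-dry-run: required to actually delete anything; without it, aws-nuke only lists the resources
Those are the main ones. For the rest, see the help output: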
$ aws-nuke -h
A tool which removes every resource from an AWS account.  Use it with caution, since it cannot distinguish between production and non-production.

Usage:
  aws-nuke [flags]
  aws-nuke [command]

Available Commands:
  resource-types lists all available resource types
  version        shows version of this application

Flags:
      --access-key-id string       AWS access key ID for accessing the AWS API. Must be used together with --secret-access-key. Cannot be used together with --profile.
  -c, --config string              (required) Path to the nuke config file.
  -e, --exclude stringSlice        Prevent nuking of certain resource types (eg IAMServerCertificate). This flag can be used multiple times.
      --force                      Don't ask for confirmation before deleting resources. Instead it waits 15s before continuing. Set --force-sleep to change the wait time.
      --force-sleep int            If specified and --force is set, wait this many seconds before deleting resources. Defaults to 15. (default 15)
      --max-wait-retries int       If specified, the program will exit if resources are stuck in waiting for this many iterations. 0 (default) disables early exit.
      --no-dry-run                 If specified, it actually deletes found resources. Otherwise it just lists all candidates.
      --profile string             Name of the AWS profile name for accessing the AWS API. Cannot be used together with --access-key-id and --secret-access-key.
      --secret-access-key string   AWS secret access key for accessing the AWS API. Must be used together with --access-key-id. Cannot be used together with --profile.
      --session-token string       AWS session token for accessing the AWS API. Must be used together with --access-key-id and --secret-access-key. Cannot be used together with --profile.
  -t, --target stringSlice         Limit nuking to certain resource types (eg IAMServerCertificate). This flag can be used multiple times.
  -v, --verbose                    Enables debug output.
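
Because the default is a dry run and deletion needs an explicit --no-dry-run, it can help to build the command line in a small wrapper where deletion is opt-in. The following is only a sketch: build_nuke_command and its parameters are hypothetical names of mine, and the only flags it emits are ones listed in the help output above.

```python
import shlex


def build_nuke_command(config_path, profile, delete=False, targets=(), excludes=()):
    """Return the aws-nuke argument list; a dry run unless delete=True."""
    cmd = ["aws-nuke", "--config", config_path, "--profile", profile]
    for resource_type in targets:
        # --target may be repeated to limit nuking to certain resource types
        cmd += ["--target", resource_type]
    for resource_type in excludes:
        # --exclude may be repeated to protect certain resource types
        cmd += ["--exclude", resource_type]
    if delete:
        # Without this flag aws-nuke only lists candidates
        cmd.append("--no-dry-run")
    return cmd


dry_run = build_nuke_command(
    "config/nuke-config.yml", "aws-nuke-example", targets=["EC2Instance"]
)
print(shlex.join(dry_run))
```

The returned list can be passed to subprocess.run as is. Keeping delete=False as the default means a forgotten argument results in a listing, not a deletion.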

Usage

First, define a config file (the example command further down uses config/nuke-config.yml):
regions:
- "global"
- "eu-west-1"

account-blacklist:
- "999999999999" # production

accounts:
  "000000000000": # aws-nuke-example
    filters:
      IAMUser:
      - "my-user"
      IAMUserPolicyAttachment:
      - "my-user -> AdministratorAccess"
      IAMUserAccessKey:
      - "my-user -> ABCDEFGHIJKLMNOPQRST"
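  • regions: the regions to scan. The special value global covers services such as IAM that are global and not tied to a region.
  • account-blacklist: accounts in this list are protected from being nuked; it must contain at least one entry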
    The config file contains a blacklist field. If the Account ID of the account you want to nuke is part of this blacklist, aws-nuke will abort. It is recommended, that you add every production account to this blacklist.
    To ensure you don’t just ignore the blacklisting feature, the blacklist must contain at least one Account ID.
  • accounts: the Account IDs to nuke
    • filters: excludes specific resources from deletion, such as your own IAM user or the default VPC.
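
The two blacklist rules quoted above can be expressed as a small pre-flight check. This is a minimal sketch, not aws-nuke's own code: check_target is a hypothetical helper, and the config is written as a plain Python dict mirroring the YAML above so the example needs no YAML library.

```python
def check_target(config, account_id):
    """Raise ValueError if the blacklist rules forbid nuking account_id."""
    blacklist = config.get("account-blacklist") or []
    # Rule 1: the blacklist must contain at least one Account ID
    if not blacklist:
        raise ValueError("account-blacklist must contain at least one Account ID")
    # Rule 2: a blacklisted account must never be nuked
    if account_id in blacklist:
        raise ValueError(f"account {account_id} is blacklisted")


# Same values as the example config above
config = {
    "regions": ["global", "eu-west-1"],
    "account-blacklist": ["999999999999"],  # production
    "accounts": {
        "000000000000": {"filters": {"IAMUser": ["my-user"]}},
    },
}

check_target(config, "000000000000")  # passes silently
```

Calling check_target(config, "999999999999") raises ValueError, which is the behaviour you want for the production account.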
$ aws-nuke -c config/nuke-config.yml --profile aws-nuke-example --no-dry-run
aws-nuke version v1.0.39.gc2f318f - Fri Jul 28 16:26:41 CEST 2017 - c2f318f37b7d2dec0e646da3d4d05ab5296d5bce

Do you really want to nuke the account with the ID 000000000000 and the alias 'aws-nuke-example'?
Do you want to continue? Enter account alias to continue.
> aws-nuke-example

eu-west-1 - EC2DHCPOption - 'dopt-bf2ec3d8' - would remove
eu-west-1 - EC2Instance - 'i-01b489457a60298dd' - would remove
eu-west-1 - EC2KeyPair - 'test' - would remove
eu-west-1 - EC2NetworkACL - 'acl-6482a303' - cannot delete default VPC
eu-west-1 - EC2RouteTable - 'rtb-ffe91e99' - would remove
eu-west-1 - EC2SecurityGroup - 'sg-220e945a' - cannot delete group 'default'
eu-west-1 - EC2SecurityGroup - 'sg-f20f958a' - would remove
eu-west-1 - EC2Subnet - 'subnet-154d844e' - would remove
eu-west-1 - EC2Volume - 'vol-0ddfb15461a00c3e2' - would remove
eu-west-1 - EC2VPC - 'vpc-c6159fa1' - would remove
eu-west-1 - IAMUserAccessKey - 'my-user -> ABCDEFGHIJKLMNOPQRST' - filtered by config
eu-west-1 - IAMUserPolicyAttachment - 'my-user -> AdministratorAccess' - [UserName: "my-user", PolicyArn: "arn:aws:iam::aws:policy/AdministratorAccess", PolicyName: "AdministratorAccess"] - would remove
eu-west-1 - IAMUser - 'my-user' - filtered by config
Scan complete: 13 total, 8 nukeable, 5 filtered.

Do you really want to nuke these resources on the account with the ID 000000000000 and the alias 'aws-nuke-example'?
Do you want to continue? Enter account alias to continue.
> aws-nuke-example

eu-west-1 - EC2DHCPOption - 'dopt-bf2ec3d8' - failed
eu-west-1 - EC2Instance - 'i-01b489457a60298dd' - triggered remove
eu-west-1 - EC2KeyPair - 'test' - triggered remove
eu-west-1 - EC2RouteTable - 'rtb-ffe91e99' - failed
eu-west-1 - EC2SecurityGroup - 'sg-f20f958a' - failed
eu-west-1 - EC2Subnet - 'subnet-154d844e' - failed
eu-west-1 - EC2Volume - 'vol-0ddfb15461a00c3e2' - failed
eu-west-1 - EC2VPC - 'vpc-c6159fa1' - failed
eu-west-1 - S3Object - 's3://rebuy-terraform-state-138758637120/run-terraform.lock' - triggered remove

Removal requested: 2 waiting, 6 failed, 5 skipped, 0 finished

eu-west-1 - EC2DHCPOption - 'dopt-bf2ec3d8' - failed
eu-west-1 - EC2Instance - 'i-01b489457a60298dd' - waiting
eu-west-1 - EC2KeyPair - 'test' - removed
eu-west-1 - EC2RouteTable - 'rtb-ffe91e99' - failed
eu-west-1 - EC2SecurityGroup - 'sg-f20f958a' - failed
eu-west-1 - EC2Subnet - 'subnet-154d844e' - failed
eu-west-1 - EC2Volume - 'vol-0ddfb15461a00c3e2' - failed
eu-west-1 - EC2VPC - 'vpc-c6159fa1' - failed

Removal requested: 1 waiting, 6 failed, 5 skipped, 1 finished

--- truncating long output ---
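After you run aws-nuke -c config/nuke-config.yml --profile aws-nuke-example --no-dry-run, the first pass lists every resource found by the scan along with its status: removable (would remove), excluded by your filters (filtered by config), or a default resource that cannot be deleted (for example, cannot delete default VPC). Once you enter the account alias again, the program starts triggering removals (triggered remove) and keeps printing results until nothing is left in the waiting state. Anything still marked failed has to be investigated: it may have dependent resources, or it may be protected by a setting such as termination protection on a CloudFormation stack.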
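
To find what is left to investigate, the per-resource lines can be tallied by their final state. The sketch below assumes the line layout seen in the sample output above (region - type - 'id' - state); that layout is inferred from this one run and is not a documented format, and summarize is a hypothetical helper.

```python
import re
from collections import Counter

# region - ResourceType - 'identifier' - [optional properties] - state
LINE = re.compile(r"^(?P<region>\S+) - (?P<type>\S+) - '(?P<id>.*?)' - (?P<rest>.+)$")


def summarize(output):
    """Count resources per state and collect the (type, id) pairs that failed."""
    states = Counter()
    failed = []
    for line in output.splitlines():
        match = LINE.match(line.strip())
        if not match:
            continue  # blank lines, prompts, and summary lines are ignored
        # The state is whatever follows the last " - " separator
        state = match["rest"].rsplit(" - ", 1)[-1]
        states[state] += 1
        if state == "failed":
            failed.append((match["type"], match["id"]))
    return states, failed


sample = """\
eu-west-1 - EC2DHCPOption - 'dopt-bf2ec3d8' - failed
eu-west-1 - EC2Instance - 'i-01b489457a60298dd' - waiting
eu-west-1 - EC2KeyPair - 'test' - removed
eu-west-1 - EC2VPC - 'vpc-c6159fa1' - failed
"""

states, failed = summarize(sample)
for resource_type, resource_id in failed:
    print(resource_type, resource_id)
```

Feeding it only the last pass of a run gives the list of resources that need a manual look.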

Thoughts

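Overall it is handy, convenient, and fast, but it still has quite a few bugs. For example, when removing S3 resources, after the S3 bucket has been deleted it still goes on to trigger removal of the S3 objects one by one, which naturally produces a pile of failures. Also, when resources depend on each other and all of them are in the deletion list, adjusting the order of operations would let them be removed cleanly.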
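
The ordering idea can be illustrated with a topological sort. This is not how aws-nuke works internally; it is only a sketch of the suggestion, and the dependency map is a hand-written, illustrative assumption covering a few of the resource types from the output above.

```python
from graphlib import TopologicalSorter  # Python 3.9+

# Hypothetical map: each key can only be deleted once everything in its set is gone
must_go_first = {
    "EC2VPC": {"EC2Subnet", "EC2SecurityGroup", "EC2RouteTable"},
    "EC2Subnet": {"EC2Instance"},
    "EC2SecurityGroup": {"EC2Instance"},
    "EC2Volume": {"EC2Instance"},
    "EC2DHCPOption": {"EC2VPC"},
    "S3Bucket": {"S3Object"},
}

# static_order() yields every resource type after the ones it depends on
order = list(TopologicalSorter(must_go_first).static_order())
for resource_type in order:
    print(resource_type)
```

Deleting in this order would put instances before subnets and subnets before the VPC, and S3 objects before their bucket.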
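Tools like this should be used with care. They are also a reminder that if a key with broad permissions ever leaks, someone can blow up your services in minutes...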