How should I manage the secrets that Chef needs to configure my systems? This is a question that many of us who use Chef struggle with. Should I use encrypted data bags? What is chef-vault, and how does it improve on encrypted data bags? Does it matter whether our infrastructure is cloud, on-prem, or a mix of both?
Encrypted data bags, and tooling like chef-vault built around them, can be great mechanisms for secrets storage and retrieval in certain contexts, but this solution has disadvantages that we should be aware of. Depending on your environment, the following drawbacks may or may not matter to you.
1. Encrypted data bags solve one problem and create another
You store your secrets in data bags and use a key to encrypt them so you can check the JSON into source control. Your secrets are now encrypted and available, but only to those people and machines that have the decryption key. Now you have a key management problem: how do you make sure the right people and machines get these keys securely?
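To make the trade-off concrete, here is a minimal Ruby sketch of the shared-key idea. It is not Chef's actual encrypted data bag format; the helper names `encrypt_item` and `decrypt_item`, the field names, and the sample password are all made up for illustration.

```ruby
require "openssl"
require "json"

# Encrypt one secret value with a shared symmetric key (AES-256-GCM).
# Illustrative only: this is NOT Chef's real encrypted data bag layout.
def encrypt_item(plaintext, key)
  cipher = OpenSSL::Cipher.new("aes-256-gcm")
  cipher.encrypt
  cipher.key = key
  iv = cipher.random_iv
  ciphertext = cipher.update(plaintext) + cipher.final
  {
    "iv"   => [iv].pack("m0"),              # "m0" is strict Base64
    "tag"  => [cipher.auth_tag].pack("m0"),
    "data" => [ciphertext].pack("m0")
  }
end

# Decrypt a value produced by encrypt_item. Raises if the key is wrong.
def decrypt_item(item, key)
  cipher = OpenSSL::Cipher.new("aes-256-gcm")
  cipher.decrypt
  cipher.key = key
  cipher.iv = item["iv"].unpack1("m0")
  cipher.auth_tag = item["tag"].unpack1("m0")
  cipher.update(item["data"].unpack1("m0")) + cipher.final
end

shared_key = OpenSSL::Random.random_bytes(32)   # the key you now have to distribute
item = encrypt_item("s3cr3t-postgres-password", shared_key)

puts JSON.pretty_generate(item)       # safe to commit to source control
puts decrypt_item(item, shared_key)   # readable only by holders of shared_key
```

The JSON is safe to commit, but everything now hinges on `shared_key`: every person and machine that needs the secret needs a copy of it, and that distribution step is the new problem.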
2. chef-vault's best feature is also its worst feature
chef-vault solves the key management problem of encrypted data bags by encrypting the data bag's shared secret separately for each node that requires access, using that node's public key. This is great: we can reuse the public/private key pairs that already exist on the Chef nodes.
The downside here is that we need to know the list of nodes at encryption time. Every time we want to add/remove a node or admin we have to re-encrypt the data bag (vault) with our new list. Getting this to work with autoscaling or self-healing systems requires a lot of work. If a tool is fighting you when you are adapting it to your use case, it's probably not the right tool.
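The catch is easier to see in code. The following is a rough Ruby sketch of the general per-node public-key idea, not chef-vault's actual implementation. The node names, the `encrypt_for_nodes` helper, and the freshly generated key pairs (standing in for the keys Chef clients already have) are assumptions for illustration.

```ruby
require "openssl"

# Hypothetical nodes; each has its own RSA key pair, as Chef clients do.
node_keys = {
  "web-1" => OpenSSL::PKey::RSA.new(2048),
  "web-2" => OpenSSL::PKey::RSA.new(2048)
}

# Encrypt one copy of the shared secret per node, using that node's public key.
def encrypt_for_nodes(shared_secret, node_keys)
  node_keys.transform_values do |key|
    key.public_key.public_encrypt(shared_secret, OpenSSL::PKey::RSA::PKCS1_OAEP_PADDING)
  end
end

shared_secret = OpenSSL::Random.random_bytes(32)
vault = encrypt_for_nodes(shared_secret, node_keys)

# An existing node recovers the shared secret with its own private key.
recovered = node_keys["web-1"].private_decrypt(vault["web-1"], OpenSSL::PKey::RSA::PKCS1_OAEP_PADDING)
puts recovered == shared_secret

# Autoscaling launches a new node. It has no entry in the vault...
node_keys["web-3"] = OpenSSL::PKey::RSA.new(2048)
had_access_before = vault.key?("web-3")
puts had_access_before

# ...until someone (or something) re-encrypts with the updated node list.
vault = encrypt_for_nodes(shared_secret, node_keys)
has_access_after = vault.key?("web-3")
puts has_access_after
```

The re-encryption step at the end is the part that has to be automated, secured, and triggered on every scale-up, scale-down, or replacement.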
3. Least privilege gets complicated, fast
When using encrypted data bags, you are limited to one decryption key per node. That means that if you have anything but a dead-simple infrastructure, you are going to need a lot of encrypted data bags. Enforcing least privilege will require you to set up and maintain a different data bag for every permutation of secrets a node will need.
For example, Service A needs the PostgreSQL password and AWS keys. Service B needs the AWS keys as well but doesn't deal with PostgreSQL - oh, and it also needs your HipChat API token. You can see how this can get out of hand really quickly. Our infrastructure is also not static: when Service A starts needing the RabbitMQ creds, we need to update the data bag. As services change, appear, or are retired, we need to spend a good amount of time managing our data bags to keep up.
No one wants to do this maintenance, so practically what this means is that we end up stuffing all of our secrets into one data bag and feeling bad about ourselves.
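Here is a small Ruby sketch of that bookkeeping, using the services from the example. The service and secret names and the two helper methods are hypothetical; the point is only to show what strict least privilege costs and what the single-bag shortcut leaks.

```ruby
# Hypothetical services and the secrets each one needs, taken from the example above.
needs = {
  "service_a" => %w[postgresql_password aws_keys],
  "service_b" => %w[aws_keys hipchat_api_token]
}

# Strict least privilege: one data bag per distinct set of secrets.
def bags_required(needs)
  needs.values.map(&:sort).uniq
end

# The shortcut: one shared bag for everything.
# Returns the secrets each service can read but does not need.
def overexposure(needs)
  everything = needs.values.flatten.uniq
  needs.transform_values { |wanted| (everything - wanted).sort }
end

puts bags_required(needs).length
p overexposure(needs)

# Service A starts using RabbitMQ: its dedicated bag has to be rebuilt,
# and with a single shared bag Service B can now read those creds too.
needs["service_a"] += ["rabbitmq_creds"]
p overexposure(needs)
```

With only two services the shared bag already hands each one a secret it has no business reading, and every new requirement widens that gap.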
4. Auditing is important
Keeping an easily accessible and searchable audit log of access to secrets in your system is a necessity, whether or not your organization is legally required to be compliant. You can demonstrate your PCI, SOX, or other compliance more easily with this data. If a breach occurs, having a log of access will allow you to find out what happened and react more quickly. Access logs for data bags will not be enough to make sense of your system.
5. We write cookbooks
We are infrastructure engineers - we write cookbooks and put them through their paces with test-kitchen. Mocking out data bags with dummy attributes in our .kitchen.yml, or gitignoring a .kitchen.local.yml or data bags that contain real secrets, is not ideal. Usually this just slows us down. Our CI environments should have access to secrets for integration and acceptance tests too, right? Treating dev and test environments differently than prod when it comes to secrets undermines the parity that we have been working so hard to achieve.
I've laid out the circumstances in which I think encrypted data bags don't work. So what are we supposed to do? You could roll your own solution - a lot of people do this. Most aren't very happy with maintaining it. Or you could come talk to us at Conjur - we think we've got a pretty compelling solution to the problem.