Introduction
Hi all, and welcome to yet another blog post. I am writing this straight from the heart of Crete, what a fantastic island. While driving through the mountains earlier today, I got to thinking: "null" in Terraform, what a mess... or is it? I have struggled so many times with the concept of "null" and how it can be used/abused in different Terraform scenarios.
Below is a list of the topics covered, with a short description of each:
We explore the concept of "null" in its simplest form in Terraform - this helps us build more stable infrastructure as code.
How about "null" in resource definitions (provider calls using something like the "azurerm" provider)? - Utilizing default values.
Let's look at some more advanced examples of encountering "null".
We can use the value "null" to satisfy Terraform's requirement for "if" statements to have "consistent conditional types".
Part 1 - The concept of "null" in the most simple sense
The reason I wrote use/abuse about "null" in the introduction is that it is something natural to programming, as any programmer will attest to. But what about situations where "null" causes Terraform code to break in ways not anticipated? I know all too well how "null" can make troubleshooting tough, and this is especially true in cases where "null" is used within nested data structures, such as objects within objects. To set the stage, let's look at some code to verify some simple Terraform behavior:
Inside variables.tf
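As a minimal sketch of the kind of variable discussed here (the variable name "optional_objects" and its attribute are hypothetical, chosen only for illustration), an optional variable with a default of null could look like this:

```hcl
# variables.tf (sketch; the variable name and attribute are hypothetical)
variable "optional_objects" {
  description = "Optional list of objects. Stays null when the caller passes nothing."
  type = list(object({
    name = string
  }))
  default = null
}
```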
Inside main.tf
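A simple output is enough to verify the behavior. The sketch below assumes a hypothetical variable called "optional_objects" of type "list of object", declared in variables.tf with a default of null:

```hcl
# main.tf (sketch; "optional_objects" is a hypothetical variable with default = null)
output "optional_objects" {
  # Passing the value straight through does not fail, even while it is null
  value = var.optional_objects
}
```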
In other words, as long as we do not depend on the variable for more than a simple output, the implementation described works and we can just define the variable with a value of "null" to simply "ignore" the variable entirely. This can be smart, especially in cases where we create Terraform modules. It makes it possible for us to create "optional" variables. However, here is the point: if we use a default value of "null", we must make sure we handle every place in the code where the "null" value is used as part of an expression; otherwise, it will fail. For example, if we want to use the variable in a for-loop and the variable remains "null", the loop will fail. See below:
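A sketch of the failing case could look like the following. It again assumes the hypothetical variable "optional_objects" with a default of null; the local name is also just for illustration:

```hcl
locals {
  # Fails at plan time while var.optional_objects is null:
  # a for-loop cannot iterate over a null value.
  object_names = [for obj in var.optional_objects : obj.name]
}
```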
Ah yes, a classic error. Terraform will NEVER allow a "null" value as the collection of a for-loop. Furthermore, while we are at it, Terraform will ALSO fail on "null" in MOST function calls AND in "for_each" loops. So... for the above solution, how can we STILL allow the variable to be optional while ALSO not risking the code breaking in case the variable is NOT passed when Terraform is executed?
Simple - we replace the default value of "null" with a more SPECIFIC value of the right type, an empty tuple, i.e. "[]". See below:
Inside of variables.tf
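Sketched with the same hypothetical variable name as before, the only change is the default value:

```hcl
# variables.tf (sketch; the variable name and attribute are hypothetical)
variable "optional_objects" {
  description = "Optional list of objects. Falls back to an empty list instead of null."
  type = list(object({
    name = string
  }))
  default = [] # converted to an empty list of objects, which is safe to loop over
}
```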
And watch what happens if we once again execute Terraform:
If we examine the result a little closer, this makes a lot of sense. We asked Terraform to loop through each object within the "list of objects" we defined in "variables.tf". Because a for-loop written inside square brackets, e.g. [<for loop syntax>], always produces a tuple, and our list is EMPTY, the loop runs EXACTLY 0 times. Hence, what's left behind is the empty tuple where the loop was written in the first place.
To conclude part 1: Always use a more specific type for the default value of any input variable wherever it's possible. It will lead to more stable code.
In part 2, we will look at how "null" behaves inside resource definitions.
Part 2 - "null" and how Terraform handles it in resources
This part is all about understanding the power of "null" in the context of resource definitions in Terraform. Since Terraform is a "strongly typed" language, it requires us to know the type that every expression will result in. Yes, the type "any" exists, but we will ignore it for this discussion. In general, HashiCorp recommends that we define explicit types for everything happening inside our Terraform code.
Before exploring the example code, take a look at the following HashiCorp quote about Terraform and "null":
null: a value that represents absence or omission. If you set an argument of a resource to null, Terraform behaves as though you had completely omitted it
We can use the above to our advantage when making Terraform modules. Let's take an example:
Let's say we want to create an "Azure Storage Account Terraform module" that our colleagues can use to define Azure Storage Accounts. To do this, we naturally take a look at the provider documentation => https://registry.terraform.io/providers/hashicorp/azurerm/latest/docs/resources/storage_account
In this documentation we can see some "Required" arguments to pass to the resource definition, but also a lot of "Optional" arguments. Now, the thing with optional arguments is obvious... they are NOT required. BUT there is more: take the specific argument called "account_kind", where the documentation states "Defaults to StorageV2". What this means is that if we do NOT set the argument "account_kind", the provider automatically assigns a default value. In other words, Terraform interprets the missing argument as a simple "null" value.
Adding to the above: EVEN if we make it possible to pass the argument "account_kind", BUT the value set for the argument is "null", Terraform behaves exactly the same as if the argument was not passed at all, ignoring the attribute with "null" and letting the provider deal with it. Remember, this ONLY works for "optional" arguments, and possible default values can be seen in the description of each optional argument of a resource type in the provider documentation.
To better understand the two paragraphs above and the relevance of all this information, consider the example below.
Inside variables.tf
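A sketch of two such variables is shown here. The name "storage_account_object" comes from the text; the name "storage_account_strict" is hypothetical. Note one assumption: the sketch uses "access_tier" as the second optional attribute, because "account_tier" is a required argument of "azurerm_storage_account" and therefore cannot be left as null.

```hcl
# variables.tf (sketch)

# Variable 1: "strict", only the basics can be configured
# (the name "storage_account_strict" is hypothetical)
variable "storage_account_strict" {
  type = object({
    name                = string
    resource_group_name = string
    location            = string
  })
}

# Variable 2: more "dynamic", also exposes optional provider arguments
variable "storage_account_object" {
  type = object({
    name                = string
    resource_group_name = string
    location            = string
    account_kind        = optional(string) # null when not set
    access_tier         = optional(string) # null when not set (assumption, see above)
  })
}
```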
Above, we describe 2 different variables. One is more dynamic than the other, as it allows more arguments of the resource to be configured. On the other hand, the "strictness" of the 1st variable can also be what's desired in cases where more control is required.
Now, what's interesting is to see how Terraform deals with the 2 variables above in exactly the same way in cases where the variable named "storage_account_object" does not get ANY values passed for its attributes "account_kind" and "account_tier", which hence default to "null". Let's see that in action:
Inside main.tf
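The two resource definitions could be sketched roughly as follows. The variable "storage_account_strict", the resource labels and the hard-coded tier and replication values are assumptions for illustration. As an assumption, "access_tier" stands in as the second optional argument, because "account_tier" is required by the provider. A configured "azurerm" provider block is also assumed.

```hcl
# main.tf (sketch; assumes a configured azurerm provider)

# Example 1: only the required arguments
resource "azurerm_storage_account" "required_only" {
  name                     = var.storage_account_strict.name
  resource_group_name      = var.storage_account_strict.resource_group_name
  location                 = var.storage_account_strict.location
  account_tier             = "Standard"
  account_replication_type = "LRS"
}

# Example 2: the same, plus 2 optional arguments fed from the variable
resource "azurerm_storage_account" "with_optional" {
  name                     = var.storage_account_object.name
  resource_group_name      = var.storage_account_object.resource_group_name
  location                 = var.storage_account_object.location
  account_tier             = "Standard"
  account_replication_type = "LRS"

  # When these attributes are null, Terraform treats the arguments as omitted
  # and the provider falls back to its own defaults.
  account_kind = var.storage_account_object.account_kind
  access_tier  = var.storage_account_object.access_tier
}
```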
In the code above, we have 2 different examples of a resource definition of type "azurerm_storage_account". The first example includes only the required arguments, while the second example also includes 2 optional arguments. If we deploy both resource definitions and no values are provided for the 2 optional arguments in the 2nd example, Terraform simply ignores those 2 arguments. Both resource definitions therefore end up with the same values: even though the 1st resource definition does NOT have the 2 arguments defined at all, the provider STILL applies its default values to BOTH Azure Storage Accounts.
In other words: because the 2 optional attributes of the input variable "storage_account_object" are "null", Terraform omits them WITHOUT failing, hence the 2nd resource definition is more "dynamic".
To conclude part 2: When it comes to simple types for attributes defined for either input variables OR local variables, we can still easily create expressions using them WITHOUT Terraform failing due to a "null" value, as long as the expression does not include functions, loops or complex data types. By utilizing "null" as in the code example above, we are able to make resource definitions more dynamic without making our Terraform code less stable.
In part 3, we will look more specifically at how to deal with "null" values within nested attributes in objects, lists of objects or maps.
Part 3 - "null" in nested objects
This part is all about understanding how Terraform behaves when optional attributes are configured in the type of an input variable or local variable with nested values. As we already touched on in part 2, Terraform aims to "omit" every single value that results in "null". The issue comes in cases where input variables allow a "dynamic" setup where a mix of attributes are optional, especially in cases where attributes lead to inner nested objects.
Say we have the variable as below:
Inside variables.tf
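Based on the names used in the text ("nested_objects_with_optional", "name", "nested_objects", "attribute1" and "attribute2"), such a variable could be sketched like this. The exact attribute types are an assumption:

```hcl
# variables.tf (sketch; attribute types are assumed to be strings)
variable "nested_objects_with_optional" {
  type = object({
    name = string # required
    nested_objects = optional(list(object({
      attribute1 = string           # required inside each nested object
      attribute2 = optional(string) # null when not set
    }))) # the whole list is null when not set
  })
}
```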
In the code above, we define an object consisting of 2 attributes: "name", which is required, and "nested_objects", which is an optional "list of object". Now... what happens if we try to run our Terraform code?
Let's create a simple output for the input variable defined above, where only the "name" attribute is passed:
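A sketch of such an output, together with a hypothetical input that only sets "name" (the value "example" is made up for illustration):

```hcl
# terraform.tfvars (hypothetical input: only the required attribute is set)
# nested_objects_with_optional = {
#   name = "example"
# }

# main.tf (sketch)
output "nested_objects_with_optional" {
  # Passes the whole object through so we can inspect what Terraform built from the input
  value = var.nested_objects_with_optional
}
```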
As expected, because the input variable's type definition allows "optional" attributes, Terraform won't fail when only the "name" of the root object "nested_objects_with_optional" is passed. Notice how the entire "list of object" attribute named "nested_objects" simply defaults to "null". This means that if we ever want to loop through this attribute, we must first make sure the loop ONLY runs when the attribute is NOT "null". Otherwise we end up with very unstable Terraform code that will FAIL every time the optional attribute is simply not set.
Furthermore, in the output we can see that Terraform does not even show the nested attributes called "attribute1" and "attribute2", which we defined in the variable type definition. This point is extremely important because it speaks to how Terraform behaves on nested attributes.
To showcase this issue (and yes, I have fallen victim to it so many times): you begin to add optional attributes and begin to "depend" on them in the actual Terraform code. This works just fine in tests, right until someone simply leaves the attribute out, which can cause all sorts of "null" errors depending on what exactly the attribute is supposed to do.
Let's see that in action in the code snippet below:
Inside main.tf
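The safeguard can be sketched roughly like this. It is not the original snippet, only an illustration of a null check in a local; the local name "inner_attributes" is taken from the text, while the output name is hypothetical:

```hcl
# main.tf (sketch)
locals {
  # null when the optional attribute was left out, otherwise the nested objects
  inner_attributes = (
    var.nested_objects_with_optional.nested_objects == null
    ? null
    : [
      for obj in var.nested_objects_with_optional.nested_objects : {
        attribute1 = obj.attribute1
        attribute2 = obj.attribute2 # may still be null, which is fine
      }
    ]
  )
}

output "inner_attributes" {
  value = local.inner_attributes
}
```

The point of the design is that the for-loop result is only used when "nested_objects" is set, so leaving the attribute out no longer breaks the plan.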
The reason the output is simply 0, 0, 0 is that our local variable "local.inner_attributes" returns "null", because we check whether the nested attribute called "nested_objects" == null, which it is at this point. What the above code does is safeguard us against the simple situation of someone NOT passing the optional attributes as part of an object, which can always happen when using "dynamic" variables.
Now, let's run Terraform again, but this time add more than just the required attribute "name" of the root object. We also add 1 object to the "list of object", BUT we only set a value for the required "attribute1". What is the result of a new output of the same variable?
Notice how the local variable we used to check for "null" now actually returns the wanted "list of object" as part of the "inner" object definition, BUT "attribute2" is STILL "null" because it is optional and we did not define it. This is completely fine, because we made sure that the overall attribute "nested_objects" is NOT "null".
Something else that is very important to note about the above behavior: if we want to use the variable in a resource definition, we must continue to safeguard ourselves against the attribute "nested_objects" failing in cases where it is NOT provided, since it is an optional attribute.
To showcase this, see the code snippet below. The resource definition is completely generic because this behavior has everything to do with the "core functionality" of Terraform and nothing to do with any specific provider.
Inside main.tf
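As a generic stand-in, the sketch below uses the built-in "terraform_data" resource (available from Terraform 1.4). The choice of resource and its label are assumptions; any resource argument would behave the same way:

```hcl
# main.tf (sketch; terraform_data is only a generic placeholder resource)
resource "terraform_data" "unsafe" {
  # Fails when nested_objects is null: a null value cannot be indexed
  input = var.nested_objects_with_optional.nested_objects[0].attribute1
}
```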
If we execute Terraform and the attribute "nested_objects" is "null", the call to index "[0]" fails due to the simple fact that it is impossible to retrieve an index from a "null" value. To fix this, we can use 2 different approaches:
Approach 1 - Check the root attribute "nested_objects" for "null" directly where the argument is passed to the resource definition
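A minimal sketch of approach 1, again using "terraform_data" as a hypothetical generic resource:

```hcl
resource "terraform_data" "approach_1" {
  # Only index into the list when it is not null; otherwise pass null along.
  # Note: an empty list would still fail on [0], so this check covers null only.
  input = (
    var.nested_objects_with_optional.nested_objects != null
    ? var.nested_objects_with_optional.nested_objects[0].attribute2
    : null
  )
}
```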
EVEN if the optional attribute called "attribute2" is "null", this code will NOT fail, simply because Terraform at this stage understands the underlying type definition of the attribute "nested_objects".
Approach 2 - Use the function "can" to verify whether the expression returns an error. If it does, the attribute "nested_objects" must be "null" (or empty), since the "can" function failed to index into the "list of object"
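A minimal sketch of approach 2, with the same hypothetical generic resource:

```hcl
resource "terraform_data" "approach_2" {
  # can() returns false when the inner expression raises an error,
  # for example when nested_objects is null and cannot be indexed.
  input = (
    can(var.nested_objects_with_optional.nested_objects[0])
    ? var.nested_objects_with_optional.nested_objects[0].attribute2
    : null
  )
}
```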
The fundamental difference between the two approaches comes down to where they apply. The function "can" is extremely helpful in more advanced situations where checking for a simple "null" value in an attribute is not enough to satisfy the check we want to do in an expression for an argument. These more "advanced" examples are out of scope for this blog post. Finally, be very careful with the use of "can", as the actual error raised when the inner expression fails will NOT be printed to standard output, which can make code way harder to troubleshoot.
To conclude part 3: Always make sure to safeguard your Terraform code against "null" errors caused by inner object nesting where optional attributes are allowed. This pattern must be applied consistently in every situation where there is a risk of encountering "null". Use "variable transformation" techniques within "local" variables whenever possible. This can help you establish more general safeguards against encountering "null", rather than having to perform expression checks on individual arguments for every resource and data definition.
Part 4 - "null", our ally when it comes to "if" statements
This part of the blog post aims to help us understand both how Terraform's "type safety" / "strongly typed" nature affects how the language deals with "if" statements and how we can use "null" to ALWAYS make sure a given statement is valid.
The most common error in ANY "if" statement comes from the simple fact that Terraform requires BOTH results of such an expression (the true and the false result) to be convertible to the same base type. In other words, "if" statements like the following examples are not valid:
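The examples could be sketched as follows. The local names "example1" to "example3" follow the text; the values and the "condition" local are made up for illustration:

```hcl
locals {
  condition = true

  # string vs number: looks inconsistent, but Terraform can convert the number to a string
  example1 = local.condition ? "a string" : 1

  # tuple vs string: no common type exists, so this raises an error
  example2 = local.condition ? ["a", "b"] : "a string"

  # tuple vs object: no common type exists, so this raises an error
  example3 = local.condition ? ["a", "b"] : { key = "value" }
}
```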
One VERY important thing to note about the above examples is that the local variable "example1" will in fact NOT fail, because Terraform always TRIES to automatically convert the true and false results to the same base type. This is possible because Terraform is able to find the common type "string" for both sides of the "if" statement. This behavior is extremely interesting, as it slightly contradicts Terraform's requirement for all types to be "consistent", yet in some situations 2 different types can convert to the same base type.
The entire list of possible conversions can be seen in the HashiCorp Terraform docs describing type conversion => Types and Values - Configuration Language | Terraform | HashiCorp Developer
Now... in all the situations where Terraform CANNOT automatically convert both sides of an "if" statement to the same base type, the following error occurs, as it will for the local variables "example2" and "example3":
A note on the above Terraform error: Terraform says the type of the true side is "tuple". This is because a value written with square brackets is a tuple until Terraform converts it, so the error is raised before the value ever becomes a "list of string".
So, how do we deal with the "Inconsistent conditional result types" error, you might ask? Well, let's add "null" as the value for the false side of the "if" statement. Will it fix our issue?
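A sketch of the fix, with made-up values and a hypothetical "condition" local that is false, so the false side is the one selected:

```hcl
locals {
  condition = false

  # null can stand in for any type, so the two sides are no longer inconsistent
  example2 = local.condition ? ["a", "b"] : null
}

output "example2" {
  value = local.example2
}
```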
Because the value "null" is special, Terraform allows it to satisfy any conditional "imbalance" between the true and false sides of an "if" statement. (The output will now end up being "null", which causes the output to be omitted.)
Now it's reasonable to think that, well, now we have another issue. The "null" value that can result from the "if" statement can now cause other issues for any dependencies using the local variable, and yes, this is correct - we must be careful with the use of this "trick". On the other hand, a result of "null" can be used to our advantage in situations where the result of such an "if" statement must be "conclusive", as "null" is typically used to indicate that something is to be discarded / omitted / not used later on in our code.
To conclude part 4: The use of "null" versus manually defining the same base type on both sides of an "if" statement comes down to the scenario and the needs of the given Terraform codebase. It is highly recommended to have specific coding standards defining the use of "null" as a possible value in "if" statements, as it can lead to cascading "null" errors for any dependencies. Using "null" becomes more and more effective in more complex conditional expressions, but be careful; it's a balance to maintain - hence robust coding standards are a must for creating a maintainable codebase.
And that is 1 step closer to successfully dealing with Terraform's "null" value
Well, that was all for today. Thank you so much for reading along; I really, really appreciate it. Also, HAPPY SUMMER! I hope you and your family & friends will have a fantastic time!
Until our next adventure - Peace
PS.
Want to learn more about Terraform? Click here -> terraform (codeterraform.com)
Want to learn more about other cool stuff like Automation or Powershell -> powershell (codeterraform.com) / automation (codeterraform.com)