Great for automating repetitive and similar tasks
Use Cases and Deployment Scope
There are two main problems the product solves for us. The first is maintaining our old EC2 image and the Matillion ETL instance it hosts, including backups, scheduled spin-up times, snapshots etc. The second is ingesting raw data from our S3 data lake into Snowflake, and also flattening it into tabular data.
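To illustrate the EC2 maintenance side, a minimal sketch of automating EBS snapshots with Terraform's AWS provider (the role ARN, retention count, and tags below are hypothetical assumptions, not our actual config):

```hcl
# Hedged sketch: a Data Lifecycle Manager policy that takes daily
# snapshots of volumes tagged Snapshot = "true" and keeps 14 of them.
resource "aws_dlm_lifecycle_policy" "daily_snapshots" {
  description        = "Daily EBS snapshots for the ETL instance"
  execution_role_arn = aws_iam_role.dlm.arn # hypothetical IAM role
  state              = "ENABLED"

  policy_details {
    resource_types = ["VOLUME"]

    schedule {
      name = "daily"

      create_rule {
        interval      = 24
        interval_unit = "HOURS"
        times         = ["23:45"]
      }

      retain_rule {
        count = 14 # keep two weeks of snapshots
      }

      copy_tags = true
    }

    target_tags = {
      Snapshot = "true"
    }
  }
}
```
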
Pros
- Standardised templating - Great for repetitive tasks or processes, such as flattening JSON files into tables. We can make a Terraform template which can be duplicated and reused across many different file schemas.
- Dynamic/static variables - Information about the stage/instance Terraform is running in can be used to generate assets dynamically, e.g. creating prod, dev, or non-live assets based on the Git branch the instance is tied to.
- Vast array of connectors - Terraform connects with lots of different technologies, and the providers are often maintained by the technology vendors themselves, so you can trust they are designed with that technology in mind.
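The templating and variable points above can be sketched together: one reusable module instantiated per file schema, with the workspace name driving dev/prod naming. The `json_flatten` module path and the schema map are hypothetical placeholders, not our actual code:

```hcl
# Hedged sketch: one template, many schemas, environment-aware naming.
locals {
  env = terraform.workspace # e.g. "dev" or "prod"

  # Each entry describes one file schema to flatten (hypothetical values).
  schemas = {
    orders    = { source_prefix = "raw/orders/" }
    customers = { source_prefix = "raw/customers/" }
  }
}

module "json_flatten" {
  source   = "./modules/json-flatten" # hypothetical local module
  for_each = local.schemas

  schema_name   = each.key
  source_prefix = each.value.source_prefix
  target_table  = "${upper(local.env)}_${upper(each.key)}"
}
```

Adding a new feed then only means adding one entry to `local.schemas` rather than duplicating resources by hand.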
Cons
- Steep learning curve - As with most IaC tools, it can take time to learn Terraform's architecture, its nomenclature, its implementation process etc.
- Hanging state lock files - Potentially an implementation issue rather than a Terraform issue, but if a state lock file isn't properly released then that instance of Terraform can't be tested or run again until the lock is cleared. E.g. stopping a Terraform PLAN pipeline halfway through means the state lock will hang/remain locked until it is explicitly unlocked (with terraform force-unlock).
- Unhelpful errors - Potentially an issue with the connector provider, but some errors are generic, or give too much information, meaning the root cause can be tricky to find.
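On the state-lock point, a minimal sketch of an S3 backend with DynamoDB locking (the bucket, key, and table names below are hypothetical):

```hcl
# Hedged sketch: remote state in S3, with a DynamoDB table providing
# the state lock that can hang if a run is interrupted.
terraform {
  backend "s3" {
    bucket         = "my-tf-state"          # hypothetical bucket
    key            = "prod/terraform.tfstate"
    region         = "eu-west-1"
    dynamodb_table = "tf-state-locks"       # enables state locking
  }
}

# If a lock hangs after an interrupted PLAN/APPLY, it can be cleared with:
#   terraform force-unlock <LOCK_ID>
```
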
Return on Investment
- More developer time freed up - Not having to worry about our old EC2 machine means we can free up developer time. It takes care of the spin-up schedules, the EC2 snapshots, and even certain requirements for the Matillion ETL instance that resides on the machine.
- Reduced time to data ingestion - Now that we have a template for data ingestion (JSON > SQL-ready tables), we can stage, copy, and flatten data from S3 to Snowflake very quickly. Depending on the complexity of the ingest request, it might only take a few days to implement.
- Long PLAN times - The larger your estate, the longer it can take for Terraform to PLAN its changes. In some cases developers had to wait a few hours for Terraform to complete its PLAN stage, only to be told there was an error somewhere in the code. This might be more of an implementation issue, but it's something to be aware of.
Usability
Alternatives Considered
dbt
Other Software Used
Postman, dbt, Snowflake, Matillion, AWS IAM Identity Center, Apache Airflow, Amazon QuickSight, Amazon S3 (Simple Storage Service), Looker, Google BigQuery