Amazon Redshift is a hosted data warehouse solution from Amazon Web Services.
$0.24
per GB per month
Teradata Vantage
Score 8.3 out of 10
N/A
Teradata Vantage is presented as a modern analytics cloud platform that unifies everything: data lakes, data warehouses, analytics, and new data sources and types. Supporting hybrid multi-cloud environments and priced for flexibility, Vantage delivers unlimited intelligence to build the future of business.
Users can deploy Vantage on public clouds (such as AWS, Azure, and GCP), hybrid multi-cloud environments, on-premises with Teradata IntelliFlex, or on commodity hardware with VMware.
If the number of connections is expected to be low but the amount of data is large or projected to grow, it is a good solution, especially if there is previous exposure to PostgreSQL. Speaking of Postgres, Redshift is based on a release of PostgreSQL that is several versions old, so developers cannot take advantage of some of the newer SQL language features. Queries still need some fine-tuning: indexes are not provided, so working with sort keys becomes necessary. Lastly, Redshift does not enforce primary keys, so the business must be vigilant for duplicate rows and prepared to explain why duplication occurred.
Teradata Vantage is well suited for large-scale ETL pipelines like the ones we developed for anti-money-laundering risk matrices. It handles heavy joins, aggregations, and transformations on transactional data efficiently. We generate alert variables, adjust for inflation, and monitor establishments monthly with it, all integrated with Python and Control-M for centralised automation across the company. As for where it is less appropriate, I would say that its heavy resource demands might slow down experimentation in iterative work.
Redshift is fully managed. Small teams do not have the resources to maintain a cluster. CloudWatch metrics are provided out-of-the-box, and it is easy to configure alarms.
Redshift's console allows you to easily inspect and manage queries, and manage the performance of the cluster.
Redshift is ubiquitous; many products (e.g., ETL services) integrate with it out-of-the-box.
Writing CSV files to S3 and querying them through Redshift Spectrum is convenient.
It could benefit from adding data integrity and programming tools common to other database management systems.
Amazon Redshift is based on PostgreSQL 8.0.2, which was released in 2005. PostgreSQL has improved greatly since then, but the new features have largely not been implemented in Redshift, and many basic features are missing from it.
Primary keys can be declared but are not enforced. Referential integrity (foreign keys) can be declared but is not enforced. UNIQUE constraints can likewise be declared but are not enforced, and CHECK constraints are not supported.
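Because declared keys are not enforced, a periodic duplicate check is one way to stay on top of this. Below is a minimal sketch in Python, assuming a DB-API style connection. The function name `find_duplicate_keys` and the table and column names in the usage note are hypothetical, and the sketch uses the standard-library `sqlite3` module only as a stand-in so it can run anywhere. A real Redshift connection would come from whichever driver you already use.

```python
import sqlite3  # stand-in driver for illustration; not a Redshift driver


def find_duplicate_keys(conn, table, key_columns):
    """Return (key values..., count) for every key that appears more than once.

    `table` and `key_columns` are interpolated into the SQL text, so they
    must come from trusted configuration, never from user input.
    """
    cols = ", ".join(key_columns)
    sql = (
        f"SELECT {cols}, COUNT(*) AS n "
        f"FROM {table} "
        f"GROUP BY {cols} "
        f"HAVING COUNT(*) > 1 "
        f"ORDER BY {cols}"
    )
    cur = conn.cursor()
    cur.execute(sql)
    return cur.fetchall()
```

For example, calling `find_duplicate_keys(conn, "orders", ["order_id"])` after each load would return an empty list when the intended primary key is still unique, and the offending key values with their counts otherwise.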
IDENTITY can be declared on a column, and Redshift will put unique values into it. However, IDENTITY values in newly inserted rows are not guaranteed to be consecutive or sequential. To implement a sequential number, you need to write your own custom code.
There were no stored procedures in Redshift when we built this (support was added in 2019), so we write SQL script files and then parse and run them one statement at a time from a Python program. This also enabled us to implement execution-time error logging.
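The script-runner approach described here could look roughly like the sketch below. It is not the reviewer's actual program: the names `split_statements` and `run_script` are hypothetical, and the splitting is deliberately naive (it breaks on every semicolon, so it would mis-handle semicolons inside string literals or comments). It is written against the generic Python DB-API and uses `sqlite3` only as a runnable stand-in for a Redshift connection.

```python
import logging
import sqlite3  # stand-in driver for illustration; not a Redshift driver

logger = logging.getLogger("sql_runner")


def split_statements(script):
    """Naively split a SQL script on semicolons and drop empty pieces."""
    return [part.strip() for part in script.split(";") if part.strip()]


def run_script(conn, script):
    """Run a script one statement at a time, logging any statement that fails.

    Returns a list of (statement_number, statement_text, error_message)
    tuples for the failed statements; an empty list means every statement ran.
    """
    failures = []
    cur = conn.cursor()
    for number, statement in enumerate(split_statements(script), start=1):
        try:
            cur.execute(statement)
            conn.commit()
            logger.info("statement %d ok (rowcount=%s)", number, cur.rowcount)
        except Exception as exc:  # drivers raise their own DB-API Error subclasses
            conn.rollback()
            logger.error("statement %d failed: %s | %s", number, exc, statement)
            failures.append((number, statement, str(exc)))
    return failures
```

This version keeps going after a failure so the log covers the whole script. Stopping at the first error instead would only require a `break` after recording the failure.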
In SQL scripts, to check for the row count of affected rows, a complicated join query against some system tables or views has to be executed.
Procedural control-of-flow statements do not exist in plain SQL scripts: no IF, WHILE, DO, RAISERROR, etc.
On the performance of views: views do not “pass through” a query parameter, which is a potential performance problem.
When selecting against a view with the WHERE clause outside of the view, the inner query of the view will be executed first without consideration for the WHERE clause, and only then the WHERE clause will be applied.
Certain clauses of SQL work many times faster than other clauses. So be careful and test your statements for performance earlier rather than later, especially if working with a large data set.
There was a situation when DELETE FROM JOIN was unacceptably slow. Replacing JOIN with the USING clause made DELETE instantaneous.
Teradata can improve by supporting more native AWS cloud features. Currently, if a node goes down, the EC2 instance must be restarted. It isn't something that happens frequently, but tighter integration with cloud providers like AWS and Azure would allow Teradata to offer truly dynamic scaling.
Some Teradata features are oversold before they are ready for prime time. Teradata is not unique in this, but if something is sold as an integrated product stack, it should really be integrated, not something that requires an extensive development cycle to integrate at the customer's expense. If something is supported, it should really have been tested and QA'd thoroughly before a customer touches it.
Teradata is a mature RDBMS that is expanding its functionality toward current cloud capabilities such as object storage and flexible compute scaling.
Overall, it serves all aspects of our data management, such as data cleaning, data manipulation, and data reporting on the cloud platform. We can create stored procedures and triggers in it very easily because all the options are suggested automatically. We can also easily attach the results of ARS to other tools for drawing statistical results.
Teradata Vantage allows us to create a scalable infrastructure to support our strategic initiatives. The dedicated compute power ensures reliable performance with isolated workloads and dedicated resources, optimizing workflows for faster, more efficient data transfers. The compute clusters support ETL processes and OSF’s developers and data science team with the flexibility to create self-service analytics, to spin up/down at any time, driving better performance and minimizing costs.
The support was great and helped us in a timely fashion. We used a lot of online forums as well, but the official documentation was still a work in progress, and it took us more time to look through it. We would probably have chosen a competitor product had it not been for the great support.
We had meetings at the beginning with the technical team to explain our requirements, and they put in a lot of effort to come up with a solution that would address all our needs. They implemented the software and also trained a few of our staff on it. We can still get in touch with them whenever we run into a roadblock, though that happens much less often now.
We evaluated [Amazon] Redshift vs. BigQuery vs. Amazon EMR back in 2014. Back then, BigQuery's cost was slightly higher than under [Amazon] Redshift's price structure. Amazon EMR needs a lot more management (admin tasks), and EMR is designed to be ephemeral, not to be a data store. [Amazon] Redshift was ideal in price structure, performance, and ROI[.]
Teradata is well ahead of its competitors because of its features for ensuring data privacy and for keeping data from being corrupted even in a worst-case scenario. In most cases, data corruption is a major issue when data is left unused, and it leads to the loss of important data that ideally should be retained for 3 years.
Teradata has been absolutely phenomenal for our project: we feed huge chunks of data to it and get back the desired results in no time, where the same work used to take hours to process and even then sometimes timed out.
We don't have to intervene manually in resource or task allocation. It is all taken care of by Teradata internally: all the AMPs are given an equal amount of work, and each has its own resources to complete it, with no sharing with another.