It is very scalable and it gives us backup for everything. Because of this feature, we (as a developer) can do any R&D if required. It is very stable so we can get any type of help. It has a recovery feature also which we used recently for my project. It is very easy to recover.
If the number of connections is expected to be low, but the amounts of data are large or projected to grow it is a good solutions especially if there is previous exposure to PostgreSQL. Speaking of Postgres, Redshift is based on several versions old releases of PostgreSQL so the developers would not be able to take advantage of some of the newer SQL language features. The queries need some fine-tuning still, indexing is not provided, but playing with sorting keys becomes necessary. Lastly, there is no notion of the Primary Key in Redshift so the business must be prepared to explain why duplication occurred (must be vigilant for)
Redshift is fully managed. Small teams do not have the resources to maintain a cluster. CloudWatch metrics are provided out-of-the-box, and it is easy to configure alarms.
Redshift's console allows you to easily inspect and manage queries, and manage the performance of the cluster.
Redshift is ubiquitous; many products (e.g., ETL services) integrate with it out-of-the-box.
Writing .csvs to S3 and querying them through Redshift Spectrum is convenient.
It could benefit from adding data integrity and programming tools common to other database management systems.
Amazon Redshift is based on PostgreSQL 8.0.2. That version of PostgreSQL was released in December 2006. While PostgreSQL was much improved since then, the new features were not implemented in Redshift. Many basic features are missing from it.
Primary keys can be declared but not enforced. Referential integrity (foreign keys) can be declared but not enforced. UNIQUE and CHECK constraints are not supported and cannot be declared.
IDENTITY can be declared on a column, and Redshift will put unique values into it. However: IDENTITY values in the newly inserted rows won’t be incremental or sequential. To implement a sequential number, you need to write your own custom code.
There are no stored procedures in Redshift. We are writing SQL script files, and then parsing and running them one statement at a time from a Python program. This also enabled us to implement execution-time error logging.
In SQL scripts, to check for the row count of affected rows, a complicated join query against some system tables or views has to be executed.
Data Control Language (DCL) does not exist. No statements like IF, WHILE, DO, RAISERROR, etc.
On performance of views… Views do not “pass-through” a query parameter which is a potential problem for performance.
When selecting against a view with the WHERE clause outside of the view, the inner query of the view will be executed first without consideration for the WHERE clause, and only then the WHERE clause will be applied.
Certain clauses of SQL work many times faster than other clauses. So be careful and test your statements for performance earlier rather than later, especially if working with a large data set.
There was a situation when DELETE FROM JOIN was unacceptably slow. Replacing JOIN with the USING clause made DELETE instantaneous.
Overall it serves all our aspects of data management like data cleaning, data manipulation, and data reporting on the cloud platform. We can create stored procedures and triggers in it very easily as all the options are self suggested in it. We can easily attach the results of ARS to the other tools as well for drawing the statistical results.
The support was great and helped us in a timely fashion. We did use a lot of online forums as well, but the official documentation was an ongoing one, and it did take more time for us to look through it. We would have probably chosen a competitor product had it not been for the great support
It has the most stable database. It smoothly supports [a] huge [amount of] data. Security-wise it is best among them. It is a very large community and has well-defined documentation, which can support [during the] implementation of the Oracle database. We can easily scale up servers whenever [we] change our requirements. Oracle12c is best for large-scale projects like banking and retails.
We evaluated [Amazon] Redshift vs BigQuery vs Amazon EMR, back in 2014. Back then BigQuery cost was slightly higher than that of [Amazon] Redshift price structure. Amazon EMR, needs lots more management (Admin tasks) and EMR is designed to be ephemeral and not designed to be a data store. [Amazon] Redshift was ideal with the price structure, performance and ROI[.]