Vertica vs RDBMS and noSQL database cases - McKAY brothers, multimedia emulation and support

2023/06/01

Vertica vs RDBMS and noSQL database cases

 

Document/key-value databases are typically good for unstructured/"schemaless" data: the cases where you don't need to explicitly define your schema up front and can just add new fields without any ceremony. Most modern fiscal printers use this kind of storage on the client machine.
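As a rough illustration of "new fields without any ceremony", here is a minimal sketch that uses a plain Python list of dicts as a stand-in for a document store. The `receipts` collection, the field names and the `field_or_default` helper are all hypothetical and are not taken from any real fiscal printer or database API.

```python
import json

# Hypothetical in-memory "collection" standing in for a document store.
receipts = []

# The first version of the application stores only two fields.
receipts.append({"id": 1, "total": 12.50})

# A later version adds a field. No ALTER TABLE or migration is needed,
# the new document simply carries one more key.
receipts.append({"id": 2, "total": 8.00, "tax_code": "G"})


def field_or_default(doc, name, default=None):
    """Readers must tolerate documents written before a field existed."""
    return doc.get(name, default)


tax_codes = [field_or_default(r, "tax_code", "unknown") for r in receipts]

# Documents are usually persisted as serialized blobs, for example JSON.
serialized = [json.dumps(r, sort_keys=True) for r in receipts]
```

The price of this flexibility is that the schema moves into the application: every reader has to handle old documents that lack the newer fields.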

In recent years this kind of database has boomed. Previously we benefited from LevelDB, but that engine was very unstable after a crash.

It's often very easy to scale out document/key-value databases. Adding more instances of the same structure (known as nodes) to replicate data to is one way to offer more scalability and more protection against data loss.
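The replication idea can be sketched in a few lines. This is a toy model under simple assumptions (every write goes to every live node, reads take the first live copy), and the `Node` and `ReplicatedStore` classes are hypothetical names, not the API of any real database.

```python
class Node:
    """One replica: a plain dict standing in for a key-value engine."""

    def __init__(self, name):
        self.name = name
        self.data = {}
        self.alive = True


class ReplicatedStore:
    """Writes every key to all live nodes and reads from the first live copy."""

    def __init__(self, nodes):
        self.nodes = nodes

    def put(self, key, value):
        written = 0
        for node in self.nodes:
            if node.alive:
                node.data[key] = value
                written += 1
        return written  # number of copies actually stored

    def get(self, key):
        for node in self.nodes:
            if node.alive and key in node.data:
                return node.data[key]
        raise KeyError(key)


store = ReplicatedStore([Node("a"), Node("b"), Node("c")])
copies = store.put("receipt:1", {"total": 12.5})

# Simulate losing one node: the data is still readable from the others.
store.nodes[0].alive = False
survivor_value = store.get("receipt:1")
```

Real systems add consistency rules, quorum reads and re-synchronization of failed nodes, which this sketch leaves out on purpose.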

Otherwise, complex/dynamic queries/reporting are best served from an RDBMS. Often...

...

Introduction to Vertica DB and document-based DB

Vertica stands out in the database landscape for two things, which are the first things technology managers really see: its level of data compression (both in storage and in transmission) and the semi-free service it offers. But all of that is great only if the company implementing the technology is itself already a technology company.

What if the company trying to build new write-heavy software with high-volume data storage isn't primarily focused on technology and can't spend that much on such a thing? Besides, part of the information is not entirely true: Vertica is not open source per se. We also have to take into account today's world of compulsive political tension!

Other DBMSs: Percona MySQL with XtraDB and MyRocks

You may think of Percona as a distributor that collects, coordinates and maintains patches and distributes an enhanced version of the MySQL server and, nowadays, of other databases such as PostgreSQL and MongoDB.

Built on RocksDB, MyRocks is a storage engine shipped with Percona Server that lets you use SQL requests and clients from within the MySQL DBMS. On SSD storage this means less space used and higher endurance of the storage over time, but performance suffers if the underlying hardware is not SSD, so RocksDB needs several key OS and hardware features to deliver results.

RocksDB itself is a library maintained by the Facebook Database Engineering Team. It is a fork of Google's LevelDB, optimized to exploit many CPU cores and to make efficient use of fast storage, such as solid-state drives (SSD), for input/output (I/O) bound workloads.

XtraDB is InnoDB on steroids. The patches themselves stem from Google, Facebook and others. XtraDB scales better on massively parallel architectures and is especially well suited to write-heavy workloads. Unlike RocksDB, it is well suited to both HDD and SSD, but on the latter the workload will shorten the life of the SSD. XtraDB includes features such as crash-safe replication, online backups, and hot backups that are not available in InnoDB.

However, it's also important to note that older versions of XtraDB, like the InnoDB releases they were based on, lacked some features, such as full-text searching (added to InnoDB in MySQL 5.6) or spatial indexing (added in MySQL 5.7).

Considerations

Considerations for Percona flavours

Percona XtraDB is compatible with InnoDB by default. You can read and write the same datafiles, and all SQL queries run exactly the same. You won’t even notice the difference in basics.

Percona PostgreSQL does not differ much from stock PostgreSQL: it just offers better integration and ease of use for administrators. This is because PostgreSQL is already so complex and so scalable.

The improvements in Percona MySQL XtraDB are subtle. They are internal fixes to solve specific scaling bottlenecks. These bottlenecks don’t necessarily affect your applications or environment, in which case Percona XtraDB would function exactly like stock InnoDB. Some of the enhancements in Percona XtraDB proved themselves useful, so later versions of Oracle MySQL and MariaDB implemented them, and today XtraDB and InnoDB are very similar.

  • Buffer pool mutex split into four sub-types of mutex, to reduce contention when you have a high number of concurrent clients.
  • Insert buffer options for max size and merge rate. Good when you have lots of indexes and a very high rate of insert/update/delete operations.
  • Adaptive hash may be split into multiple partitions. Good if you have a high number of threads running concurrent queries over non-primary indexes, so much that it’s causing contention on the Adaptive Hash Index mutex.
  • Faster page checksum algorithm. Good if you have a high rate of page flushes on SSD storage. This feature is obsolete in MySQL 5.6.
  • Handle corrupt tables by issuing a warning and marking the table unusable, instead of the default behavior of deliberately crashing the MySQL server.

Considerations for Vertica DB

The biggest advantage of Vertica is its raw speed. It is extremely fast compared to other analytical databases, and it boasts features that make joins extremely fast. The trade-offs are that it is not focused on being either relational or non-relational (check explanation), the high license costs, slow updates and an insatiable need to eat through tons of disk space.

Vertica never overwrites the data file on updates, so every time you update, a new OS-level write happens. This is a downside of the key-value storage philosophy, and for relational data it is not a good option.
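To make the cost of "never overwrite" concrete, here is a toy model of an append-only table. It is only a sketch of the behavior described above, not Vertica's actual storage format, and the `AppendOnlyTable` class and its methods are hypothetical names.

```python
class AppendOnlyTable:
    """Toy append-only storage: an update never touches an existing row."""

    def __init__(self):
        self.rows = []        # every version ever written, in write order
        self.deleted = set()  # positions of versions that were superseded
        self.latest = {}      # key -> position of the current version

    def upsert(self, key, value):
        old = self.latest.get(key)
        if old is not None:
            # The old version is only marked as deleted, it stays on disk.
            self.deleted.add(old)
        self.rows.append((key, value))
        self.latest[key] = len(self.rows) - 1

    def get(self, key):
        return self.rows[self.latest[key]][1]

    def stored_versions(self):
        return len(self.rows)


table = AppendOnlyTable()
table.upsert("sku-1", 10)
table.upsert("sku-1", 11)
table.upsert("sku-1", 12)

current = table.get("sku-1")
versions_on_disk = table.stored_versions()
```

Three updates to one logical row leave three physical versions behind until some background cleanup reclaims them, which is why frequent updates are slow and hungry for disk space in this model.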

Vertica should not be used to replace your relational OLTP database. It is there to do the heavy lifting and help you do the analytics work at less expense (time, money). Vertica is more for BI than for information storage.
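The difference between the two workloads can be sketched with the same data held in two layouts. This is only an illustration in plain Python. The sample orders and the function names are hypothetical, and no real engine is this simple.

```python
# Row layout: each record is kept together, which suits OLTP lookups.
rows = [
    {"id": 1, "product": "printer", "qty": 2, "price": 100.0},
    {"id": 2, "product": "paper", "qty": 10, "price": 4.0},
    {"id": 3, "product": "printer", "qty": 1, "price": 100.0},
]

# Column layout: one list per column, which suits analytic scans.
columns = {name: [r[name] for r in rows] for name in rows[0]}


def fetch_order(order_id):
    """OLTP style: return one whole record."""
    return next(r for r in rows if r["id"] == order_id)


def total_quantity():
    """Analytic style: only the 'qty' column is read."""
    return sum(columns["qty"])


def revenue():
    """Analytic style: two columns are read, the rest are never touched."""
    return sum(q * p for q, p in zip(columns["qty"], columns["price"]))
```

Fetching or changing a single order is natural in the row layout, while sums over millions of rows only need to read the columns involved in the column layout.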

Vertica has both compressed data and encoded data. Compressed data requires some extra CPU cycles when it is retrieved, but most of the time Vertica uses encoding, as the following source describes: http://www.aodba.com/tut_output_mysql.php?tut=6&page=vertica. Encoding creates a smaller footprint, and thanks to this, data retrieval is faster.
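The difference between encoding and compression can be shown with run-length encoding (RLE), one well-known column encoding. The sketch below is a simplified illustration, not Vertica's implementation. The sample column and the helper names are hypothetical, and `zlib` stands in for a generic compressor.

```python
import zlib
from itertools import groupby

# A sorted, low-cardinality column, the ideal case for run-length encoding.
column = ["DE"] * 5 + ["ES"] * 3 + ["VE"] * 4


def rle_encode(values):
    """Collapse consecutive repeats into (value, run_length) pairs."""
    return [(value, len(list(group))) for value, group in groupby(values)]


def count_equal(runs, target):
    """Answer a COUNT-style query directly on the encoded form."""
    return sum(length for value, length in runs if value == target)


runs = rle_encode(column)
es_rows = count_equal(runs, "ES")

# Generic compression also shrinks the data, but it has to be fully
# decompressed (extra CPU cycles) before any value can be inspected.
compressed = zlib.compress("\n".join(column).encode())
restored = zlib.decompress(compressed).decode().split("\n")
```

The query on the encoded column never expands the runs, while the compressed copy has to be restored first. That is the intuition behind encoded data being cheaper to read than compressed data.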

For inventory software or a product catalog it is not the right choice. Vertica is basically a BI tool.

Vertica is extremely expensive. It is not open source, and the so-called open source version does not have all of these key features. Prices started at about $100K per terabyte, and since it became popular it now costs $10K per terabyte. That is still expensive: a fully protected server in Germany, including several terabytes, costs less than $500 per month.

The lie of Vertica open source

The Community version only allows you to create a 3-node cluster with a maximum of 1 TB (no backup is available, and other things are not possible either).

Just to shed some light on the Vertica licenses:

  • All development, homologation (staging) or disaster recovery sites are covered by the initial production license, at no extra cost.
  • Raw data size based (valid to store up to a given amount of raw data). The license applies only to the raw data loaded. Once loaded, it can be replicated as many times as you like.
  • Term-based and periodic (valid until a specific date, monthly for example).

Conclusions

SQL performs well as a transaction processing system, but it works horribly when you query it for reporting/analytic purposes. That is the case Vertica addresses. Vertica and every other column-store-capable database (Vertica is primarily an RDBMS, not column/key-based) have advantages there, but not as a substitute for a transactional database.

You should choose Vertica if you need reporting and querying features, mostly those needed for business intelligence, but not for PIM software, a product management database or plain data storage.

This means any column/key-based DB is just complementary to an RDBMS, not a substitute for it, and it is the reason why Vertica really is still an RDBMS and not a column/key-only DB.
