Big Data Pipeline Recipe

Author: allenhaozi | Published: 2020-11-01 15:00

    To summarize, the databases and storage options outside the Hadoop ecosystem worth considering are:

    • Cassandra:
      NoSQL database that can store large amounts of data and offers eventual consistency and many configuration options.
      Great for OLTP, but it can also serve OLAP workloads through precomputed aggregations, which are not flexible. An alternative is ScyllaDB, which is much faster and, thanks to its advanced scheduler, better suited to OLAP.
    • YugabyteDB:
      Massive-scale relational database that can handle global transactions. Your best option for relational data.

    • MongoDB: Powerful document-based NoSQL database that can be used for ingestion (temporary storage) or as a fast data layer for your dashboards.

    • InfluxDB for time series data.

    • Prometheus for monitoring data.

    • Elasticsearch: Distributed inverted index that can store large amounts of data. It is often overlooked or used only for log storage, yet it covers a wide range of use cases, including OLAP analysis, machine learning, log storage and unstructured data storage. Definitely a tool to have in your Big Data ecosystem.
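
The Cassandra entry above describes OLAP through precomputed aggregations as "not flexible". A minimal sketch of why, in plain Python rather than against a real database: the event tuples, `build_daily_totals` and the other names are hypothetical and only illustrate the idea. An aggregate maintained per day answers per-day questions with a single key lookup, but a question along another dimension has to go back to the raw events.

```python
from collections import defaultdict

# Hypothetical raw events: (day, country, amount). In a write-optimized
# store these would be individual rows written at ingestion time.
EVENTS = [
    ("2020-11-01", "US", 10.0),
    ("2020-11-01", "DE", 5.0),
    ("2020-11-01", "US", 2.5),
    ("2020-11-02", "US", 4.0),
]


def build_daily_totals(events):
    """Maintain one running total per day, as a precomputed aggregate would."""
    totals = defaultdict(float)
    for day, _country, amount in events:
        totals[day] += amount
    return dict(totals)


def total_for_day(totals, day):
    """The query the aggregate was designed for: a single key lookup."""
    return totals.get(day, 0.0)


def total_for_country(events, country):
    """A query the per-day aggregate cannot answer; it rescans raw events."""
    return sum(amount for _day, c, amount in events if c == country)


daily_totals = build_daily_totals(EVENTS)
print(total_for_day(daily_totals, "2020-11-01"))  # 17.5
print(total_for_country(EVENTS, "US"))            # 16.5
```

Supporting the per-country question cheaply would mean maintaining a second aggregate keyed by country, decided before the data arrives. That up-front commitment is the inflexibility the list refers to.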

    Article link: https://www.haomeiwen.com/subject/qeayvktx.html