Ampool 1.3.1 is the first general release of Ampool, targeted at users looking for analytics on data generated by their applications. It is designed to sit close to applications and/or compute clusters, serving data for both analytics and low-latency applications.
What's New in 1.3.1¶
This release introduces a new immutable data abstraction, FTable. The highlights of this release are:
Core data store
- FTable: A new table type that supports Create, Append, and Scan operations.
- Local tier store for FTable, to 'tier' data to Ampool cluster local disks when data volume exceeds the available memory tier capacity.
- Support for HDFS as a tier store for FTable.
- Support for S3 as a tier store for FTable.
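As an illustration of the FTable create/append/scan cycle from MASH — the command names and flags below are assumptions for illustration only, not confirmed syntax; the Table Command Reference lists the actual commands:

```
mash> create table --name=events --type=IMMUTABLE --columns=ts,host,msg   # hypothetical flags
mash> tappend --table=/events --value="ts=...,host=...,msg=..."           # append a record (hypothetical syntax)
mash> tscan --table=/events                                               # scan the appended records
```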
Interfaces & APIs
- CLI (MASH): Added support for basic FTable operations (append and scan). Top-level commands were also refactored to use the generic 'table' prefix instead of 'mtable'. For a complete list, please refer to the Table Command Reference.
- Spark: Updated to Scala 2.11 and Spark 2.1.0.
- Hive: Added an option to store data in an FTable, which is now the default table type.
- Kafka: ampool-connect-kafka is a Kafka sink connector for storing data from Kafka topics into the corresponding Ampool table (MTable).
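For reference, a Kafka Connect sink configuration for ampool-connect-kafka might look like the sketch below. The standard Connect keys (`name`, `connector.class`, `tasks.max`, `topics`) are real; the `connector.class` value and the Ampool-specific property names are assumptions — see the connector package's README for the actual keys:

```
# connect-ampool-sink.properties (illustrative sketch)
name=ampool-sink
connector.class=io.ampool.kafka.connect.AmpoolSinkConnector   # assumed class name
tasks.max=1
topics=events
# hypothetical Ampool-specific settings
locator.host=localhost
locator.port=10334
ampool.tables=events
```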
In addition, several internal enhancements improve the performance of insert, update, and scan operations, and make data recovery more efficient for larger datasets.
The following packages are available for download from Ampool's (S3) website:
Ampool Base Package (ampool-1.3.1.tar.gz): Includes the Ampool core (MTable, FTable, coprocessors, local DiskStore for recovery, etc.) and core interfaces (MASH and Java API)
Ampool Compute Connectors:
Installing Ampool v1.3.1¶
Core Ampool Server & Locator¶
- Untar the binaries in a new installation directory. After extracting the contents from the package, you should see the following directory structure:
bin config docs examples lib tools
- To launch Ampool services, start the command-line utility MASH (Memory Analytics Shell) from the installed Ampool directory:

$ <ampool-home>/bin/mash
mash>

Type 'help' for a list of commands.
For a detailed explanation of ampool services and commands, please refer to the README within the main directory.
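Since Ampool is based on Apache Geode, the service-management commands can be expected to follow gfsh conventions; a typical startup sequence might look like the following sketch (the assumption that MASH mirrors gfsh here, as well as the names and ports, are illustrative):

```
mash> start locator --name=locator1 --port=10334
mash> start server --name=server1 --locators=localhost[10334]
```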
- Untar the Ampool connector packages (ampool-spark-1.3.1.tar.gz, ampool-hive-1.3.1.tar.gz, ampool-connect-kafka-1.3.1.tar.gz) on the Ampool client nodes.
- Refer to the README in each package to install and use the Ampool connectors with Spark, Hive, and Kafka.
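As a sketch of how the Spark connector might be wired in — the `format` string and option keys below are assumptions for illustration; consult the connector's README for the actual names:

```
$ spark-shell --jars <ampool-home>/lib/ampool-spark-1.3.1.jar
scala> val df = spark.read.format("io.ampool")            // format string is an assumption
         .option("ampool.locator.host", "localhost")      // hypothetical option keys
         .option("ampool.locator.port", "10334")
         .option("ampool.table", "events")
         .load()
```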
Upgrading to Ampool v1.3.1¶
Core Ampool Server & Locator¶
- Stop the existing Ampool server(s) and locator(s) using the MASH CLI.
- Untar the new Ampool core package and start the Ampool servers and locators using the new binaries.
- Make sure to provide the previous version's server and locator directories via the --dir option in the MASH CLI when starting the servers and locators.
- Untar the newer version of the connector packages and use them in place of the previous versions in the classpath when using Spark, Hive, and Kafka with the newer version of Ampool.
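The upgrade steps above might look like the following MASH sketch, assuming gfsh-style stop/start commands (the command flags and directory paths are illustrative assumptions):

```
mash> stop server --dir=/data/ampool/server1       # stop using the existing working directory
mash> stop locator --dir=/data/ampool/locator1
# start from the new binaries, pointing --dir at the previous version's directories
mash> start locator --name=locator1 --dir=/data/ampool/locator1
mash> start server --name=server1 --dir=/data/ampool/server1 --locators=localhost[10334]
```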
Resolved Issues & Improvements¶
| Issue Ref | Description |
|---|---|
| GEN-1047 | Feature request: MASH CLI command to export MTable data. |
| GEN-1565 | describe table for FTable mentions MTable. |
| GEN-1541 | ampool-core package has incorrect naming of jars. |
| GEN-1397 | FTable: support for expiration policy for tier-0 and tier-1. |
| GEN-1521 | Spark-Ampool connector: support Spark 2.1 and Scala 2.11. |
| GEN-1554 | Disable the check for existence of the same key within a batch. |
| GEN-1323 | Getting the reference to MTable/FTable from the client cache is costly; the instance needs to be cached. |
| GEN-1414 | Kafka connector to ingest data into Ampool MTable. |
| GEN-1499 | Integrate SharedStore as the tier-1 store. |
| GEN-1199 | MTable.checkAndPut(): define and implement API behavior for a non-existent rowKey. |
| GEN-1392 | FTable: design and develop HDFS as an archive tier. |
| GEN-1479 | TierStore: store implementation for local disk and HDFS. |
| GEN-1393 | FTable: support (insert) time-based partitioning for the local disk (ORC) tier. |
| GEN-1153 | MASH mscan on unordered MTable. |
| GEN-1333 | FTable: implement Hive connector. |
| GEN-1235 | For a persisted MTable, the number of records returned by a scan after cluster restart may differ from the number returned before the restart. |
| GEN-1228 | A scan operation may fail if a failover happens while the scan is running. |
Known Issues & Limitations¶
| Issue Ref | Description | Workaround (if any) |
|---|---|---|
| GEN-1161 | On local developer machines, if Ampool services are started without specifying a host, they may bind to the Wi-Fi address, which changes as the machine moves between networks. In such scenarios, reconnecting to the locator from MASH fails. | Manually kill the Ampool (locator and server) processes and restart the services. Alternatively, bind these services to localhost or a stable network interface. |
| Limitation | A coprocessor cannot be called on an empty table. | Endpoint coprocessor execution is not supported on an empty table. To check for an empty table, use the MTable.isEmpty() API. |
| GEN-1144 | The runExamples script fails against an already running Ampool cluster. | Stop the Ampool cluster before running the runExamples script. |
Versions & Compatibility¶
This distribution is based on the Apache Geode release 1.0.0-incubating.M3. The following table summarizes the minimum supported versions for the connectors:
| Connector | Supported Versions |
|---|---|
| Apache Hive | 0.14.0, 1.2.1 |
- Code examples: A set of code samples showcasing the table and coprocessor APIs can be found under