This class is a representation of a single scan.
#include <client.h>
- Note
- This class is not thread-safe, though different scanners on different threads may share a single KuduTable object.
Whether the rows should be returned in order.
This affects the fault-tolerance properties of a scanner.
Enumerator |
---|
UNORDERED |
Rows will be returned in an arbitrary order determined by the tablet server. This is efficient, but unordered scans are not fault-tolerant and cannot be resumed in the case of tablet server failure.
This is the default mode.
|
ORDERED |
Rows will be returned ordered by primary key. Sorting the rows imposes additional overhead on the tablet server, but means that scans are fault-tolerant and will be resumed at another tablet server in the case of a failure.
|
The read modes for scanners.
Enumerator |
---|
READ_LATEST |
When READ_LATEST is specified the server will always return committed writes at the time the request was received. This type of read does not return a snapshot timestamp and is not repeatable.
In ACID terms this corresponds to Isolation mode: "Read Committed"
This is the default mode.
|
READ_AT_SNAPSHOT |
When READ_AT_SNAPSHOT is specified the server will attempt to perform a read at the provided timestamp. If no timestamp is provided the server will take the current time as the snapshot timestamp. In this mode reads are repeatable, i.e. all future reads at the same timestamp will yield the same data. This is performed at the expense of waiting for in-flight transactions whose timestamp is lower than the snapshot's timestamp to complete, so it might incur a latency penalty. See KuduScanner::SetSnapshotMicros() and KuduScanner::SetSnapshotRaw() for details.
In ACID terms this, by itself, corresponds to Isolation mode "Repeatable Read". If all writes to the scanned tablet are made externally consistent, then this corresponds to Isolation mode "Strict-Serializable".
- Note
- There are currently "holes", which happen in rare edge conditions, by which writes are sometimes not externally consistent even when action was taken to make them so. In these cases Isolation may degenerate to mode "Read Committed". See KUDU-430.
|
READ_YOUR_WRITES |
When READ_YOUR_WRITES is specified, the client will perform a read such that it follows all previously known writes and reads from this client. Specifically this mode: (1) ensures read-your-writes and read-your-reads session guarantees, (2) minimizes latency caused by waiting for outstanding write transactions to complete.
Reads in this mode are not repeatable: two READ_YOUR_WRITES reads, even if they provide the same propagated timestamp bound, can execute at different timestamps and thus return different results.
|
kudu::client::KuduScanner::KuduScanner |
( |
KuduTable * |
table | ) |
|
|
explicit |
Constructor for KuduScanner.
- Parameters
-
[in] | table | The table to scan. The given object must remain valid for the lifetime of this scanner object. |
Add a predicate for the scan.
- Parameters
-
[in] | pred | Predicate to set. The scanner takes ownership of the parameter even if a bad Status is returned. Multiple calls of this method make the specified set of predicates work in conjunction, i.e. all predicates must be true for a row to be returned. |
- Returns
- Operation result status.
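The conjunction semantics described above can be illustrated with a small stand-alone model. This is only a sketch and does not use the Kudu client library: `Row`, `Predicate`, `MatchesAll`, and `FilterRows` are hypothetical names introduced for illustration, and real predicates are evaluated by the tablet server rather than in client code.

```cpp
#include <cassert>
#include <functional>
#include <vector>

// Hypothetical stand-in for a row; not a Kudu type.
struct Row {
  int id;
  int score;
};

// Hypothetical stand-in for a predicate: a function from a row to bool.
using Predicate = std::function<bool(const Row&)>;

// Returns true only if every predicate accepts the row (logical AND).
// An empty predicate list accepts every row.
bool MatchesAll(const std::vector<Predicate>& preds, const Row& row) {
  for (const Predicate& p : preds) {
    if (!p(row)) return false;
  }
  return true;
}

// Keeps the rows that satisfy all predicates, mimicking the set of rows a
// scan with several conjunct predicates would return.
std::vector<Row> FilterRows(const std::vector<Row>& rows,
                            const std::vector<Predicate>& preds) {
  std::vector<Row> out;
  for (const Row& r : rows) {
    if (MatchesAll(preds, r)) out.push_back(r);
  }
  return out;
}
```

In this model, adding a second predicate can only shrink the result set, which matches the "all predicates must be true" rule stated for the parameter.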
Add an upper bound (exclusive) primary key for the scan.
If any bound is already added, this bound is intersected with that one.
- Parameters
-
[in] | key | The key to set up the upper bound. The scanner makes a copy of the parameter, so the caller may free it afterward. |
- Returns
- Operation result status.
Status kudu::client::KuduScanner::AddExclusiveUpperBoundPartitionKeyRaw |
( |
const Slice & |
partition_key | ) |
|
Add an upper bound (exclusive) partition key for the scan.
- Note
- This method is unstable, and for internal use only.
- Parameters
-
[in] | partition_key | The scanner makes a copy of the parameter, the caller may invalidate it afterward. |
- Returns
- Operation result status.
Status kudu::client::KuduScanner::AddExclusiveUpperBoundRaw |
( |
const Slice & |
key | ) |
|
Add an upper bound (exclusive) primary key for the scan.
- Deprecated:
- Use AddExclusiveUpperBound() instead.
- Parameters
-
[in] | key | The encoded primary key is an opaque slice of data. |
- Returns
- Operation result status.
Add a lower bound (inclusive) primary key for the scan.
If any bound is already added, this bound is intersected with that one.
- Parameters
-
[in] | key | Lower bound primary key to add. The scanner does not take ownership of the parameter. |
- Returns
- Operation result status.
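The intersection rule for bounds (inclusive lower, exclusive upper) can be sketched with a stand-alone model. This is not the Kudu implementation: `KeyRange` is a hypothetical type, and it compares plain strings, whereas the real scanner works with encoded primary keys.

```cpp
#include <cassert>
#include <string>

// Hypothetical model of a primary-key range; keys are plain strings here.
// The lower bound is inclusive and the upper bound is exclusive. An unset
// bound leaves the range open on that side.
struct KeyRange {
  bool has_lower = false;
  bool has_upper = false;
  std::string lower;  // inclusive
  std::string upper;  // exclusive

  // Intersects with any existing lower bound: the larger key wins.
  void AddLowerBound(const std::string& key) {
    if (!has_lower || key > lower) {
      lower = key;
      has_lower = true;
    }
  }

  // Intersects with any existing upper bound: the smaller key wins.
  void AddExclusiveUpperBound(const std::string& key) {
    if (!has_upper || key < upper) {
      upper = key;
      has_upper = true;
    }
  }

  // True if the key falls inside [lower, upper).
  bool Contains(const std::string& key) const {
    if (has_lower && key < lower) return false;
    if (has_upper && key >= upper) return false;
    return true;
  }
};
```

Under this model, repeated calls can only narrow the range: a later, looser bound leaves the earlier, tighter one in place.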
Status kudu::client::KuduScanner::AddLowerBoundPartitionKeyRaw |
( |
const Slice & |
partition_key | ) |
|
Add a lower bound (inclusive) partition key for the scan.
- Note
- This method is unstable, and for internal use only.
- Parameters
-
[in] | partition_key | The scanner makes a copy of the parameter: the caller may invalidate it afterward. |
- Returns
- Operation result status.
Status kudu::client::KuduScanner::AddLowerBoundRaw |
( |
const Slice & |
key | ) |
|
Add lower bound for the scan.
- Deprecated:
- Use AddLowerBound() instead.
- Parameters
-
[in] | key | The primary key to use as an opaque slice of data. |
- Returns
- Operation result status.
void kudu::client::KuduScanner::Close |
( |
| ) |
|
Close the scanner.
Closing the scanner releases resources on the server. This call does not block, and will not ever fail, even if the server cannot be contacted.
- Note
- The scanner is reset to its initial state by this function. You'll have to re-add any projection, predicates, etc. if you want to reuse this object.
Get the KuduTabletServer that is currently handling the scan.
More concretely, this is the server that handled the most recent Open() or NextBatch() RPC made by the scanner.
- Parameters
-
[out] | server | Placeholder for the result. |
- Returns
- Operation result status.
KuduSchema kudu::client::KuduScanner::GetProjectionSchema |
( |
| ) |
const |
- Returns
- Schema of the projection being scanned.
const ResourceMetrics& kudu::client::KuduScanner::GetResourceMetrics |
( |
| ) |
const |
- Returns
- Cumulative resource metrics since the scan was started.
bool kudu::client::KuduScanner::HasMoreRows |
( |
| ) |
const |
Check if there may be rows to be fetched from this scanner.
- Returns
- true if there may be rows to be fetched from this scanner. The method returns true provided there's at least one more tablet left to scan, even if that tablet has no data (we'll only know once we scan it). It will also be true after initially opening the scanner, before NextBatch() is called for the first time.
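One consequence of this contract is that a scan loop has to tolerate empty batches. The sketch below uses a hypothetical in-memory `FakeScanner` (not the Kudu client class) in which each inner vector stands for one tablet's rows; `ScanTotals` and `DrainScanner` are likewise illustrative names.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical in-memory stand-in for a scanner; not the Kudu client class.
// Each inner vector plays the role of one tablet's rows.
class FakeScanner {
 public:
  explicit FakeScanner(std::vector<std::vector<int>> tablets)
      : tablets_(std::move(tablets)) {}

  // True while at least one tablet has not been scanned yet, even if that
  // tablet turns out to be empty.
  bool HasMoreRows() const { return next_tablet_ < tablets_.size(); }

  // Replaces the contents of 'batch' with the next tablet's rows.
  void NextBatch(std::vector<int>* batch) {
    assert(batch != nullptr);
    batch->clear();
    if (next_tablet_ < tablets_.size()) {
      *batch = tablets_[next_tablet_++];
    }
  }

 private:
  std::vector<std::vector<int>> tablets_;
  std::size_t next_tablet_ = 0;
};

struct ScanTotals {
  std::size_t rows;
  std::size_t batches;
};

// Drains the scanner, counting rows and batches. An empty batch is not
// treated as the end of the scan; only HasMoreRows() decides that.
ScanTotals DrainScanner(FakeScanner* scanner) {
  ScanTotals totals = {0, 0};
  std::vector<int> batch;  // reused across calls, like a single batch object
  while (scanner->HasMoreRows()) {
    scanner->NextBatch(&batch);
    totals.rows += batch.size();
    ++totals.batches;
  }
  return totals;
}
```

The loop condition is driven solely by `HasMoreRows()`, so a tablet with no matching rows simply contributes a batch of size zero.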
Status kudu::client::KuduScanner::KeepAlive |
( |
| ) |
|
Keep the current remote scanner alive.
Keep the current remote scanner alive on the tablet server for an additional time-to-live. This is useful if the interval between NextBatch() calls is long enough that the remote scanner might be garbage collected. The scanner time-to-live can be configured on the tablet server via the --scanner_ttl_ms configuration flag and has a default of 60 seconds.
This does not invalidate any previously fetched results.
- Returns
- Operation result status. In particular, this method returns a non-OK status if the scanner was already garbage collected or if the TabletServer was unreachable, for any reason. Note that a non-OK status returned by this method should not be taken as indication that the scan has failed. Subsequent calls to NextBatch() might still be successful, particularly if the scanner is configured to be fault tolerant.
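A caller that processes batches slowly needs some rule for when to issue a keep-alive. The sketch below shows one possible rule and is an assumption, not part of the Kudu API: `ShouldSendKeepAlive` is a hypothetical helper that asks for a keep-alive once half of the time-to-live has passed since the last scanner RPC. The 60-second default is taken from the description above; the one-half threshold is an arbitrary choice for illustration.

```cpp
#include <cassert>
#include <chrono>

// Default scanner time-to-live stated in the text above (60 seconds).
constexpr std::chrono::milliseconds kDefaultScannerTtl(60000);

// Hypothetical policy, not part of the Kudu API: request a keep-alive once
// at least half of the time-to-live has elapsed since the last scanner RPC.
bool ShouldSendKeepAlive(std::chrono::milliseconds since_last_rpc,
                         std::chrono::milliseconds ttl = kDefaultScannerTtl) {
  assert(ttl.count() > 0);
  return since_last_rpc >= ttl / 2;
}
```

Because a non-OK status from KeepAlive() does not mean the scan has failed, a caller using such a helper would typically log the status and continue with NextBatch().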
Status kudu::client::KuduScanner::NextBatch |
( |
std::vector< KuduRowResult > * |
rows | ) |
|
Get next batch of rows.
Clears 'rows' and populates it with the next batch of rows from the tablet server. A call to NextBatch() invalidates all previously fetched results which might now be pointing to garbage memory.
- Deprecated:
- Use NextBatch(KuduScanBatch*) instead.
- Parameters
-
[out] | rows | Placeholder for the result. |
- Returns
- Operation result status.
Fetch the next batch of results for this scanner.
A single KuduScanBatch object may be reused. Each subsequent call replaces the data from the previous call, and invalidates any KuduScanBatch::RowPtr objects previously obtained from the batch.
- Parameters
-
[out] | batch | Placeholder for the result. |
- Returns
- Operation result status.
Status kudu::client::KuduScanner::Open |
( |
| ) |
|
- Returns
- Result status of the operation (begin scanning).
Status kudu::client::KuduScanner::SetBatchSizeBytes |
( |
uint32_t |
batch_size | ) |
|
Set the hint for the size of the next batch in bytes.
- Parameters
-
[in] | batch_size | The hint of batch size to set. If setting to 0 before calling Open(), it means that the first call to the tablet server won't return data. |
- Returns
- Operation result status.
Status kudu::client::KuduScanner::SetCacheBlocks |
( |
bool |
cache_blocks | ) |
|
Set the block caching policy.
- Parameters
-
[in] | cache_blocks | If true, scanned data blocks will be cached in memory and made available for future scans. Default is true. |
- Returns
- Operation result status.
Status kudu::client::KuduScanner::SetFaultTolerant |
( |
| ) |
|
Make scans resumable at another tablet server if current server fails.
Scans are by default non fault-tolerant, and scans will fail if scanning an individual tablet fails (for example, if a tablet server crashes in the middle of a tablet scan). If this method is called, scans will be resumed at another tablet server in the case of failure.
Fault-tolerant scans typically have lower throughput than non fault-tolerant scans. Fault-tolerant scans use READ_AT_SNAPSHOT mode: if no snapshot timestamp is provided, the server will pick one.
- Returns
- Operation result status.
Status kudu::client::KuduScanner::SetLimit |
( |
int64_t |
limit | ) |
|
Set the maximum number of rows the scanner should return.
- Parameters
-
[in] | limit | Limit on the number of rows to return. |
- Returns
- Operation result status.
Status kudu::client::KuduScanner::SetProjectedColumnIndexes |
( |
const std::vector< int > & |
col_indexes | ) |
|
Set the column projection by passing the column indexes to read.
Set the column projection used for this scanner by passing the column indices to read. A call to this method overrides any previous call to SetProjectedColumnNames() or SetProjectedColumnIndexes().
- Parameters
-
[in] | col_indexes | Column indices for the projection. |
- Returns
- Operation result status.
Status kudu::client::KuduScanner::SetProjectedColumnNames |
( |
const std::vector< std::string > & |
col_names | ) |
|
Set the projection for the scanner using column names.
Set the projection used for the scanner by passing column names to read. This overrides any previous call to SetProjectedColumnNames() or SetProjectedColumnIndexes().
- Parameters
-
[in] | col_names | Column names to use for the projection. |
- Returns
- Operation result status.
Status kudu::client::KuduScanner::SetProjectedColumns |
( |
const std::vector< std::string > & |
col_names | ) |
|
Set the ReadMode. Default is READ_LATEST.
- Parameters
-
[in] | read_mode | Read mode to set. |
- Returns
- Operation result status.
Status kudu::client::KuduScanner::SetRowFormatFlags |
( |
uint64_t |
flags | ) |
|
Optionally set row format modifier flags.
If flags is RowFormatFlags::NO_FLAGS, then no modifications will be made to the row format and the default will be used.
Some flags require server-side support, thus the caller should be prepared to handle a NotSupported status in Open() and NextBatch().
Example usage (without error handling, for brevity):
    scanner.SetRowFormatFlags(row_format_flags);
    scanner.Open();
    while (scanner.HasMoreRows()) {
      KuduScanBatch batch;
      scanner.NextBatch(&batch);
      Slice direct_data = batch.direct_data();
      Slice indirect_data = batch.indirect_data();
      ...
    }
- Parameters
-
[in] | flags | Row format modifier flags to set. |
- Returns
- Operation result status.
Set the replica selection policy while scanning.
- Parameters
-
[in] | selection | The policy to set. |
- Returns
- Operation result status.
- Todo:
- Kill this method in favor of a consistency-level-based API.
Status kudu::client::KuduScanner::SetSnapshotMicros |
( |
uint64_t |
snapshot_timestamp_micros | ) |
|
Set snapshot timestamp for scans in READ_AT_SNAPSHOT mode.
- Parameters
-
[in] | snapshot_timestamp_micros | Timestamp to set, in microseconds since the Epoch. |
- Returns
- Operation result status.
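The timestamp is expressed in microseconds since the Epoch, which can be derived from `std::chrono`. The helpers below are a sketch: `ToEpochMicros` and `SnapshotMicrosAgo` are hypothetical names, and the code assumes that `std::chrono::system_clock` counts from the Unix epoch (guaranteed from C++20, and the case on mainstream platforms before that).

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>

// Converts a system_clock time point to microseconds since the Unix epoch.
// Assumes system_clock's epoch is the Unix epoch (see the note above).
uint64_t ToEpochMicros(std::chrono::system_clock::time_point tp) {
  return static_cast<uint64_t>(
      std::chrono::duration_cast<std::chrono::microseconds>(
          tp.time_since_epoch())
          .count());
}

// Hypothetical helper: a snapshot timestamp for "now minus the given age".
uint64_t SnapshotMicrosAgo(std::chrono::seconds age) {
  return ToEpochMicros(std::chrono::system_clock::now() - age);
}
```

The resulting value is what one would pass as `snapshot_timestamp_micros`, for example `SnapshotMicrosAgo(std::chrono::seconds(5))` to read the table as of roughly five seconds ago.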
Status kudu::client::KuduScanner::SetSnapshotRaw |
( |
uint64_t |
snapshot_timestamp | ) |
|
Set snapshot timestamp for scans in READ_AT_SNAPSHOT mode (raw).
- Note
- This method is experimental and will either disappear or change in a future release.
- Parameters
-
[in] | snapshot_timestamp | Timestamp to set in raw encoded form (i.e. as returned by a previous call to a server). |
- Returns
- Operation result status.
Status kudu::client::KuduScanner::SetTimeoutMillis |
( |
int |
millis | ) |
|
Set the maximum time that Open() and NextBatch() are allowed to take.
- Parameters
-
[in] | millis | Timeout to set (in milliseconds). Must be greater than 0. |
- Returns
- Operation result status.
std::string kudu::client::KuduScanner::ToString |
( |
| ) |
const |
- Returns
- String representation of this scan.
const uint64_t kudu::client::KuduScanner::NO_FLAGS = 0 |
|
static |
Modifier flags for the row format returned from the server.
- Note
- Each flag corresponds to a bit that gets set on a bitset that is sent to the server. See SetRowFormatFlags() for example usage.
const uint64_t kudu::client::KuduScanner::PAD_UNIXTIME_MICROS_TO_16_BYTES = 1 << 0 |
|
static |
Makes the server pad UNIXTIME_MICROS slots to 16 bytes.
- Note
- This flag actually wastes throughput by making messages larger than they need to be. It exists merely for compatibility reasons and requires the user to know the row format in order to decode the data. That is, if this flag is enabled, the user must use KuduScanBatch::direct_data() and KuduScanBatch::indirect_data() to obtain the row data for further decoding. Using KuduScanBatch::Row() might yield incorrect/corrupt results and might even cause the client to crash.
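Since each flag corresponds to one bit, the flags can be combined and tested with ordinary bitwise operations. The sketch below redeclares the two values listed above under hypothetical local names (`kNoFlags`, `kPadUnixtimeMicrosTo16Bytes`) so that it compiles without the Kudu headers; `WithFlag` and `HasFlag` are illustrative helpers, not library functions.

```cpp
#include <cassert>
#include <cstdint>

// Local copies of the two values documented above; the names are
// hypothetical so the sketch does not depend on the Kudu headers.
const uint64_t kNoFlags = 0;
const uint64_t kPadUnixtimeMicrosTo16Bytes = 1 << 0;

// Returns 'flags' with the bit(s) of 'flag' set.
uint64_t WithFlag(uint64_t flags, uint64_t flag) { return flags | flag; }

// Returns true if any bit of 'flag' is set in 'flags'.
bool HasFlag(uint64_t flags, uint64_t flag) { return (flags & flag) != 0; }
```

A value built this way corresponds to the `flags` argument of SetRowFormatFlags(); starting from `kNoFlags` leaves the default row format unchanged.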