Interface SnowflakeStreamingIngestChannel
- All Superinterfaces:
AutoCloseable
-
Method Summary
Modifier and TypeMethodDescriptionvoidInsert one row into the channel, the row is represented using a Map which maps Column Name to Column Value for the columns in the table.voidInsert a batch of rows into the channel, each row is represented using a Map which maps Column Name to Column Value for the columns in the table.voidclose()Close the channel, this function will make sure all the data in this channel is flushed to the snowflake sever sidevoidClose the channel.Get the channel nameGet the channel status for the current channel from Snowflake.Get the database nameGet the fully qualified channel nameGet the fully qualified pipe name that the channel belongs toGet the latest committed offset token for the current channel in Snowflake.Get the pipe nameGet the schema namevoidInitiates a flush of all buffered data maintained for this Channel but does not wait for the flush to complete.booleanisClosed()Check if the channel is closedwaitForCommit(Predicate<String> tokenChecker, Duration timeoutDuration) Asynchronously waits for offset token to be committed in the snowflake sever side by checking whether the latest committed offset token meets the commit condition provided by the tokenChecker.waitForFlush(Duration timeoutDuration) Asynchronously waits for all buffered data in this channel to be flushed to the Snowflake server side.
-
Method Details
-
close
void close()Close the channel, this function will make sure all the data in this channel is flushed to the snowflake sever side- Specified by:
closein interfaceAutoCloseable
-
close
Close the channel. If waitForFlush=true this function will make sure all the data in this channel is flushed to the snowflake sever side or return a timeout error if the timeout is reached.- Parameters:
waitForFlush- whether to wait for all data to be flushedtimeoutDuration- The maximum time for close to wait- Throws:
TimeoutException- if the timeout is reached before the flush completes
-
appendRow
Insert one row into the channel, the row is represented using a Map which maps Column Name to Column Value for the columns in the table. The following table summarizes supported value types and their formats:Supported Value Types and Formats for Ingestion Columns Snowflake Column Type Allowed Java Data Type CHAR, VARCHAR, TEXT, STRING - String
- primitive data types (int, boolean, char, …)
BINARY - byte[]
- String (hex-encoded)
NUMBER, FLOAT - numeric types (BigInteger, BigDecimal, byte, int, double, …)
- String
BOOLEAN - boolean
- numeric types (BigInteger, BigDecimal, byte, int, double, …)
- String
TIME LocalTimeOffsetTime-
String (in one of the following formats):
DateTimeFormatter.ISO_LOCAL_TIMEDateTimeFormatter.ISO_OFFSET_TIME- Integer-stored time (see Snowflake Docs for more details)
DATE LocalDateLocalDateTimeOffsetDateTimeZonedDateTimeInstant-
String (in one of the following formats):
DateTimeFormatter.ISO_LOCAL_DATEDateTimeFormatter.ISO_LOCAL_DATE_TIMEDateTimeFormatter.ISO_OFFSET_DATE_TIMEDateTimeFormatter.ISO_ZONED_DATE_TIME- Integer-stored date (see Snowflake Docs for more details)
TIMESTAMP_NTZ, TIMESTAMP_LTZ, TIMESTAMP_TZ LocalDateLocalDateTimeOffsetDateTimeZonedDateTimeInstant-
String (in one of the following formats):
DateTimeFormatter.ISO_LOCAL_DATEDateTimeFormatter.ISO_LOCAL_DATE_TIMEDateTimeFormatter.ISO_OFFSET_DATE_TIMEDateTimeFormatter.ISO_ZONED_DATE_TIME- Integer-stored timestamp (see Snowflake Docs for more details)
For TIMESTAMP_LTZ and TIMESTAMP_TZ, all input without timezone will be by default interpreted in the timezone "America/Los_Angeles".
VARIANT, ARRAY - String (must be a valid JSON value)
- primitive data types and their arrays
- BigInteger, BigDecimal
LocalDateLocalDateTimeOffsetDateTimeZonedDateTime- Map<String, T> where T is a valid VARIANT type
- T[] where T is a valid VARIANT type
- List<T> where T is a valid VARIANT type
OBJECT - String (must be a valid JSON object)
- Map<String, T> where T is a valid variant type
GEOGRAPHY, GEOMETRY String - Parameters:
row- object data to write. For predictable results, we recommend not to concurrently modify the input row data.offsetToken- offset of given row, used to track the ingestion progress and replay ingestion in case of failures. It could be null if you don't plan on replaying or can't replay. Please see https://docs.snowflake.com/en/user-guide/data-load-snowpipe-streaming-overview#offset-tokens for more details.
-
appendRows
void appendRows(@Nonnull Iterable<Map<String, Object>> rows, @Nullable String startOffsetToken, @Nullable String endOffsetToken) Insert a batch of rows into the channel, each row is represented using a Map which maps Column Name to Column Value for the columns in the table. SeeappendRow(Map, String)for more information about accepted values.- Parameters:
rows- object data to writestartOffsetToken- start offset of the batch/row-set. SeeappendRow(Map, String)for more information about offset tokens.endOffsetToken- end offset of the batch/row-set. SeeappendRow(Map, String)for more information about offset tokens.
-
getLatestCommittedOffsetToken
Get the latest committed offset token for the current channel in Snowflake.- Returns:
- the latest committed offset token
-
getChannelStatus
ChannelStatus getChannelStatus()Get the channel status for the current channel from Snowflake.- Returns:
- the channel status
-
waitForCommit
Asynchronously waits for offset token to be committed in the snowflake sever side by checking whether the latest committed offset token meets the commit condition provided by the tokenChecker. Note that snowflake commits offset token in batch, so the tokenChecker should be able to handle the case where the latest committed offset token passed the expected ones. That said, the tokenChecker usually does a range check whether the provided token is greater or equal to the expected one, not a exact match.Cancellation: The returned
CompletableFuturecan be cancelled, which will stop further polling iterations and complete the future exceptionally.- Parameters:
tokenChecker- A predicate that tests whether the current committed offset token from the server meets the desired condition. The predicate receives the latest committed offset token (which may benull) and should returntruewhen the wait condition is satisfied.timeoutDuration- The maximum time to wait for the condition to be met.- Returns:
- A
CompletableFuture<Boolean>that completes with:trueif the commit condition was satisfied within the timeout periodfalseif the timeout was reached before the condition was satisfied
- Throws:
IllegalArgumentException- iftimeoutInSecondsis negative, or iftokenCheckerisnull
-
waitForFlush
Asynchronously waits for all buffered data in this channel to be flushed to the Snowflake server side. This method triggers a flush of all pending data and waits for the flush operation to complete.- Parameters:
timeoutDuration- The maximum time to wait for the flush to complete. If null, the operation will wait indefinitely.- Returns:
- A
CompletableFuture<Boolean>that completes with:trueif the commit condition was satisfied within the timeout periodfalseif the timeout was reached before the condition was satisfied
- Throws:
IllegalArgumentException- iftimeoutDurationis negative
-
initiateFlush
void initiateFlush()Initiates a flush of all buffered data maintained for this Channel but does not wait for the flush to complete. Calls to appendRows are still allowed on the Channel after invoking this API.This method triggers an immediate flush of all currently buffered data in this specific channel, similar to the client-level
initiateFlush()but scoped to only this channel. The flush operation will occur asynchronously and this method returns immediately.Usage: This method is useful when you want to force immediate transmission of buffered data without waiting for automatic flush triggers (time-based or size-based). It provides fine-grained control over when data gets sent to Snowflake on a per-channel basis. However, calling initiateFlush at a high rate will lead to a drop in overall throughput, potential increase in costs, and could lead to higher incidence of throttling by the Snowflake Service.
-
isClosed
boolean isClosed()Check if the channel is closed- Returns:
- a boolean which indicates whether the channel is closed
-
getChannelName
String getChannelName()Get the channel name- Returns:
- the channel name
-
getDBName
String getDBName()Get the database name- Returns:
- name of the database
-
getSchemaName
String getSchemaName()Get the schema name- Returns:
- name of the schema
-
getPipeName
String getPipeName()Get the pipe name- Returns:
- name of the pipe
-
getFullyQualifiedPipeName
String getFullyQualifiedPipeName()Get the fully qualified pipe name that the channel belongs to- Returns:
- fully qualified pipe name, in the format of dbName.schemaName.pipeName
-
getFullyQualifiedChannelName
String getFullyQualifiedChannelName()Get the fully qualified channel name- Returns:
- fully qualified name of the channel, in the format of dbName.schemaName.pipeName.channelName
-