Overview of Snowflake Data Loading

Supported File Locations:

Snowflake refers to the location of data files in cloud storage as a stage. The COPY INTO <table> command, which is used for both bulk and continuous data loads (i.e. Snowpipe), supports cloud storage accounts managed by your business entity (i.e. external stages) as well as cloud storage contained in your Snowflake account (i.e. internal stages).

External Stages: Loading data from any of the following cloud storage services is supported regardless of the cloud platform that hosts your Snowflake account:

Amazon S3
Google Cloud Storage
Microsoft Azure

Internal Stages: Snowflake maintains the following stage types in your account:

User: A user stage is allocated to each user for storing files. This stage type is designed to store files that are staged and managed by a single user but can be loaded into multiple tables. User stages cannot be altered or dropped.

Table: A table stage is available for each table created in Snowflake. This stage type is designed to store files that are staged and managed by one or more users but only loaded into a single table. Table stages cannot be altered or dropped. Note that a table stage is not a separate database object; rather, it is an implicit stage tied to the table itself. A table stage has no grantable privileges of its own. To stage files to a table stage, list the files, query them on the stage, or drop them, you must be the table owner (have the role with the OWNERSHIP privilege on the table).

Named: A named internal stage is a database object created in a schema. This stage type can store files that are staged and managed by one or more users and loaded into one or more tables. Because named stages are database objects, the ability to create, modify, use, or drop them can be controlled using security access control privileges. Create stages using the CREATE STAGE command.
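As a small illustration of the three internal stage types, the sketch below maps each type to the reference Snowflake SQL uses for it: @~ for the user stage, @%<table> for a table stage, and @<name> for a named stage. The helper names and the orders and my_stage identifiers are hypothetical and only build statement text; nothing here connects to Snowflake.

```python
def stage_reference(kind: str, name: str = "") -> str:
    """Return the @-reference used in Snowflake SQL for an internal stage.

    kind is one of "user", "table", or "named". name is the table name
    (for a table stage) or the stage name (for a named stage).
    """
    if kind == "user":
        # Each user has exactly one user stage, so no name is needed.
        return "@~"
    if kind in ("table", "named") and not name:
        raise ValueError(f"a {kind} stage requires a name")
    if kind == "table":
        # A table stage is implicit and shares the table's name.
        return f"@%{name}"
    if kind == "named":
        # A named stage is a schema-level object made with CREATE STAGE.
        return f"@{name}"
    raise ValueError(f"unknown stage kind: {kind!r}")


def list_statement(kind: str, name: str = "") -> str:
    """Build a LIST statement that shows the files held in the stage."""
    return f"LIST {stage_reference(kind, name)};"


# Hypothetical identifiers, for illustration only.
for kind, name in [("user", ""), ("table", "orders"), ("named", "my_stage")]:
    print(list_statement(kind, name))
```

Only the named stage needs a CREATE STAGE statement beforehand; the user and table references resolve to stages that already exist.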

Bulk vs Continuous Loading:

Bulk Loading Using the COPY Command: This option enables loading batches of data from files already available in cloud storage, or copying (i.e. staging) data files from a local machine to an internal (i.e. Snowflake) cloud storage location before loading the data into tables using the COPY command.
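The two-step bulk flow described above (stage a local file, then load it) can be sketched as a pair of statements: a PUT to upload the file and a COPY INTO to load it. This is a minimal sketch that assumes a CSV file and the target table's own table stage; the function name, the file path, and the orders table are hypothetical, and the code only builds the SQL text rather than running it.

```python
def bulk_load_statements(local_path: str, table: str, file_type: str = "CSV") -> list:
    """Build the PUT and COPY INTO statements for a simple bulk load.

    Assumes local_path is an absolute path on a Linux or macOS client and
    that the file is staged to the table stage of the target table.
    """
    stage = f"@%{table}"  # table stage of the target table
    put = f"PUT file://{local_path} {stage};"
    copy = f"COPY INTO {table} FROM {stage} FILE_FORMAT = (TYPE = {file_type});"
    return [put, copy]


# Hypothetical path and table name, for illustration only.
for statement in bulk_load_statements("/tmp/data/orders.csv", "orders"):
    print(statement)
```

When the files are already in an external stage, the PUT step is not needed and only the COPY INTO statement applies.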

Continuous Loading Using Snowpipe: This option is designed to load small volumes of data (i.e. micro-batches) and incrementally make them available for analysis. Snowpipe loads data within minutes after files are added to a stage and submitted for ingestion, so users can query the latest results soon after the raw data becomes available.
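A pipe is essentially a stored COPY INTO statement that Snowpipe runs as new files arrive. The sketch below builds a CREATE PIPE statement under the assumption that the files land in a named external stage with cloud event notifications already configured, which is what AUTO_INGEST = TRUE relies on. The pipe, table, and stage names are hypothetical, and the code only produces the SQL text.

```python
def create_pipe_statement(pipe: str, table: str, stage: str,
                          file_type: str = "JSON",
                          auto_ingest: bool = True) -> str:
    """Build a CREATE PIPE statement that wraps a COPY INTO statement.

    Assumes stage is a named stage; with auto_ingest=True the stage is
    assumed to be external with event notifications set up.
    """
    flag = "TRUE" if auto_ingest else "FALSE"
    return (
        f"CREATE PIPE {pipe} AUTO_INGEST = {flag} AS "
        f"COPY INTO {table} FROM @{stage} "
        f"FILE_FORMAT = (TYPE = {file_type});"
    )


# Hypothetical object names, for illustration only.
print(create_pipe_statement("events_pipe", "raw_events", "events_stage"))
```

With auto_ingest=False, files are instead submitted for ingestion by calling the Snowpipe REST API.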

Loading Data from Apache Kafka Topics:

The Snowflake Connector for Kafka enables users to connect to an Apache Kafka server, read data from one or more topics, and load that data into Snowflake tables.
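To make the connector setup more concrete, the sketch below assembles a Kafka Connect configuration for the Snowflake sink connector as a Python dictionary and prints it as JSON. The property keys follow the connector's documented naming as far as the author of this example recalls, so treat them as assumptions to verify against the connector documentation; every value (connector name, topics, account URL, user, database, schema) is a hypothetical placeholder.

```python
import json


def kafka_connector_config(name: str, topics: list, account_url: str,
                           user: str, database: str, schema: str) -> dict:
    """Build a Kafka Connect configuration for the Snowflake sink connector.

    The private key is left as a placeholder; in practice it is supplied
    through a secrets mechanism rather than written into the file.
    """
    return {
        "name": name,
        "config": {
            "connector.class": "com.snowflake.kafka.connector.SnowflakeSinkConnector",
            "topics": ",".join(topics),  # one or more topics to read
            "snowflake.url.name": account_url,
            "snowflake.user.name": user,
            "snowflake.private.key": "<private key placeholder>",
            "snowflake.database.name": database,
            "snowflake.schema.name": schema,
        },
    }


# Hypothetical values, for illustration only.
config = kafka_connector_config(
    name="orders_sink",
    topics=["orders", "payments"],
    account_url="myaccount.snowflakecomputing.com:443",
    user="kafka_loader",
    database="raw",
    schema="kafka",
)
print(json.dumps(config, indent=2))
```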

Alternatives to Loading Data:

It is not always necessary to load data into Snowflake before executing queries.

External Tables (Data Lake)

External tables enable querying existing data stored in external cloud storage for analysis without first loading it into Snowflake. The source of truth for the data remains in the external cloud storage. Data sets materialized in Snowflake via materialized views are read-only.

This solution is especially beneficial to accounts that have a large amount of data stored in external cloud storage and only want to query a portion of the data; for example, the most recent data. Users can create materialized views on subsets of this data for improved query performance.
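The pattern described above, an external table over files in cloud storage plus a materialized view over a recent subset, can be sketched as two statements. This is a minimal sketch that assumes Parquet files in a named external stage and two hypothetical fields, event_id and event_date, read from the external table's VALUE column. A fixed cutoff date is used on the assumption that a materialized view definition cannot call functions such as CURRENT_DATE. All object names are hypothetical, and the code only builds the SQL text.

```python
def external_table_statements(ext_table: str, stage_path: str,
                              view: str, cutoff_date: str) -> list:
    """Build DDL for an external table and a materialized view on a subset.

    stage_path is a named external stage plus an optional path, for
    example "lake_stage/events/". cutoff_date is an ISO date string.
    """
    create_ext = (
        f"CREATE EXTERNAL TABLE {ext_table} "
        f"WITH LOCATION = @{stage_path} "
        "FILE_FORMAT = (TYPE = PARQUET);"
    )
    # Field names below are hypothetical; they are extracted from the
    # VALUE column that an external table exposes for each row.
    create_mv = (
        f"CREATE MATERIALIZED VIEW {view} AS "
        "SELECT value:event_id::STRING AS event_id, "
        "value:event_date::DATE AS event_date "
        f"FROM {ext_table} "
        f"WHERE value:event_date::DATE >= '{cutoff_date}';"
    )
    return [create_ext, create_mv]


# Hypothetical object names and date, for illustration only.
for statement in external_table_statements(
        "ext_events", "lake_stage/events/", "recent_events", "2024-01-01"):
    print(statement)
```

Queries against the materialized view read data held in Snowflake, while the external table continues to point at the files in cloud storage.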
