Purpose of the Article: As it is difficult to create a table like structure using json and xml schemes in snowflake, we made it easy using the inbuilt snowflake function created a template code which can be used to create table structure

Intended Audience: Snowflake Team

Tools and Technology: Snowflake

Keywords: Snowflake, database, table, jsonxml,schemas

Main Objective of the blog:-

The main objective of the blog is to convert a xml,json scheme file to tables in Snowflake database.

What is a JSON file?

JSON which is officially termed JavaScript object notation is the format for storing and exchanging data. The JSON file can also be a structure of the data or else a structure with data. Every JSON file has its JSON extension on its end.

Coming data inside a JSON file is created in the key: value format.

What is an XML file?

XML stands for EXTENSIBLE MARKUP LANGUAGE. Every XML file comes with a .xml extension at the end of the file name. These are also used to store data or just a structure of the table.

Why Snowflake?

We are creating tables in snowflake using JSON and XML files. snowflake is a data warehousing system with an automatic scaling process to reduce cost and increase performance.

WHY SNOWFLAKE?

We Use Snowflakes as a target here.

Snowflake is a data warehousing system. Which gets automatically scaled, both up and down, to get the right balance of performance vs. cost

Architecture

Creating a table in snowflake using json scheme

Step: -1

1.Creating a table using json

There are several methods to create a table using json scheme. One of the methods I am using here is, dumping the scheme as a single column into a table as a variant.

Step: -2

Creating a table structure with a single column

Step: -3

Inserting the json file into that column

Step: -4

Writing a code on top of that column to create a table with data

select

json_column:firstName::string as firstName,

json_column:lastName::string as lastName,

json_column:gender::string as gender,

json_column:age::int as age,

json_column:address.city::string as city,

json_column:address.state::string as state,

json_column:address.streetAddress::string as streetAddress,

f.value:number::string as number,

f.value:type::string as type

from xyz_table, table(flatten(json_column:phoneNumbers)) f;

Step: -5

As it is a select statement, we can see the data as a result.
we can also create a table for that data

Step: -6

The table got created and data loaded successfully

LOADING XML_DATA INTO SNOWFLAKE:

Step: -1

Creating a xmltable

Step: -2

Inserting the xml data into xml table

Step: -3

Loading the data into Snowflake

with cte as (

select XMLGET(value, ‘Address’ ) as “Address” ,

XMLGET(value, ‘FirstName’ ):”$” as “first name”,

XMLGET(value, ‘LastName’ ):”$” as “LastName”,

XMLGET(value, ‘ContactNo’ ):”$” as “ContactNo”,

XMLGET(value, ‘Email’ ):”$” as “Email”

FROM “SNOW_BLOG”.”PUBLIC”.”TREASURY_AUCTION_XML”,

LATERAL FLATTEN(to_array(treasury_auction_xml.src_xml:”$” ))auction_announcement)

select XMLGET(“Address”, ‘City’ ,0 ):”$” as “City”,

XMLGET(“Address”, ‘State’ ,0 ):”$” as “state”,

XMLGET(“Address”, ‘Zip’ ,0 ):”$” as “Zipcode”,”first name”,”LastName”,”ContactNo”,”Email”

–[except “Address”]

from cte;

  • XMLGET()
    XMLGet is used to extract entire entity nodes out of an XML document in a variant column. To refer to a specific XML node in SQL, you must first get a reference to it with XMLGet.
  • GET()
    Get is used to pull specific data out of a node reference that you established with XMLGet. A good analogy is reading your mail: XMLGet pulls a single envelope out of your mailbox, and Get opens that letter and reads it.  (Note: I’ll be showing you a shorthand notation that combines XMLGet and Get into a single call…)
  • FLATTEN()
    Flatten is a table function that takes a VARIANT, OBJECT, or ARRAY column and makes it function as if it were a table. Once you FLATTEN() a node that you’ve extracted with XMLGet, you can join it in SQL just like it was a regular table.

These three functions form the basis for working with XML in a variant column in Snowflake.

Author Bio:

Picture of Bhanu Prasad Alluvada

Bhanu Prasad Alluvada

Associate Software Engineer, Data Engineering - Analytics

I am A Bhanu Prasad. I have been working with MOURI Tech as an associate software engineer in Data Engineering Analytics Team for the past 1year and 9 months. My working tool is GCP. I love to explore my capabilities, and as my interest in python and snowflake is very keen, I wrote this blog combining both.

Purpose to Contact :
Purpose to Contact :
Purpose to Contact :

Purpose to Contact :
Purpose to Contact :
Purpose to Contact :

Purpose to Contact :