Spotlight on Babelfish for Aurora PostgreSQL

04 February,2022 by Rambler

-Translation Service between SQL Server & Aurora PostgreSQL, more specifically : Amazon Aurora PostgreSQL-Compatible Edition that enables Aurora to understand commands from applications written for Microsoft SQL Server.

-What is Bablefish all about ?  It is all about Commercial migration . Normally you would look at Schema Conversion Tool (SCT) to figure out how to convert the schema , use DMS to migrate the databases and then start the hard part of changing the code. What Babelfish does is not only add compatibility with T-SQL but also adds the SQL Wire Protocol . The Babelfish project is  not trying to replicate SQL Server but try to view it as an acceleration mechanism to get you onto PostgreSQL

i.e you can use your SQL Server app to talk to Postgres via the TDS protocol . Exploiting the Postgres extension mechanism . It Postgres but Combing t-sql and driver language

You can still do the SCT\DMS combination but that would require reworking upfront

You now have an alternative which supports a partial migration - i.e bring over things as if they were SQL Server, and now you can finish the migration at a more leisurely pace. 

Opensource available on Github . In theory - the community will add t-sql features & AWS will pull back into Aurora 

 

Babelfish

      

-In the words of the Amazon AWS press release Babelfish “adds an endpoint to PostgreSQL that understands the SQL Server wire protocol Tabular Data Stream (TDS), as well as commonly used T-SQL commands used by SQL Server. Support for T-SQL includes elements such as the SQL dialect, cursors, catalog views, data types, triggers, stored procedures, and functions”

In technical terms, Babelfish provides native, semantically correct execution of T-SQL over the TDS wire protocol, by default on port 1433, plus native SQL Server data types and runtime library support. 

It achieves this by the using protocol hooks. A hook is defined as a function pointer that by default is set to NULL. When the hook's pointer  is set to a non-NULL value , the hook invokes different server behaviour , without having to make changes to the core PostgreSQL community code.

-Use cases: Migrating SQL Server to Aurora Postgres , Babelfish will read\understand T-SQL 

- The strategic future direction is interesting , AWS have open sourced Babel but will the PostgreSQL community integrate Babelfish as part of the development core or will it remain as a AWS fork?

-Still rely on a conversion of schema and data to Postgres

 

SetUp Consideration 

-Minimum version  Aurora PostgreSQL (Compatible with PostgreSQL 13.4) 

-Single -Mode versus Multiple - mode ?   This will impact how many user database can co-exist on a single Babelfish Instance . If you single-mode , you will be able to have a single user database , if you attempt to create more than 1 database , the following message will appear 

Only one user database allowed under single-db mode. User database "xxxxxxx" already exists

The other impact is that the  SQL Server schema names remain the same in the babelfish db .    In comparison , if you were to choose multiple mode , although you will see the schema names through SQL Server as the same , they will be referenced as dbname_schemaname  when viewed through PostgreSQL

Differences between Aurora PostgreSQL with Babelfish and SQL Server

https://docs.aws.amazon.com/AmazonRDS/latest/AuroraUserGuide/babelfish-compatibility.html

Creating an Aurora PostgreSQL cluster with Babelfish

https://docs.aws.amazon.com/AmazonRDS/latest/AuroraUserGuide/babelfish-create.html

Connecting to a DB cluster with Babelfish turned on

https://docs.aws.amazon.com/AmazonRDS/latest/AuroraUserGuide/babelfish-connect.html

Configuring a babelfish client

https://docs.aws.amazon.com/AmazonRDS/latest/AuroraUserGuide/babelfish-connect-configure.html

When the Aurora PostgreSQL instance is set up with babelfish setup , proceed with a checkout:

1) Checkout 1 : Connect via a PostgreSQL endpoint . I use pgAdmin to connect , 

postgres=> SELECT version();

 

2)Checkpoint 2 : Connect via the SQL Server endpoint , by default 1433.  Although most security best practices recommend changing from 1433 to another port 

Example :

sqlcmd -S mybabelfish.cxxxxxxx.us-xxxxxxxx.amazonaws.com,myPortNumber -U sa -P myPassword

1> select @@version;

Babelfish for Aurora Postgres with SQL Server Compatibility - 12.0.2000.8
Oct 13 2021 17:34:47
Copyright (c) Amazon Web Services
PostgreSQL 13.4 on aarch64-unknown-linux-gnu

If you can't connect - check the error message 

Sqlcmd: Error: Microsoft ODBC Driver 13 for SQL Server : A network-related or instance-specific error has occurred while establishing a connection to SQL Server. Server is not found or not accessible. Check if instance name is correct and if SQL Server is configured to allow remote connections. For more information see SQL Server Books Online..

 

A common problem is the access rules enforced by your local firewall and the IP addresses authorized to access your DB instance might not match. The problem is most likely the inbound rules in your security group.   By default - the Babelfish is set up using the default Parameter groups , specifying the babelfishpg_tds.port value as 1433.   If your firewall rules allow 1433 , than the problem is related to another issue - but it is typical that organisations will block 1433 as  part of any regular security lockdown. 

When AWS RDS process builds an RDS instance it extracts the default values and creates a custom Parameter group titled similar to custom-aurora-postgresql13-babelfish-compat-xx. Within the Parameter group  look for the parameter babelfishpg_tds.port and edit the value to the port number configured for the firewall 

Configuring a database for Babelfish

https://docs.aws.amazon.com/AmazonRDS/latest/AuroraUserGuide/babelfish-configuration.html

 

Migrate from SQL Server to Amazon Aurora using Babelfish

https://aws.amazon.com/blogs/database/migrate-from-sql-server-to-amazon-aurora-using-babelfish/

 

The overall objective is  to connect to an  Amazon Aurora PostgreSQL cluster using a postgreSQL client app and execute queries again a postgres DB instance.  Then , connect to the same cluster using the Babelfish capability of Amazon aurora with PostgreSQL-compatibility edition and execute cute  queries (DDL & DML). 

In this quick demo I'll show how  a script  currently running against a  SQL Server can also run directly on Amazon Aurora with PostgreSQL-Compatibility edition with significantly less work required, compared to a standard migration. Babelfish is certified with the SQL Server wire-protocol (TDS) and T-SQL,  so no need to switch database drivers or re-write all of your application queries. Turning Babelfish on supports  SQL Server-based applications to point to the Babelfish TDS port on an Aurora PostgreSQL cluster 

In pgAdmin - connected to the babelfish instance via the postgreSQL port ,  list all the available databases 

SELECT datname FROM pg_database WHERE datistemplate = false;

1.rdsadmin
2.postgres
3.babelfish_db

In sqlcmd   , connected via the TDS port list out the available databasaes

1> select name from sys.databases

master
tempdb

In sqlcmd , create a new user database & then create some objects such as a table

1> CREATE DATABASE myTestDB

1> use myTestDB
2> CREATE TABLE Countries (ID INT , country_name VARCHAR(100));
3> go
Changed database context to 'mytestdb'.

In sqlcmd , add some data to the table

insert into countries select 1, 'Tunisia';

insert into countries select 1, 'Mexico';

 

These objects are maintained on the babelfish_db   . This means it also possible to this metadata through the postgresql port , referencing the babelfish_db

use babelfish_db 

select * from sys.databases

master

tempdb

mytempdb

 

--Using a script - in this example I'll use Powershell ,but use any scripting language where you can connect to the db server. Python is another I use

The aim is to connect via the TDS i.e SQL Server 

-----------------------------START POWERSHELL SCRIPT-----------------------------------------------

$svr = "REPLACE_WITH_SERVER_NAME"
$db = "mytestdb"
$un = "REPLACE_WITH_USER_NAME"
$pw = "REPLACE_WITH_PASSWORD"

#datatable to store db results
$dt = new-object "System.Data.DataTable"

$cn = new-object System.Data.SqlClient.SqlConnection "Server=$svr;Database=$db;User ID=$un;Password=$pw"
$cn.Open()
$sql = $cn.CreateCommand()
$sql.CommandText = "select * from countries"
$rdr = $sql.ExecuteReader()
$dt.Load($rdr)
$cn.Close()

#display results
$dt| select * -ExcludeProperty RowError, RowState, HasErrors, Name, Table, ItemArray

-----------------------------FINISH POWERSHELL SCRIPT-----------------------------------------------


id country_name
-- ------------
1 Tunisia
2 Mexico

 

 


Author: Rambler (http://www.dba-ninja.com)


Share:

Verify your Comment

Previewing your Comment

This is only a preview. Your comment has not yet been posted.

Working...
Your comment could not be posted. Error type:
Your comment has been posted. Post another comment

The letters and numbers you entered did not match the image. Please try again.

As a final step before posting your comment, enter the letters and numbers you see in the image below. This prevents automated programs from posting comments.

Having trouble reading this image? View an alternate.

Working...

Post a comment on Spotlight on Babelfish for Aurora PostgreSQL


dba-ninja.com