Accessing Hundreds of RDS Databases with AD Credentials Using Python and JDBC

Matthew Keeley
August 5, 2023

Introduction

During my recent engagement, I faced an interesting challenge of accessing hundreds (956 to be exact) of RDS databases at scale. The task was further complicated by the fact that I could only use Active Directory credentials to access these databases since they were attached to a network domain. Additionally, I had a hard time finding any concise guides on using Python to access MSSQL RDS databases online with JDBC. Given the complexity of the task and the lack of existing resources, I decided to document my solution in this blog post in hopes of aiding others who may find themselves in a similar predicament.

Example RDS database on network domain corp.prodefense.io

The Challenge

First, let’s set up the challenge at hand: we have hundreds of RDS databases with varying locations and names, and we need to access all of them using our corporate Active Directory credentials. To keep it platform-independent, we’ll use JDBC, a database access protocol that allows us to connect to a database in a platform-independent manner.

The Code

The Python library jaydebeapi is an interface to Java JDBC that allows Python programs to use JDBC to connect to databases. Using this, we can connect to each of our databases in a scalable manner. Here is the script that helped me get through this situation:

import jaydebeapi, os, ast
from dotenv import load_dotenv

# SQL Server credentials
load_dotenv()
username = os.getenv("AD_USERNAME")
password = os.getenv("AD_PASSWORD")
jar = os.getenv("JDBC_JAR_LOCATION")

if username is None or password is None or jar is None:
    print('Missing .env information')

with open('databases.txt', 'r') as f:
# database file contains databases like so: rds_url|db_name
    for line in f.readlines():
        if not line.startswith("#"):
            # Parse the databases.txt file
            db_info = line.split('|')
            database_location = db_info[0]
            database_name = db_info[1].strip('\n')
            # (rest of the code continues)

The snippet above is primarily used for the setup. In the .env file, the user should store their active directory credentials and the file path of the JDBC jar which can be found here.

Now, let’s see how we can connect to these databases and execute a query:

# Connect to the server
try:
    print(f'Connecting to server {database_location}...')
    connection_string = f'jdbc:sqlserver://;serverName={database_location};databaseName={database_name};authenticationScheme=NTLM;integratedSecurity=true;'
    conn = jaydebeapi.connect("com.microsoft.sqlserver.jdbc.SQLServerDriver", connection_string, [username, password], jar)
    curs = conn.cursor()

    try:
        # Execute a query
        query = "SELECT @@VERSION;"
        print(f"Sending query '{query}' to database: {database_name}...")
        curs.execute(query)

        # Print the SQL results
        rows = curs.fetchall()
        response = "\n".join(str(row) for row in rows)
        evaluated_response = ast.literal_eval(f'"{response}"')
        print(evaluated_response)

    except jaydebeapi.Error as e:
        print(f"Error occurred while executing the query: {str(e)}")

    finally:
        # Close the cursor
        curs.close()
        # Close the connection
        conn.close()

except Exception as ex:
    print(f"Cannot connect to server {database_location}...")

Since my laptop wasn't connected to the network domain, I couldn't use windows authentication to connect to the RDS databases. Instead I ended up using NTLM authentication (line 4) which worked perfectly.

Error handling is also in place for any issues that may occur while executing the query or connecting to the server. Since we are using a JAR file for connecting to the databases, the JAR can throw its own errors which need to be caught separately.

The rest of the code initiates a connection to each database server and sends a simple SELECT @@VERSION; query to test the connection. Any SQL results are fetched and printed out.

Conclusion

In conclusion, the lack of online resources does not deter a determined programmer. With a combination of Python, JDBC, and a well-structured approach, I was able to effectively access hundreds of RDS databases at scale using Active Directory credentials. I hope this solution serves as a useful guide for anyone facing a similar situation or simply trying to understand the interaction between Python, JDBC, and MS SQL RDS databases. Remember, every problem is just an opportunity for a creative solution!

Subscribe to our blog

Thank you! Your submission has been received!
Oops! Something went wrong while submitting the form.
Matthew Keeley
February 14, 2023

Check out our other posts