Skip to main content

SQLDatabaseChain

The SQLDatabaseChain allows answering natural language questions by querying a SQL database. It converts the question to SQL, runs against the database, and returns the result.

Installation and Setup

To use the chain, first install LangChain and SQLAlchemy:

pip install langchain sqlalchemy

Then create an SQLAlchemy engine and initialize the SQLDatabase utility:

from sqlalchemy import create_engine  
from langchain.utilities import SQLDatabase

engine = create_engine('sqlite:///mydatabase.db')
db = SQLDatabase(engine)

Basic Usage

Pass the SQLDatabase and an LLM to initialize the chain:

from langchain import OpenAI
from langchain.chains.sql_database import SQLDatabaseChain

llm = OpenAI()
chain = SQLDatabaseChain(llm, db)

result = chain("How many users are in the database?")
print(result)

The chain handles the full loop - translating the question to SQL, querying the database, and returning the result.

Customizing the Prompt

The prompt can be customized to provide the LLM more context on the database schema:

prompt = PromptTemplate(
input_variables=["question", "table_info"],
template="Answer the {question} using the database schema below: {table_info}"
)

table_info = "\n".join([table.ddl for table in db.metadata.tables.values()])

chain = SQLDatabaseChain(llm, db, prompt=prompt, table_info=table_info)

Table info and sample rows can also be configured in SQLDatabase.

Advanced Usage

For more control, return_intermediate_steps returns the generated SQL and query result:

chain = SQLDatabaseChain(llm, db, return_intermediate_steps=True)
result = chain("How many users are in the database?")
print(result.intermediate_steps)

The use_query_checker option attempts to fix invalid SQL before executing:

chain = SQLDatabaseChain(llm, db, use_query_checker=True) 

Conclusion

The SQLDatabaseChain provides a simple way to leverage large language models to query databases in natural language. With the ability to customize the prompt and output, it can be adapted to many use cases.