Natural Language Instruction Optimization Strategies
Based on hands-on testing, the following methods improve the accuracy of SQL generation:
- Structured expression:
Use the "[action] + [object] + [condition] + [qualification]" template, for example "Summarize (action) orders (object) in Q4 2023 (condition), top 5 provinces by sales (qualification)".
- Avoid ambiguous terms:
Replace the vague "new users" with "the last 50 registered users" so that the tool generates ORDER BY register_time DESC LIMIT 50 and returns the most recently registered users.
- Specify fields explicitly:
When specific fields are required, state "query the user's id, name, and last_login time" rather than simply "look up user information".
- Break complex queries into steps:
For multi-table join queries, first understand the structure by asking the tool to "display the join fields of the orders and users tables", and then build the complete query.
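To make the strategies above concrete, here is a minimal Python sketch. It is not DbRheo-CLI's implementation: build_instruction and newest_users are hypothetical helpers, and the users table schema is an assumption built from the column names mentioned above (id, name, last_login, register_time). The sketch shows how the four template slots combine into one instruction, and which SQL an unambiguous, field-explicit request should correspond to.

```python
import sqlite3


def build_instruction(action, obj, condition, qualification):
    """Join the four template slots into one instruction string (hypothetical helper)."""
    return f"{action} {obj} {condition}, {qualification}"


def newest_users(conn, limit=50):
    """Return id, name, last_login for the most recently registered users."""
    # Explicit columns instead of SELECT *, and an explicit ordering plus limit
    # instead of an undefined notion of "new users".
    sql = (
        "SELECT id, name, last_login FROM users "
        "ORDER BY register_time DESC LIMIT ?"
    )
    return conn.execute(sql, (limit,)).fetchall()


def demo():
    """Build an in-memory table with an assumed schema and query it."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, "
        "register_time TEXT, last_login TEXT)"
    )
    rows = [
        (i, f"user{i}", f"2023-01-{i:02d}", f"2023-02-{i:02d}")
        for i in range(1, 11)
    ]
    conn.executemany("INSERT INTO users VALUES (?, ?, ?, ?)", rows)
    return newest_users(conn, limit=3)


if __name__ == "__main__":
    print(build_instruction(
        "Summarize", "orders", "in Q4 2023", "top 5 provinces by sales"
    ))
    print(demo())
```

Comparing the SQL the tool actually generates against a hand-written query like the one in newest_users is a quick way to check whether an instruction was specific enough.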
When the results do not meet expectations, use the /model command to switch to a different language model (e.g., from Gemini to GPT-4), since models differ in how they interpret natural language.
This answer comes from the article "DbRheo-CLI: Command-line tool for manipulating databases and analyzing data using natural language".































