Grig

Databasus showed an example of how to use AI in large open source projects

Open source projects are increasingly adopting AI tools in their development workflows. But when your project handles sensitive data like database backups, encryption keys and production environments, the "move fast and break things" approach isn't an option. Databasus, a backup tool for PostgreSQL, MySQL and MongoDB, recently published a detailed explanation of how they use AI — and it's a masterclass in responsible AI adoption for security-critical projects.

(Image: AI usage in Databasus)

The AI transparency problem in open source

Many open source projects quietly integrate AI-generated code without disclosure. Some embrace "vibe coding" where AI writes entire features with minimal human verification. For projects handling cat photos or todo lists, this might be acceptable. But for tools managing production databases with sensitive customer data, the stakes are entirely different.

Databasus handles database credentials, backup encryption and access to production systems used by thousands of teams daily. A single security vulnerability could expose sensitive data across multiple organizations. This reality forced the maintainers to establish clear boundaries around AI usage and document them publicly.

How Databasus uses AI as a development assistant

The project treats AI as a verification and enhancement tool rather than a code generator. You can read their full AI usage policy in the project README or on the website FAQ. Here's their approach broken down into specific use cases:

| AI Usage Type | Purpose | Human Oversight |
|---|---|---|
| Code review | Verify quality and search for vulnerabilities | Developer reviews all findings |
| Documentation | Clean up comments and improve clarity | Developer approves all changes |
| Development assistance | Suggest approaches and alternatives | Developer writes final implementation |
| PR verification | Double-check after human review | Developer makes final approval decision |

This table shows that AI never operates autonomously in the development process. Every AI suggestion passes through developer verification before reaching the codebase. The maintainers explicitly state that they use AI for "assistance during development" but never for "writing entire code" or a "vibe code approach."

What Databasus doesn't use AI for

The project maintains strict boundaries around AI limitations. These restrictions protect code quality and security:

  • No autonomous code generation: AI doesn't write complete features or modules
  • No "vibe code": Quick AI-generated solutions without thorough understanding are rejected
  • No unverified suggestions: Every line of AI-suggested code requires human verification
  • No untested code: All code, regardless of origin, must have test coverage

This approach treats AI-generated code with the same skepticism as code from junior developers. Just because something compiles and appears to work doesn't mean it's production-ready. The project applies identical quality standards whether code comes from AI, external contributors or core maintainers.
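
To make the "no untested code" rule concrete, here is a minimal sketch of what shipping a change with tests looks like. The redactDSN helper and its test cases are hypothetical and written in Go for illustration; they are not taken from the Databasus codebase. The point is the pairing: no function lands, AI-suggested or not, without tests that pin down its behavior.

```go
// backup.go (hypothetical helper, not Databasus code)
package backup

import "strings"

// redactDSN hides the password in a connection string of the form
// scheme://user:password@host/db before it is written to logs.
func redactDSN(dsn string) string {
	scheme := strings.Index(dsn, "://")
	at := strings.Index(dsn, "@")
	if scheme == -1 || at == -1 || at < scheme {
		return dsn // nothing recognizable to redact
	}
	creds := dsn[scheme+3 : at]
	if sep := strings.Index(creds, ":"); sep != -1 {
		creds = creds[:sep] + ":*****"
	}
	return dsn[:scheme+3] + creds + dsn[at:]
}
```

```go
// backup_test.go — the coverage the policy demands for every change
package backup

import "testing"

func TestRedactDSN(t *testing.T) {
	cases := []struct{ name, in, want string }{
		{"password present", "postgres://app:s3cret@db:5432/prod", "postgres://app:*****@db:5432/prod"},
		{"no password", "postgres://app@db:5432/prod", "postgres://app@db:5432/prod"},
		{"not a DSN", "plain-string", "plain-string"},
	}
	for _, c := range cases {
		if got := redactDSN(c.in); got != c.want {
			t.Errorf("%s: redactDSN(%q) = %q, want %q", c.name, c.in, got, c.want)
		}
	}
}
```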

Quality gates that catch bad code regardless of source

Databasus enforces code quality through multiple automated and manual checks. The project doesn't differentiate between poorly written human code and AI-generated code — both get rejected if they don't meet standards.

| Quality Gate | Purpose | Applies To |
|---|---|---|
| Unit tests | Verify individual component behavior | All code changes |
| Integration tests | Test component interactions | All features |
| CI/CD pipeline | Automated testing and linting | Every commit |
| Human code review | Architecture and security verification | All pull requests |
| Security scanning | Identify vulnerabilities | Entire codebase |

These gates ensure that bad code gets caught early, whether it was written by AI or a human having an off day. The maintainers note they "do not differentiate between bad human code and AI vibe code" — both violate their quality standards and get rejected.
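
As a sketch of the integration-test gate, here is a round-trip test against an invented in-memory store standing in for a real database. Databasus's actual suite is not shown here; presumably it runs the same idea against live PostgreSQL, MySQL and MongoDB instances.

```go
package backup

import (
	"bytes"
	"testing"
)

// memStore is a toy stand-in for a real database, invented for this sketch.
type memStore struct{ data []byte }

func (s *memStore) Dump() []byte  { return append([]byte(nil), s.data...) }
func (s *memStore) Load(b []byte) { s.data = append([]byte(nil), b...) }

// TestBackupRestoreRoundTrip captures the shape of the integration gate:
// back up, simulate total data loss, restore, and verify byte equality.
func TestBackupRestoreRoundTrip(t *testing.T) {
	src := &memStore{data: []byte("orders;users;invoices")}

	snapshot := src.Dump() // the "backup"
	src.Load(nil)          // simulate data loss
	src.Load(snapshot)     // the "restore"

	if !bytes.Equal(src.data, []byte("orders;users;invoices")) {
		t.Fatalf("restore mismatch: got %q", src.data)
	}
}
```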

The role of experienced developer oversight

Test coverage and automated checks catch many issues, but they can't replace human judgment. Databasus emphasizes that their codebase undergoes "verification by experienced developers with experience in large and secure projects."

This human oversight layer evaluates:

  • Architecture decisions and long-term maintainability
  • Security implications that automated tools might miss
  • Edge cases and failure scenarios
  • Code readability and documentation quality
  • Performance characteristics under production load

The project's fast issue resolution and security vulnerability response times demonstrate that this oversight works in practice, not just in theory.

Why this approach matters for sensitive projects

Database backup tools sit at a critical point in infrastructure security. They have read access to production databases, handle encryption keys and store backups containing sensitive information. A compromised backup tool could expose an organization's entire data estate.

This security context explains why Databasus takes a conservative approach to AI adoption:

  • Trust is earned: The tool runs in production environments with access to sensitive data
  • Failures have consequences: Backup failures or security breaches affect real businesses
  • Compliance requirements: Many users operate under strict regulatory frameworks
  • Enterprise adoption: Large organizations scrutinize security practices before deployment

The project's Apache 2.0 license allows anyone to inspect the code, but inspection only helps if the code itself is maintainable and well-tested. AI-generated code without proper verification would undermine this transparency.

Practical lessons for other projects

Databasus provides a template for other security-conscious open source projects considering AI adoption. Here are the key principles they demonstrate:

  1. Document AI usage publicly: Don't leave users guessing whether AI touched security-critical code (see the sample wording after this list)
  2. Establish clear boundaries: Define what AI can and cannot do in your development process
  3. Maintain quality standards: Apply the same requirements to all code regardless of origin
  4. Require human verification: AI suggestions should inform decisions, not make them
  5. Invest in automated testing: Let computers catch bugs so humans can focus on architecture
  6. Prioritize maintainability: Today's AI shortcut becomes tomorrow's technical debt

These principles balance AI's productivity benefits with the safety requirements of production systems.
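
For the first principle, the policy does not need to be elaborate to be useful. A short, explicit section in the README is enough; here is sample wording (my own invention, not Databasus's actual text) that other projects could adapt:

```text
## AI usage policy

AI tools MAY be used to: review diffs, suggest alternative approaches,
and improve comments and documentation.

AI tools MAY NOT be used to: author complete features, approve or merge
changes, or bypass human review.

All code, however it was produced, requires human review and test coverage.
```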

The middle path between AI rejection and AI dependency

Some projects reject AI entirely, fearing code quality degradation. Others embrace AI without guardrails, shipping whatever the model generates. Databasus demonstrates a middle path that captures AI benefits while maintaining quality standards.

This approach acknowledges that AI tools are increasingly powerful and can genuinely improve developer productivity when used properly. But it also recognizes that AI models lack the context, judgment and accountability that humans bring to security-critical code.

| Approach | Code Quality | Development Speed | Security Risk |
|---|---|---|---|
| No AI usage | Varies by developer | Slower | Depends on practices |
| Uncontrolled AI | Often poor | Fast initially | High |
| Databasus approach | High (enforced) | Moderate | Low (verified) |

The Databasus approach trades some development speed for higher quality and lower risk. For projects handling sensitive data in production environments, this tradeoff makes sense.

What the open source community can learn

Databasus set an example by publishing their AI usage policy in the main README where every user and contributor can see it. This transparency serves multiple purposes:

  • User confidence: People deploying the tool know that AI hasn't autonomously generated security-critical code
  • Contributor guidance: New contributors understand the quality expectations before submitting code
  • Industry leadership: Other projects can reference this policy when establishing their own AI guidelines
  • Accountability: Public documentation creates pressure to follow stated practices

The project even explicitly addresses questions about AI usage that arose in issues and discussions, showing they take community concerns seriously.

Conclusion

AI tools are transforming software development, and open source projects must decide how to integrate them responsibly. Databasus demonstrates that you can use AI to enhance developer productivity without compromising code quality or security.

The key insight is treating AI as a junior developer who needs supervision, not as an omniscient coding oracle. This means human verification of all AI suggestions, comprehensive test coverage and consistent quality standards applied to all code regardless of origin.

For projects handling sensitive data in production environments, this careful approach isn't optional — it's a requirement. Databasus showed that responsible AI adoption means documenting your practices publicly, maintaining strict quality gates and never letting AI operate autonomously on security-critical code.

Other open source projects, especially those dealing with security, infrastructure or sensitive data, should study this example. The middle path between AI rejection and AI dependency exists, and it's the right choice for production-grade open source software.
