Agentic AI for Automated Application Security and Vulnerability Management
Explore how agentic AI enables several agents to jointly detect, fix, and verify security vulnerabilities in your codebase.
It was not so long ago that I was taking a much closer look at how AI is becoming embedded in our everyday developer work. I have watched smarter code suggestions, automated testing routines, and those ubiquitous chatbots become a normal part of the toolkit. They are useful, naturally, but at their core they are still fundamentally reactive: they wait for you to ask before taking action.
What has really blown me away recently is the emergence of agentic AI. It is completely different: it does not sit around waiting to be told what to do; it takes charge itself. It can decide what should be done, plan the steps, and adjust if something unexpected happens. It is like having a colleague who is always planning ahead, as opposed to merely carrying out orders.
Curious about the real-world impact of agentic AI, I set up a simple experiment: I showed an AI agent some code with a well-known security vulnerability. Instead of just flagging the issue, the AI took action. It refactored the code, added error handling, and even left a comment with further steps. No step-by-step instructions were necessary; the AI simply understood what needed to be done and did it on its own.
That is the real promise of agentic AI. It is not another tool you have to monitor, but rather an active partner that knows the bigger picture and covers for what you might miss. In this article, I will go beyond theory and take agentic AI for a spin in a real use case: auto-detecting and fixing security vulnerabilities in your codebase, so issues like SQL injection do not fall through the cracks.
An Experiment to Test Agentic AI's Capabilities
Below are the steps I took to see if agentic AI would identify the security vulnerability in the code.
Here is the sample code with a SQL Injection vulnerability:
def login(username, password):
    query = f"SELECT * FROM users WHERE username='{username}' AND password='{password}'"
    return db.execute(query)
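For reference, the remediation an agent typically converges on is a parameterized query. Here is a minimal sketch of a fixed version, assuming a DB-API-style connection object named db (the placeholder syntax varies by driver):
def login(username, password):
    # User input is passed as data, never interpolated into the SQL string,
    # so it cannot change the structure of the query.
    query = "SELECT * FROM users WHERE username=%s AND password=%s"
    return db.execute(query, (username, password))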
Set Up Your AI Teammate:
- Ensure Python 3.12 or higher is installed.
- Download and install Ollama, which lets you run models locally.
- Pull the codellama model (ollama pull codellama), a model that understands code.
- Install pyautogen and the other dependencies:
pip install pyautogen groq python-dotenv
- Set up GroqCloud: sign up at the GroqCloud Console and create an API key under "API Keys".
- Create a .env file and add the API key to it:
GROQ_CLOUD_API_KEY=your_api_key_here
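Before wiring up the agents, it is worth a quick sanity check that the key actually loads. A minimal sketch (the error message is my own):
import os
from dotenv import load_dotenv

load_dotenv()  # reads .env from the current working directory

if not os.getenv("GROQ_CLOUD_API_KEY"):
    raise RuntimeError("GROQ_CLOUD_API_KEY not found - check your .env file")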
Now, let's define four specialized agents that collaborate like a security team. Each one has its own role, but they all work together.
- VulnerabilityScanner: This agent scans code for vulnerabilities such as SQL injection and XSS
- RiskPrioritizer: This agent prioritizes vulnerabilities by severity (CVSS scores)
- CodeFixer: This agent rewrites code to remediate vulnerabilities without breaking functionality
- Validator: This agent performs the final validation of the fixes
For this example, I will be using the llama3-70b-8192 model for all agents. In the real world, you would likely use different models for different agents.
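For instance, you could point individual agents at different backends. A hypothetical sketch, giving one agent the locally pulled codellama model via Ollama's OpenAI-compatible endpoint (the local URL and placeholder key are assumptions, not part of this article's setup):
ollama_configuration = {
    "config_list": [{
        "model": "codellama",                     # the model pulled earlier with Ollama
        "api_key": "ollama",                      # Ollama ignores the key, but the field is required
        "base_url": "http://localhost:11434/v1",  # Ollama's OpenAI-compatible endpoint
    }]
}
# ...then pass llm_config=ollama_configuration when constructing that agent.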
Implementation:
from autogen import AssistantAgent, GroupChat, GroupChatManager
import os
from dotenv import load_dotenv

load_dotenv()

# This model configuration will be reused for all agents.
groq_cloud_configuration = {
    "config_list": [{
        "model": "llama3-70b-8192",
        "api_key": os.getenv("GROQ_CLOUD_API_KEY"),
        "base_url": "https://api.groq.com/openai/v1",
    }]
}
# Agent 1: Vulnerability Scanner
vulnerability_scanner_agent = AssistantAgent(
    name="VulnerabilityScanner",
    system_message="Find security vulnerabilities in code",
    llm_config=groq_cloud_configuration,
)

# Agent 2: Risk Prioritizer
risk_prioritizer_agent = AssistantAgent(
    name="RiskPrioritizer",
    system_message="Prioritize vulnerabilities based on CVSS scores",
    llm_config=groq_cloud_configuration,
)

# Agent 3: Code Fixer
code_fixer_agent = AssistantAgent(
    name="CodeFixer",
    system_message="Generate secure code fixes",
    llm_config=groq_cloud_configuration,
)

# Agent 4: Validator
validator_agent = AssistantAgent(
    name="Validator",
    system_message="Test fixes for functionality and security",
    llm_config=groq_cloud_configuration,
)
Now, list all your agents in the order that you want them to communicate and initiate the group chat:
# List the agents in the order they should speak
agents = [vulnerability_scanner_agent, risk_prioritizer_agent,
          code_fixer_agent, validator_agent]

# Creating the group chat object with a maximum of 15 rounds
group_chat = GroupChat(agents=agents, messages=[], max_round=15)

# Creating the GroupChatManager with the same LLM config
group_chat_manager = GroupChatManager(groupchat=group_chat,
                                      llm_config=groq_cloud_configuration)
Test it with the example SQL injection vulnerability mentioned above:
vulnerable_sql_injection_code = """
def login(username, password):
    query = f"SELECT * FROM users WHERE username='{username}' AND password='{password}'"
    return db.execute(query)
"""
# This will initiate the workflow
final_result = vulnerability_scanner_agent.initiate_chat(
    group_chat_manager,
    message=f"Review and fix:\n{vulnerable_sql_injection_code}",
)
print(final_result.summary)
When you run this multi-agent workflow, you will notice a sequence of long, wordy outputs, each starting with the label "Next speaker: <agent name>". Each agent in the system gets a turn, just as individuals in a conversation do, and each builds on the previous agent's work. First, the VulnerabilityScanner agent examines the code and identifies the security vulnerability. Then, the RiskPrioritizer assesses how severe the problem is and explains why it matters. Next, the CodeFixer rewrites the code securely and typically adds further best practices. Finally, the Validator agent tests the fixes, makes sure everything still works, and offers additional recommendations to harden your application.

Every "speaker" hands off information to the next, so the agents genuinely communicate, work together, and iterate on the solution in tandem, much like a team of experts passing a case along, each contributing their expertise. This back-and-forth process is what gives agentic AI its strength: it does not just identify issues, it also explains, resolves, and optimizes your code through a collaborative and transparent process.
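By default, the GroupChatManager chooses the next speaker automatically. If you want the deterministic scanner-to-validator hand-off described above, GroupChat also accepts a speaker_selection_method parameter. A sketch, assuming pyautogen 0.2 or later:
# Force a fixed turn order instead of letting the manager pick speakers
group_chat = GroupChat(
    agents=agents,
    messages=[],
    max_round=15,
    speaker_selection_method="round_robin",  # agents speak in list order
)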
Conclusion
This is only a single demonstration of what is possible with agentic AI and multi-agent workflows. While we experimented with it for security flaw detection and fixing, the same approach can be used to automate many kinds of complex work in software development and beyond, from code audits and compliance reporting to performance optimization and incident handling. I hope this has given you a fair idea of what agentic AI is really like and encourages you to go ahead and experiment with your own multi-agent systems.