Explainability and transparency in autonomous agents

As AI agents gain autonomy, the need for explainability and transparency has never been more urgent. In a recent panel discussion, four AI experts (Keshavan Seshadri, Senior Machine Learning Engineer at Prudential Financial; Pankaj Agrawal, Staff Software Engineer at LinkedIn; Dan Chernoff, Data Scientist at Parallaxis; and Saradha Nagarajan, Senior Data Engineer at Agilent Technologies) came together to explore the stakes of building trust in agentic systems, and the tools, standards, and mindsets needed to make that trust real.

To kick off the discussion, the moderator posed a foundational question:

“Why is explainability and transparency important in these agentic systems?”

Trust and understanding: Saradha Nagarajan on explainability

Saradha Nagarajan, Senior Data Engineer at Agilent Technologies, was quick to emphasize that trust is the core of explainability.

“The trust and adaptability you have in the data or predictions from an agentic AI model is much better when you understand what’s happening behind the scenes,” she said.

Saradha noted that agentic systems need clearly defined ethical guidelines, observability layers, and both pre- and post-deployment auditing mechanisms in order to earn that trust. Transparency is a prerequisite for ethical AI deployment.

Pankaj Agrawal on regulated environments

Pankaj Agrawal, Staff Software Engineer at LinkedIn, added that in regulated industries, transparency is mission-critical.

“Even with agentic AI, you need to ensure the agent has taken the steps it was supposed to,” he explained. “It shouldn’t deviate from the graph it’s meant to follow.”

Pankaj highlighted the need for clear supervisory systems that track agent decisions in real time. The goal? Align every autonomous action with a defined set of ethical and operational guardrails, especially when dealing with sensitive or high-risk applications.

“Explainability plays a huge role in making sure the agent is sticking to its boundaries,” he emphasized.

Ethics vs. governance: Who’s really in charge of AI decisions?

While ethics often dominates the conversation around responsible AI, Dan Chernoff, Data Scientist at Parallaxis, challenged that framing.

“I don’t think it’s necessarily about ethics,” he said. “It’s about governance and how your systems align with the rules that apply in your environment.”

Dan acknowledged that ethics does play a role, but emphasized the organizational responsibility to comply with governance policies around PII, sensitive data, and auditing. If a model leaks data or behaves in a biased way, companies must be able to:

Trace decisions back to the data or model inputs
Understand how those decisions were made
Identify whether multi-agent systems contributed to the error

In short, agentic systems must be observable, not just explainable, with clear accountability for both outcomes and contributors.
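
To make that concrete, here is a minimal sketch, in Python, of the kind of decision record that supports this sort of tracing. The field names, the JSON-lines file, and the helper function are illustrative assumptions, not something the panel prescribed.

```python
# A hypothetical decision record for an agentic pipeline; every field name here
# is an illustrative assumption rather than a prescribed schema.
import json
import time
import uuid
from dataclasses import dataclass, field, asdict

@dataclass
class DecisionRecord:
    agent_id: str                  # which agent (or sub-agent) acted
    model_version: str             # model that produced the decision
    inputs: dict                   # data and prompt inputs behind the decision
    output: str                    # what the agent decided or returned
    contributing_agents: list = field(default_factory=list)  # upstream agents in the chain
    decision_id: str = field(default_factory=lambda: str(uuid.uuid4()))
    timestamp: float = field(default_factory=time.time)

def log_decision(record: DecisionRecord, path: str = "audit_trail.jsonl") -> None:
    """Append the decision to a JSON-lines audit trail for later tracing."""
    with open(path, "a") as f:
        f.write(json.dumps(asdict(record)) + "\n")

# Example: a routing decision that can later be traced back to its inputs
# and to the agents that contributed to it.
log_decision(DecisionRecord(
    agent_id="router-agent",
    model_version="llm-2025-06",
    inputs={"prompt": "Summarise Q1 earnings", "source": "finance_docs"},
    output="delegated to summariser-agent",
    contributing_agents=["retriever-agent"],
))
```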

Keshavan Seshadri on regulatory alignment

Keshavan Seshadri, Senior Machine Learning Engineer at Prudential Financial, brought in a global perspective, highlighting how the EU AI Act is shaping risk thinking across the industry.

“Europe has always been the front-runner on regulation,” he said. “The EU AI Act tells us what counts as acceptable risk, low risk, high risk, and what’s completely unacceptable.”

For AI system designers, this means mapping agent decisions to risk levels and designing accordingly. If the team understands what decisions the agent is making and where the risks lie, then they can proactively:

Identify model bias
Spot areas of high uncertainty
Build safer, more robust systems from the ground up
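
As a rough illustration of that mapping, the sketch below assigns hypothetical agent actions to EU AI Act-style risk tiers. The tier names follow the Act's broad framing; the action-to-tier assignments and the helper function are invented for the example.

```python
# EU AI Act-style risk tiers (unacceptable / high / limited / minimal risk).
# The action-to-tier mapping below is invented for a hypothetical
# financial-services agent and is not drawn from the Act itself.
from enum import Enum

class RiskTier(Enum):
    UNACCEPTABLE = "unacceptable"   # prohibited outright
    HIGH = "high"                   # allowed only with strict obligations
    LIMITED = "limited"             # transparency obligations apply
    MINIMAL = "minimal"             # no specific obligations

ACTION_RISK = {
    "summarise_public_filing": RiskTier.MINIMAL,
    "draft_customer_email": RiskTier.LIMITED,
    "score_credit_application": RiskTier.HIGH,
    "infer_protected_attributes": RiskTier.UNACCEPTABLE,
}

def check_action(action: str) -> RiskTier:
    """Refuse prohibited actions; treat unknown actions as high risk by default."""
    tier = ACTION_RISK.get(action, RiskTier.HIGH)
    if tier is RiskTier.UNACCEPTABLE:
        raise PermissionError(f"Action '{action}' is prohibited by policy")
    return tier

print(check_action("draft_customer_email"))   # RiskTier.LIMITED
```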

Aligning stakeholders for governance and security

At this point, the moderator steered the conversation toward organizational alignment:

“As you talk about governance and cybersecurity, it has these big tentacles that reach broadly in the organization. How do you think about getting the right people to the table?”

This prompted the panel to move from technical considerations to structural and cultural ones: a shift toward cross-functional responsibility for responsible AI implementation.

Why collaboration matters in AI development

As the discussion moved from governance to execution, the panelists emphasized a critical but often overlooked reality: building responsible AI requires a coalition, not a solo act.

Dan Chernoff, Data Scientist at Parallaxis, framed it in familiar terms:

“As a data scientist, we always start with: what’s the business value we’re trying to gain? That defines who needs to be involved.”

Dan explained that identifying the question of interest should naturally pull in product leaders, customers, security teams, and other stakeholders. It’s not enough to have data scientists building in isolation; responsible AI must be a shared initiative across the business.

“It has to be a coalition of people,” he said. “Not just to define what we’re building, but to ensure it helps both the customer and the business, and that it’s safe and observable.”

LinkedIn’s collaborative approach

Pankaj Agrawal, Staff Software Engineer at LinkedIn, offered a concrete example of how his team puts this principle into practice.

“We created a playground for business users to play with prompts,” he said. “That way, they can see what the model produces and what its limitations are.”

By giving non-technical stakeholders a hands-on way to interact with models early on, LinkedIn ensures that expectations are grounded, capabilities are better understood, and collaboration starts from a place of shared understanding.

From there, Pankaj’s team brings in the necessary players, especially InfoSec and legal/compliance teams, to validate guardrails and secure greenlights for deployment.

“You need to engage InfoSec and all the regulated areas to make sure everything is smooth before moving forward,” he added.

Navigating regulated environments: Risk, guardrails, and monitoring

The moderator next posed a critical question for teams in high-stakes industries:

“For those of you in a regulated space, how do you think about the challenges these agents present?”

Pankaj Agrawal of LinkedIn responded first, pointing to core risks already raised earlier in the conversation: data leakage and prompt injection.

“We’ve seen agents tricked into revealing how to hack the system,” Pankaj said. “In regulated environments, you cannot afford that.”

To mitigate these risks, his team prioritizes:

Sanitizing user input
Writing precise and purpose-limited system prompts
Maintaining detailed agent traces to monitor for drift
Ensuring agents consistently operate within predefined safe zones

“Monitor accuracy, completion, cost – all of it,” he added. “This needs to be built into your observability stack.”
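
A minimal sketch of what the first and third of those practices might look like in code is below: a crude input sanitizer plus a wrapper that records latency and completion into a trace. The regex patterns, the `agent_fn` placeholder, and the metric fields are assumptions for illustration, not LinkedIn's implementation.

```python
# Illustrative pre-flight checks and trace metrics; the regex patterns,
# thresholds, and field names are assumptions, not a production rule set.
import re
import time

INJECTION_PATTERNS = [
    r"ignore (all )?previous instructions",
    r"reveal your system prompt",
]

def sanitize_input(user_input: str) -> str:
    """Reject obvious prompt-injection attempts before they reach the agent."""
    for pattern in INJECTION_PATTERNS:
        if re.search(pattern, user_input, re.IGNORECASE):
            raise ValueError("Potential prompt injection detected")
    return user_input.strip()

def traced_call(agent_fn, user_input: str, trace: list) -> str:
    """Run one agent call and record latency and completion for the observability stack."""
    start = time.time()
    clean = sanitize_input(user_input)
    output = agent_fn(clean)                     # agent_fn stands in for the real agent call
    trace.append({
        "input": clean,
        "output": output,
        "latency_s": round(time.time() - start, 3),
        "completed": output is not None,
        # accuracy and cost would be attached from eval results and model usage data
    })
    return output

trace = []
traced_call(lambda prompt: f"echo: {prompt}", "What is our refund policy?", trace)
print(trace[-1]["latency_s"], trace[-1]["completed"])
```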

Domain-specific guardrails: One size doesn’t fit all

Saradha Nagarajan of Agilent Technologies emphasized that guardrails should be tailored to the context.

“If you’re solving a problem in healthcare, which is high-risk and highly regulated, your guardrails have to reflect the domain-specific needs,” she said.

That doesn’t mean general-purpose systems are off the table, but even in domain-agnostic scenarios, baseline protections are still essential.

“Even in a case like ChatGPT,” she added, “what kind of controls are in place when the agent responds to a jailbreak attempt?”

This is where semantic analysis, automated filters, and governance-aligned automation become essential, not just during training or system prompt development, but in real-time during agent execution.

Governance must be operationalized

Keshavan Seshadri of Prudential Financial tied it all together with a reminder: governance has to be enforced in software.

“You need to define what controls are required by your industry and automate them,” he said.

From semantic validation to use-case-level oversight, agentic systems need embedded governance that functions at runtime, before any output reaches the customer.
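
The sketch below shows one way such runtime controls might be automated: a governance gate that redacts PII patterns and blocks policy-violating responses before they are released. The patterns and the blocked-topic list are purely hypothetical.

```python
# A hypothetical governance gate run on every response before release.
# The PII patterns and blocked topics are illustrative placeholders.
import re

PII_PATTERNS = {
    "ssn": r"\b\d{3}-\d{2}-\d{4}\b",
    "email": r"\b[\w.+-]+@[\w-]+\.[\w.]+\b",
}

BLOCKED_TOPICS = ["guaranteed investment return", "medical diagnosis"]

def governance_gate(response: str) -> str:
    """Redact PII and block policy violations before the response reaches the customer."""
    for name, pattern in PII_PATTERNS.items():
        response = re.sub(pattern, f"[REDACTED {name.upper()}]", response)
    lowered = response.lower()
    for topic in BLOCKED_TOPICS:
        if topic in lowered:
            raise RuntimeError(f"Response blocked by governance policy: {topic}")
    return response

print(governance_gate("Reach the analyst at jane.doe@example.com for details."))
# -> "Reach the analyst at [REDACTED EMAIL] for details."
```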

Emergent behaviors in multi-agent systems

As agentic AI becomes more autonomous and more distributed, new risks emerge. Saradha Nagarajan cautioned that multi-agent systems introduce another layer of unpredictability.

“When agents are interacting with each other, you can get outputs that were never anticipated,” she said. “That’s the danger of emergent behavior.”

These aren’t just edge cases. In highly dynamic environments, agents may:

Make assumptions based on incomplete data
Amplify each other’s errors
Drift from original task parameters in unexpected but logical ways

This raises a key question: What happens when agents go off-script?

Saradha emphasized the need for structural guardrails to keep these systems within tolerances, even when they operate with relative autonomy.

Preventing data leaks with “least privilege” tool design

To prevent data leakage, Pankaj Agrawal offered a simple but powerful piece of advice:

“Follow the 101 of software principles: least privilege.”

In agentic systems, the tools and functions that agents call need access controls. By restricting what agents can do, teams can limit the blast radius of failure.

“Don’t let tools expose things they shouldn’t. You’ll save a ton of pain later.”
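
A minimal sketch of what least-privilege tool access can look like, assuming a simple registry that grants each agent an explicit allowlist of tools. The agent names and tools are hypothetical.

```python
# Least-privilege tool access, assuming a simple per-agent allowlist.
# Agent names and tools are hypothetical.
TOOL_ALLOWLIST = {
    "support-agent": {"search_kb", "create_ticket"},
    "billing-agent": {"lookup_invoice"},          # cannot touch the knowledge base
}

TOOLS = {
    "search_kb": lambda query: f"kb results for {query!r}",
    "create_ticket": lambda summary: f"ticket created: {summary}",
    "lookup_invoice": lambda invoice_id: f"details for invoice {invoice_id}",
}

def call_tool(agent_id: str, tool_name: str, *args):
    """Execute a tool only if the calling agent has been explicitly granted it."""
    allowed = TOOL_ALLOWLIST.get(agent_id, set())
    if tool_name not in allowed:
        raise PermissionError(f"{agent_id} is not permitted to call {tool_name}")
    return TOOLS[tool_name](*args)

print(call_tool("support-agent", "search_kb", "refund policy"))
# call_tool("billing-agent", "search_kb", "refund policy")  -> PermissionError
```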

Dan Chernoff added a practical lens to this: always ask yourself how a mistake might look in the real world.

“I tend to think of it through the lens of a headline,” he said. “How would what I’m doing look on the front page of a newspaper?”

Multimodal models: More power, more complexity

As agentic AI expands to include multimodal inputs and outputs (text, image, audio, and video), explainability becomes even more complex.

Saradha Nagarajan explained the challenge succinctly:

“Whether it’s a positive or negative outcome, it becomes difficult to pinpoint which feature or which agent led to that result.”

That lack of traceability makes debugging and performance optimization far harder. It’s not impossible, but it introduces significant computational overhead.

To strike a balance, Saradha suggested hybrid design patterns: use complex models for reasoning where necessary, but don’t be afraid to fall back on simpler, rule-based systems when transparency matters more than sophistication.

“We need a balancing act; the setup has to be transparent, even if it means simplifying parts of the system.”

Designing for context: Reactive vs. deliberate systems

Keshavan Seshadri expanded on this idea, using a deceptively simple example:

“If I ask in this room, ‘What is one plus one?’ – the answer is two. But in finance, one plus one could be three… or zero.”

The context matters. Some questions are best handled with reactive systems: quick-response models that return immediate answers. Others demand deliberate systems, where agents reason through tools, context, and prior steps.

“It’s about designing a hybrid system,” he said. “One that knows when to be reactive, when to reason, and how to call the right tools for the task.”
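
One way to sketch that hybrid routing, assuming a deliberately trivial "is this simple?" heuristic standing in for a real classifier and placeholder functions standing in for the models:

```python
# Hybrid routing between a reactive path and a deliberate, tool-using path.
# The heuristic and the model placeholders are deliberately trivial.
def looks_simple(question: str) -> bool:
    """Crude stand-in for a classifier that flags questions safe to answer reactively."""
    return "?" in question and len(question.split()) < 8

def reactive_answer(question: str) -> str:
    return f"[fast model] {question}"            # placeholder for a small, cheap model call

def deliberate_answer(question: str) -> str:
    # placeholder for a reasoning loop that consults tools, context, and prior steps
    steps = ["retrieve context", "call domain tool", "draft answer", "self-check"]
    return f"[reasoning model after {len(steps)} steps] {question}"

def route(question: str) -> str:
    return reactive_answer(question) if looks_simple(question) else deliberate_answer(question)

print(route("What is one plus one?"))                                   # reactive
print(route("Reconcile Q1 revenue across the three regional ledgers"))  # deliberate
```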

Don’t forget the audit trail

Dan Chernoff offered a final, practical reminder: no matter how complex or clever your system gets, you need to keep a record.

“In the multimodal space, make sure the information is captured,” he said. “You need an audit trail, because when questions come up, and they will, you need a way to trace what happened.”

Prompting, supervisory agents, and the need for observability

As the panel turned to prompting strategies, the complexity of agentic systems came back into sharp focus, particularly in multi-agent setups where tasks are passed from one agent to another.

Dan Chernoff opened the discussion by outlining a common but powerful pattern: the supervisory agent.

“You have a single model that farms out a plan and follows it through a set of tools or agents,” he explained. “The challenge is designing the system prompt for that supervisor and the guardrails for each downstream tool.”

Things get especially tricky when unexpected responses come back from those tools. For instance, if the supervisory agent queries a database expecting a number, but gets back the word “chickens,” it needs to know how to respond or, at the very least, flag the error for review.

“We haven’t really created guardrails for when the system hits something it can’t interpret,” Dan noted. “That’s where observability becomes critical, so we can trap those issues and evolve the system accordingly.”
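
A minimal sketch of that guardrail, assuming a supervisor that expects a numeric result and flags anything it cannot parse rather than passing it downstream. The tool name and the logging setup are illustrative.

```python
# Validating a tool result the supervisor cannot interpret; names are illustrative.
import logging
from typing import Optional

logging.basicConfig(level=logging.WARNING)
logger = logging.getLogger("supervisor")

def expect_number(tool_name: str, raw_result: str) -> Optional[float]:
    """Parse a numeric tool result; flag anything unparseable instead of passing it on."""
    try:
        return float(raw_result)
    except ValueError:
        logger.warning("Uninterpretable result from %s: %r, flagged for review", tool_name, raw_result)
        return None

value = expect_number("sales_db_query", "chickens")   # logs a warning, returns None
if value is None:
    pass  # the supervisor pauses this branch and surfaces the trace for human review
```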

Evaluation and shared memory: Auditing the whole system

Pankaj Agrawal emphasized that multi-agent systems are rarely linear. Routing agents often make dynamic decisions based on prior outputs, passing tasks between tools in real time.

“It’s not just a one-way flow. The routing agent might use agent X, then based on output, call agent Y, and you have to eval the whole chain.”

That means not only evaluating outputs against reference data, but also observing and validating how context is retained and passed.

Saradha Nagarajan added that shared memory systems, like knowledge graphs, must also be part of the evaluation process.

“We need to think about context retention across observability learning, reinforcement learning, or even plain LLM learning.”

Keshavan Seshadri expanded on this further: sometimes, the best way to ensure traceability is to add an external agent whose sole job is to evaluate the rest of the system.

“You can have an agent that audits everything, from input to prompts and to responses, creating a rich audit trail.”

The conversation closed on a practical note: the art of prompt engineering is a team sport.

“A great prompt is incredibly valuable,” said Chernoff. “And often, it takes multiple disciplines to craft it.”

That means upskilling teams, combining technical and business expertise, and treating prompt design as part of the broader strategy for building explainable, transparent AI systems.

Domain-specific design

As the conversation shifted toward domain-specific applications, the panelists emphasized how context changes everything when deploying AI systems.

Keshavan Seshadri pointed out that user experience and trust hinge on tailoring both the input and output phases of AI systems to the domain in question.

“Whether it’s customer-facing or internal, the system must reflect the policies and constraints of the domain,” he said. “That’s what makes it feel trustworthy and usable.”

In highly regulated sectors like healthcare, finance, or autonomous driving, that trust is a compliance necessity.

Saradha Nagarajan illustrated this with a vivid example from autonomous vehicles:

“If your Tesla suddenly takes a left turn, you want to know why. Was it in full self-driving mode? Was it just mapping nearby vehicles? The explainability of that action depends entirely on what the system was designed to do and how well you’ve communicated that.”

The key takeaway: domain-specific design isn’t just about tuning prompts. It’s about clarifying what role the agent is playing, what decisions it’s allowed to make, and how those decisions are logged, constrained, and justified within the domain’s risk tolerance.

Why domain-specific agents matter

The panel unanimously agreed: domain-specific agents offer major strategic advantages, both from a data quality perspective and in terms of performance and governance.

Pankaj Agrawal noted that by narrowing the scope of an agent to a specific vertical, teams gain tighter control over the system’s behavior:

“You have access to domain-specific data. That means you can fine-tune your agent, craft precise system prompts, and enforce guardrails that actually make sense for the domain.”

He also highlighted the growing industry shift toward expert agent architectures: smaller, specialized models or sub-systems that focus on tightly scoped tasks, reducing latency and improving output fidelity.

Building on this, Dan Chernoff emphasized the role of subject matter experts (SMEs) in agent design.

“It’s not just data scientists or engineers anymore. You need legal, compliance, privacy, and domain experts in the loop, from designing prompts to evaluating edge cases, especially when you’re working across domains.”

The conversation touched on the tension between general-purpose models and specialized, vertical solutions. While foundational models are built for broad use, enterprise problems are often narrow and deep.

Saradha Nagarajan summed it up well:

“There’s this push from the vendors to go wide. But in regulated or high-risk industries, we need to go deep. That’s where domain specificity becomes non-negotiable.”

In short, successful agentic AI in the enterprise is about aligning data, expertise, and oversight around focused, well-scoped agents.

Techniques for transparent AI

As the panel discussion turned to practical techniques, the focus shifted toward how transparency can be built into agentic systems – not just layered on after the fact.

Saradha Nagarajan outlined two key strategies for improving explainability in AI:

“You can either collect detailed audit trails and track key performance indicators over time, or apply post hoc interpretability methods to reverse-engineer model outputs. Both approaches help, but they serve different needs.”

Post hoc techniques, she explained, involve analyzing past outputs and manipulating input variables in controlled ways to understand how the model arrived at a decision. This works well for complex models that weren’t built with explainability in mind.
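
As a toy illustration of that perturbation idea (a generic sensitivity probe, not SHAP or LIME themselves), the sketch below nudges one input at a time and measures how much a stand-in model's output moves. Everything here is a hypothetical example.

```python
# A generic perturbation probe: nudge one input at a time and measure the change
# in the output. model_fn and the feature values are toy stand-ins.
def sensitivity(model_fn, baseline: dict, feature: str, delta: float) -> float:
    """Return how much the output moves when a single feature is nudged by delta."""
    perturbed = dict(baseline, **{feature: baseline[feature] + delta})
    return model_fn(perturbed) - model_fn(baseline)

def model_fn(x):
    # toy credit-style score in which income dominates tenure
    return 0.7 * x["income"] + 0.3 * x["tenure"]

baseline = {"income": 50_000.0, "tenure": 4.0}
for feature in baseline:
    print(feature, sensitivity(model_fn, baseline, feature, delta=1.0))
# income shifts the score by ~0.7 per unit, tenure by ~0.3: a rough view of attribution
```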

But increasingly, the shift is toward designing transparent systems from the start.

Pankaj Agrawal framed the issue around a useful metaphor:

“It’s about whether your system is a black box, where you can’t see inside, or a glass box, where the internal decision-making is fully visible and traceable.”

While black-box approaches dominated early machine learning systems, the industry is now moving toward inherently interpretable architectures, including rule-based systems, decision trees, and modular agentic workflows.

“This doesn’t just support transparency,” Pankaj added, “it also helps with debugging. When something goes wrong, you want to know exactly which agent or module made which call and why.”

The takeaway? Post hoc tools like SHAP and LIME still have value, but future-forward AI systems are increasingly built with explainability as a core design principle, not an afterthought.

The shift toward transparent, auditable AI systems

The panel closed with a shared recognition: transparency must be foundational, not optional, in agentic AI systems.

Pankaj Agrawal highlighted the importance of understanding which tools agents invoke in response to a prompt:

“As a developer or system designer, I need to know what tools are called. Should a calculator be used for a simple question like one plus one? Absolutely not. But if it is, I want to see that, and understand why.”

This kind of tool-level traceability is only possible in well-instrumented systems designed with observability in mind.

Dan Chernoff built on this point by stressing the architectural implications:

“Agentic AI is evolving fast. You’ve got supervisory models, multimodal chains, and classification-first approaches, all based on the latest papers. But the principle remains: start small, start with an end in mind, and wrap everything in logs and observability.”

Whether you’re working with predictive models, generative LLMs, or multi-agent chains, techniques like Chain-of-Thought, Tree-of-Reasoning, SHAP, and LIME all contribute to explainability, but only if your system is auditable from the start.

“Whitepaper-driven development is a thing,” Dan joked, “but the key is in building ones you can debug, understand, and trust.”

Adding determinism to non-deterministic agentic systems

The conversation wrapped with a critical consideration in productionizing AI: how do we make non-deterministic systems reliable?

Pankaj Agrawal pointed out that while traditional software is deterministic, agentic AI systems powered by LLMs are not, and that means we must reframe how we think about quality and consistency.

“Models will change. Even a slight tweak in a prompt can yield a different output. So instead of over-optimizing for the perfect prompt, the key differentiator now is evaluations.”

He emphasized the growing trend among AI startups: rather than viewing the prompt as a proprietary asset, teams are shifting their focus to rigorous evaluation frameworks, often their real USP, that keep models grounded in truth and business requirements.

“You need a solid eval set to ground your agents, especially when LLMs, prompts, or other variables change underneath you.”

Evaluation frameworks like LangSmith help teams implement structured testing setups, define ground truths, and track consistency over time. This adds a layer of determinism, or at least verifiability, to inherently fluid systems.

“Evals are what will stick with you. Models will evolve, but well-designed evals help you ensure your system performs reliably even as the landscape shifts.”
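
A minimal sketch of that idea, using a generic hand-rolled harness rather than any particular framework's API; the eval cases and the `agent_fn` placeholder are illustrative.

```python
# A hand-rolled eval harness: fixed prompts, expected answers, and a pass rate
# that is re-checked whenever the model, prompt, or tools change.
# The cases and agent_fn placeholder are illustrative.
EVAL_SET = [
    {"prompt": "What were earnings in Q1 2024?", "expected": "$1.2B"},
    {"prompt": "Who owns the ACME account?", "expected": "I don't know"},  # should refuse to guess
]

def run_evals(agent_fn, eval_set=EVAL_SET) -> float:
    """Return the agent's pass rate against the reference set, printing failures."""
    passed = 0
    for case in eval_set:
        answer = agent_fn(case["prompt"])
        if case["expected"].lower() in answer.lower():
            passed += 1
        else:
            print(f"FAIL: {case['prompt']!r} -> {answer!r}")
    return passed / len(eval_set)

# Re-run after every prompt or model change; a drop in the pass rate is the
# signal that something underneath has shifted.
```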

Human-AI collaboration in decision making

As the session came to a close, panelists emphasized that today’s agentic AI systems are not fully autonomous, and in many cases, they’re not meant to be.

Saradha Nagarajan noted that these systems are best viewed as assistive, not autonomous. The human still defines the prompt, sets the evaluation criteria, and ultimately decides whether the output is usable.

“Think of it like a chatbot for finance. The agent might sift through your financial documents and answer questions like ‘What were earnings in Q1 2024?’, but a human is still in the loop, making the judgment call.”

The future of agentic AI, especially in high-stakes domains like finance and healthcare, will hinge on human-AI collaboration, not blind delegation.
