Building AI People Can Trust

Imagine this: your sales team is thrilled about a new AI assistant. Then someone asks—what if the bot accidentally exposes our pricing data or internal manuals?

That simple question opens the floodgates:

Are we allowed to record our sales reps for training data? Where does customer PII go? Do vendors really keep our data private? What if the bot gives unsafe product advice?

There are frameworks (NIST AI RMF, OWASP, FTC/CCPA/GDPR) and key practices (data classification, enterprise tiers, vendor diligence, clear disclosure) that can help. It’s on us, the humans, to build responsibly.

The question isn’t “Should we trust AI?” but “How do we build AI people can trust?”

Prototype to Production

I built a working AI prototype that could process PDFs perfectly. Then I tried to move it into production.

That’s when I discovered OpenAI’s API handles PDF processing completely differently than ChatGPT does. The prototype worked great in the interface, but the API? Not so much.

Turns out PDF handling was only added to OpenAI’s API last month – and it’s still pretty limited on file sizes. My prototype hit those constraints immediately.

I pivoted to Claude’s API, which handles PDFs up to around 30MB. Problem solved, but it got me thinking about a bigger issue – always validate API capabilities early, even when the UI version works flawlessly. What works in conversation doesn’t always translate to production code.
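For anyone hitting the same wall, here's roughly what the Claude side looks like in Python – a minimal sketch, assuming the anthropic SDK and an API key in your environment. The model name, file, and prompt are placeholders, and you should check Anthropic's current file-size and page limits rather than trust my "around 30MB" from memory:

```python
# Minimal sketch: send a PDF to Claude's Messages API for processing.
# Assumes the `anthropic` Python SDK is installed and ANTHROPIC_API_KEY is set.
import base64
import anthropic

client = anthropic.Anthropic()

# Read and base64-encode the PDF (mind the API's size and page limits).
with open("report.pdf", "rb") as f:
    pdf_b64 = base64.standard_b64encode(f.read()).decode("utf-8")

response = client.messages.create(
    model="claude-sonnet-4-20250514",  # placeholder; use whatever model you're on
    max_tokens=1024,
    messages=[{
        "role": "user",
        "content": [
            {
                "type": "document",
                "source": {
                    "type": "base64",
                    "media_type": "application/pdf",
                    "data": pdf_b64,
                },
            },
            {"type": "text", "text": "Summarize the key terms in this PDF."},
        ],
    }],
)

print(response.content[0].text)
```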

This isn’t the first time I’ve hit this kind of mismatch. The gap between demo and deployment can be surprisingly wide.

Two Types of “AI Agents” – And Why the Distinction Matters

Working on AI implementations, I keep running into confusion around the term “AI agent.” Turns out we’re talking about two completely different things.

Type 1: Autonomous AI Agents
These are the systems getting all the buzz. An AI agent can perceive its environment, decide which tools to use, and execute actions without constant hand-holding. Think customer service bots that access your CRM, check inventory, process returns, and escalate issues – all while maintaining context and making smart decisions.

Type 2: AI-Enhanced Workflows
This is AI plugged into traditional automation platforms like Zapier, Make, Power Automate, ServiceNow, or custom solutions. The AI handles specific tasks within a larger, predictable process flow.

Real example I’m building: Staff scan shipping labels with a mobile app. AI extracts supplier info, model numbers, delivery dates, and populates our equipment database. Standard workflow automation then triggers notifications to procurement, project managers, and finance.

But here’s where it gets interesting: The system also compares delivery timelines against project schedules. When procurement suggests equipment substitutions for cost savings, AI evaluates whether the new supplier’s lead times will mess up critical milestones. If there’s a conflict, it raises an alert the team can act on.
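That timeline check is ordinary workflow logic rather than AI. A minimal sketch of the idea – the field names, dates, and safety buffer here are made up for illustration:

```python
from datetime import date, timedelta

# Hypothetical records: fields the AI extracted from the substitution request,
# plus the milestone it feeds, pulled from the project schedule.
substitution = {"supplier": "Acme Corp", "lead_time_days": 45, "order_date": date(2025, 3, 1)}
milestone = {"name": "Rack install", "date": date(2025, 4, 10)}

BUFFER_DAYS = 5  # illustrative safety margin before the milestone


def check_conflict(sub, ms, buffer_days=BUFFER_DAYS):
    """Return an alert message if the new supplier's lead time threatens the milestone."""
    expected_delivery = sub["order_date"] + timedelta(days=sub["lead_time_days"])
    latest_acceptable = ms["date"] - timedelta(days=buffer_days)
    if expected_delivery > latest_acceptable:
        return (f"ALERT: {sub['supplier']} delivery lands {expected_delivery}, "
                f"after the cutoff ({latest_acceptable}) for '{ms['name']}'.")
    return None


alert = check_conflict(substitution, milestone)
if alert:
    print(alert)  # in the real workflow this would go to procurement and the PM
```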

The key difference: Workflows excel at consistent, repeatable processes. Autonomous agents shine when you need adaptive decision-making across multiple variables.

The most powerful implementations combine both – workflow automation for operational consistency, enhanced with AI agents for complex decisions.

If you’re leveraging AI for business operations, getting this distinction right can save serious time and headaches during deployment.

What are you seeing out there? Are you building agents or workflows?

Stuck On Deployment

I built a nifty AI utility tool for a client that looks into a Google Drive folder and its subfolders, ingests all the documents, then extracts the names and titles of every person it finds inside. It’s a quick way for the sales team to comb through historical contracts, SOWs, project plans, etc. and find the people we’ve worked with in the past who could become new contacts – even if they’re at new companies.

Scripting the AI prompt took a couple of hours. I needed a tight prompt so the end user never has to interact with the script and still gets a tidy output list every time. That turned out to be the easy part.
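I won’t paste the client’s actual prompt, but a hypothetical sketch of the shape – with illustrative field names and rules only – looks something like this:

```python
# Hypothetical sketch of a "tight" extraction prompt (not the client's actual prompt).
# {document_text} is filled in with the text of each ingested document.
EXTRACTION_PROMPT = """\
You extract people from business documents.

From the document below, list every person mentioned, with their title and company
if stated. Output ONLY a table with these columns, one row per person, no commentary:

Name | Title | Company | Source document

If a field is not stated in the document, write "unknown". Do not guess.
Deduplicate people who appear more than once.

Document:
{document_text}
"""
```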

For this project, the challenge (for me) was the deployment, especially as I learned more about Google’s ecosystem: linking the Apps Script to a GCP project, enabling the Google Drive API in the Cloud Console and adding it to the Apps Script, authorizing the script, and so on. The deployment took many more hours than the AI piece. I felt frustrated – right on the verge of having a useful tool, but stuck in the details of deployment.

I eventually set it aside for the night and came back the next morning. I asked AI to create a checklist of EVERY SINGLE detail necessary for setup and deployment. That cracked the case and got me across the finish line.

Using AI to solve the deployment issue was pretty nifty as well.

Thoughts on AI Prompt Engineering

Prompt engineering is an art form – and it’s already a legitimate career path, even if many companies haven’t caught up yet.

As LLMs get more powerful and can handle longer reasoning sessions (we’re talking 10+ minute processing times now), a well-crafted prompt becomes the difference between impressive demos and reliable, production-ready automation.

Sure, anyone can get cool results from conversational agents. But building prompts that deliver consistent, predictable outcomes for business-critical tasks? That requires genuine skill, experience, and strategic thinking.

I’ve seen teams spend multiple hours perfecting a single prompt – and save hundreds of hours downstream. Every word matters. Every sequence matters.

My approach? Treat prompt writing like crafting a compelling essay. Structure, flow, and precision all count.

Here are three game-changing techniques I’ve learned:

Examples are gold. Sometimes showing beats telling by a mile – even for AI. One solid example can communicate what paragraphs of instructions can’t.

Order is everything. The sequence of your instructions dramatically impacts results. Pro tip: put your most critical requirements at the end – that’s what the model “remembers” best.

Test relentlessly. Great prompts emerge through iteration, not inspiration. Build, test, refine, repeat.
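Putting those three together, here’s an illustrative skeleton I might start iterating from – the task, categories, and rules are invented for the example, and the critical requirements deliberately come last:

```python
# Illustrative prompt skeleton combining the three techniques above:
# one worked example, deliberate ordering, and a structure that's easy to iterate on.
PROMPT_TEMPLATE = """\
Role and context:
You are classifying inbound support emails for a SaaS product.

Example (input -> expected output):
Input: "I was charged twice this month, please refund one payment."
Output: {"category": "billing", "urgency": "high"}

Task:
Classify the email below using the same JSON format.

Email:
[PASTE EMAIL TEXT HERE]

Critical requirements (follow these exactly):
- Output valid JSON only, no explanations.
- Use only the categories: billing, bug, feature_request, other.
- If you are unsure, set "category" to "other" rather than guessing.
"""
```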

There are fantastic tutorials out there (easy to find, though mastery takes practice), and tools like Promptmetheus or Originality can accelerate your workflow. But I’d recommend starting with manual practice first – understanding the fundamentals makes you a better prompt engineer long-term.

How’s your prompt engineering journey going? Are you seeing it become more important in your work too?

Framework for Finding ROI from AI

The highest value AI agents often aren’t the flashiest ones, but rather those that eliminate friction in existing processes.

I’ve had clients come to the table with ideas for what to build, but half the time it’s not the most valuable agent for their business. So how do you find the highest ROI?

My framework: MAP → IDENTIFY → PRIORITIZE → BUILD

Map: Start by building a customer journey map on Figma. Get the high-level view first, then drill into the details.

Identify: Look for repetitive, time-consuming steps with heavy text or voice components. With voice agents expanding rapidly, audio touchpoints are prime opportunities.

Prioritize: Focus on friction points that impact the most customers or consume the most resources.

Build: The cost to run an AI agent is negligible compared to development cost, so start with your highest-impact opportunity.

Real example: A SaaS company wanted a complex lead scoring agent. But mapping their journey revealed the real bottleneck was customer onboarding. A simple FAQ agent reduced their support tickets by 40% and freed up their team to focus on strategic accounts.

By starting with the customer journey instead of the technology, you’re more likely to land on solutions that drive real value. The opportunities we find this way are usually easier to build AND deliver higher returns.

It Ain’t Sexy

It ain’t sexy, but it’s practical. I asked AI to write a bit of code to ingest a data file, parse it, and post the data to another database. The parsing rules are complex, with many exceptions, which is why we’d been doing it manually for years with an admin person.
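The skeleton of that job is nothing fancy. A rough sketch in Python – the file layout, parsing rules, and destination endpoint here are placeholders, not the client’s real ones:

```python
# Rough skeleton of the ingest-parse-post job (file layout, rules, and endpoint are placeholders).
import csv
import requests

API_URL = "https://example.internal/api/records"  # hypothetical destination database API


def parse_row(row):
    """Apply the parsing rules; return a clean record, or None to skip the row."""
    # Example of an "exception" rule: legacy department codes map to a single bucket.
    dept = row["dept_code"].strip().upper()
    if dept in {"OLD1", "OLD2"}:
        dept = "LEGACY"
    if not row["amount"]:
        return None  # incomplete rows used to be resolved by hand; here we just skip them
    return {"department": dept, "amount": float(row["amount"]), "ref": row["ref"].strip()}


with open("export.csv", newline="") as f:
    for row in csv.DictReader(f):
        record = parse_row(row)
        if record is None:
            continue
        resp = requests.post(API_URL, json=record, timeout=30)
        resp.raise_for_status()
```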

A few hours to carefully construct the prompt, then maybe 5-6 hours testing and debugging; now it’s automated and saving up to 5 hours/week!

AI enabled this, creating space for the human to do more high-value thinking.