I’ve been lukewarm on ChatGPT and generative AI. I saw the potential but predicted (erroneously) that we were still far removed from it being generally useful and reliable. Those ideas were formed through experience, incorrect comparisons and a lack of imagination on my part.
I wanted to write this post to give context to my skepticism and provide examples (with prompts) of generative AI’s usefulness. When I read articles on the web, I mostly see vague stories and trivial use cases that don’t inspire me. (The exception is meeting note transcribers, which in my experience have been universally amazing.) I hope someone finds this useful and inspiring.
A little history
I’d used MidJourney, an image generation tool, for a bit to create images for my D&D game. MidJourney generally performed well if I kept the prompts basic. But the tool (at the time) had some odd side effects. For example, it was horrible at generating images of hands, so anything that required hands, like a wizard casting a spell, was a hot mess. It also didn’t do a good job of interpreting my intent and translating that into a sensible image. Minutes before writing this blog post I used the following prompt with MidJourney.
a centaur in the forrest hunting a deer with a bow and arrow
It generated the image below.

Not exactly instilling confidence in me. This helped to shape my perception of what AI was capable of.
When it comes to incorrect comparisons I naively created a mental model where AI was just the next iteration of our smart assistant technologies like Siri, Alexa and Google Assistant. Those tools are great for keeping shopping lists but can be infuriating with some of the most basic tasks. This subconsciously created a mental ceiling on my expectations of what ChatGPT and other GenAI tools could do for me.
With these two things combined, my lack of imagination about what to even attempt with ChatGPT was in full effect. I’d dismiss use cases without even attempting them, the way you would dismiss your 8-year-old son as a viable helper to move a piano. Anything but the mundane just wasn’t possible.
Getting my feet wet
As the Internet started buzzing more and more about ChatGPT, I would hear a lot of chatter about how awesome it was but without any tangible time-saving use cases for me. I did begin to use it to do things like generate location names for my D&D game and the results were helpful!
Generate a list of 20 names of a cult organization in a fantasy setting that is tasked with preventing demons from entering the world
ChatGPT responded with a list (cut down for brevity):
1. The Veilwardens
2. Order of the Abyssal Seal
3. Guardians of the Broken Gate
4. The Demonveil Covenant
5. Wardens of the Unseen Lock
6. The Chainbound Order
7. The Seraphic Vanguard
8. Watchers of the Nethergate
9. The Crimson Sigil
10. The Sanctum of Eternal Vigil
Not bad, but again, a cheap parlor trick in my eyes.
Can ChatGPT help me professionally though?
As the generative AI fervor continued to heat up, I feared becoming a dinosaur. That guy in the office who does things “the old school way” and has the young folk running circles around him. (I already feel that way about email vs. Slack/Teams.) I decided to start looking more seriously at ChatGPT.
This week I made a LinkedIn post asking how people were using AI in their role as managers. It didn’t get a ton of traction, but a few of the articles posted got me thinking about possibilities.
I’ve been working out some performance metrics for my team and I’d been thinking about using a Z-score to create a comparable value for story points completed across people on different teams. (Different teams might have different pointing methods, so a direct comparison isn’t fruitful.) I wasn’t 100% confident that, statistically speaking, what I was trying to do was fair. I asked my wife, who is stronger in statistics than I am, and she pondered it for a bit before saying “Ask ChatGPT”. It sounded ludicrous to me, but what did I have to lose?
Can I use the z-score to compare developer stories per sprint for teams that use different story pointing methods
Yes, you can use the z-score to compare developer stories per sprint across teams with different story-pointing methods, but you need to be cautious about how you interpret the results.
Why the Z-Score Could Work
- Standardization – The z-score helps normalize the number of developer stories completed per sprint within each team. This allows you to compare how far each sprint’s performance deviates from that team’s own average, regardless of absolute values.
- Relative Performance – Since different teams may use different story-pointing methods (e.g., Fibonacci vs. T-shirt sizing vs. linear scales), the absolute values might not be directly comparable, but their distributions within each team can be.
Potential Pitfalls
- Different Definitions of “Story” – Some teams might split work into smaller or larger stories, which means a higher count of stories per sprint doesn’t necessarily mean more work was done.
- Variance Differences – If one team has a much more volatile sprint performance (high standard deviation) compared to another, z-score comparisons might exaggerate or downplay certain trends.
- Context Matters – Even with a z-score, a high number of completed stories might not indicate better productivity—it could just mean different estimation practices.
How to Use It Effectively
- Compute z-scores within each team for a fair comparison.
- Look at trends rather than absolute z-score values.
- Complement with other metrics like velocity trends, cycle time, or throughput to get a more holistic view.
It answered my question and my concerns almost perfectly. The one mistake, and this is a prompting mistake on my part, is that it assumed I was counting stories completed rather than story points completed. But the advice is still sound. Now my creative juices were flowing.
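For anyone who wants to see the arithmetic, here is a minimal Python sketch of the "compute z-scores within each team" advice, applied to story points per developer. The team names, developer names and point totals are made up for illustration, and using the population standard deviation is my assumption; ChatGPT's answer did not specify which one to use.

```python
from statistics import mean, pstdev


def team_z_scores(points_by_dev):
    """Return each developer's z-score relative to their own team.

    points_by_dev maps a developer name to story points completed.
    """
    values = list(points_by_dev.values())
    mu = mean(values)
    sigma = pstdev(values)  # population standard deviation (an assumption)
    if sigma == 0:
        # Everyone completed the same amount, so nobody deviates from the mean.
        return {dev: 0.0 for dev in points_by_dev}
    return {dev: (pts - mu) / sigma for dev, pts in points_by_dev.items()}


# Hypothetical teams that size stories on different scales.
team_fibonacci = {"dev_a": 21, "dev_b": 34, "dev_c": 13}
team_linear = {"dev_x": 8, "dev_y": 10, "dev_z": 6}

for team in (team_fibonacci, team_linear):
    for dev, z in team_z_scores(team).items():
        print(f"{dev}: {z:+.2f}")
```

Because each score is measured against the developer's own team, a +1.0 on one team and a +1.0 on another both mean "one standard deviation above that team's average", whatever pointing scale each team uses. The pitfalls ChatGPT listed still apply.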
I’ve wanted to do spend analysis on a few of our contracts. I’ve been avoiding it because getting the contract data into a workable format is tedious. Previously I’d dismissed ChatGPT as a viable solution, “it would be more effort than just entering it myself” was my unspoken stance. But now I had the audacity of hope.
I picked a vendor that had some quirks in terms of how the data was structured. A basic line item on the contract consisted of
- Service
- Quantity
- List price
- Sale price
It was also formatted in sections as opposed to all the line items being one after another. The other complication is that the list and sale prices were grouped in bundles, so the sale price would read “$10 per 10k executions” for example. I was just going to throw it at ChatGPT and see what I got.
I uploaded four years’ worth of annual invoices (after upgrading my ChatGPT subscription to the pro plan) and gave the following prompt.
Process the attached PDFs and convert the Committed Services section to a CSV file. The columns should be Service/Feature, Quantity, List Price and Sales Price. Add an additional column named “Contract Year” and fill that columns value with the year taken from the Start Date. Add another column that calculates the total price by multiplying the sales price and the quantity and then multiplying that value by 12
Given my experiences, I thought this was a bold ask. ChatGPT performed flawlessly. A few things impressed me:
- It was smart enough to understand the “$10 per 10k executions” and convert that to a per execution value for the sake of doing the math.
- The response it gave back indicated that it understood my intent with the math. Part of the confirmation response was “Total Price (calculated as Quantity × Sales Price × 12 to reflect the annual cost)”. It was smart enough to infer that the billing details were monthly and I wanted an annual total.
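To make the math concrete, here is a rough Python sketch of the same calculation: turn a bundled price such as "$10 per 10k executions" into a per-unit price, then multiply by quantity and by 12. This is only my reconstruction of the arithmetic, not what ChatGPT ran internally. The function names, the accepted price format and the quantity of 500,000 are all assumptions for illustration.

```python
import re

# Multipliers for the shorthand suffixes this sketch understands.
_SUFFIX = {"": 1, "k": 1_000, "m": 1_000_000}

# Matches strings shaped like "$10 per 10k executions" (assumed format).
_BUNDLE = re.compile(r"\$([\d,.]+) per ([\d,.]+)([kKmM]?)\s+\w+")


def unit_price(bundle_price):
    """Convert a bundled price string into a price per single unit."""
    match = _BUNDLE.fullmatch(bundle_price.strip())
    if match is None:
        raise ValueError(f"unrecognized price format: {bundle_price!r}")
    dollars = float(match.group(1).replace(",", ""))
    units = float(match.group(2).replace(",", "")) * _SUFFIX[match.group(3).lower()]
    return dollars / units


def annual_total(monthly_quantity, bundle_price):
    """Quantity x per-unit sale price x 12, as in the prompt above."""
    return monthly_quantity * unit_price(bundle_price) * 12


# Hypothetical line item: 500,000 executions a month at $10 per 10k.
print(round(annual_total(500_000, "$10 per 10k executions"), 2))
```

With those made-up numbers the per-unit price is a tenth of a cent, so the line item works out to about $6,000 a year.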
I downloaded my CSV, checked the totals against the invoices and they were spot on. I launched Excel, ready to import my new data and get to analyzing when I realized that my lack of imagination was still getting in my way. I went back to the ChatGPT prompt.
Analyze the attached CSV and describe for me any trends that you find. Also tell me what is the biggest driver of my spend year over year
It came back with a table (which I can’t share) that listed my annual spend per contract year, another table with my top line item spend per year, and then a narrative about each year with inferred reasons for that spend. (“X became the highest cost driver, showing a move towards Y”) It recognized that the increase in specific line item categories showed an overall shift of investment and focus in those areas. It also performed a linear regression-based forecast on my top spending line items for the next three years. Finally, it identified areas of negative growth in terms of usage/adoption and suggested identifying what was driving that trend and, if the decline is expected, accelerating it to reduce costs.
To put a bow on it I asked ChatGPT one more question. I personally have estimated a total spend on the annual contract value where once we cross it, we should begin evaluating bringing that work in-house. (Eventually, it can become cheaper to run things yourself once you’ve hit a certain scale) I asked ChatGPT when it projected I would hit this number and it projected late 2027 with an estimated annual dollar amount. This was all done in under 15 minutes.
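If you are curious what a linear regression-based forecast like this looks like, here is a small Python sketch. It fits a straight line to annual spend with ordinary least squares, then solves for the point where that line crosses a threshold. I can't share my real figures, so the years, spend values and threshold below are invented, and I am assuming a simple one-variable fit; I don't know exactly what ChatGPT did under the hood.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least squares fit; returns (slope, intercept)."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("need at least two distinct x values")
    slope = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sxx
    return slope, y_bar - slope * x_bar


def forecast(xs, ys, future_xs):
    """Project the fitted line onto future x values."""
    slope, intercept = fit_line(xs, ys)
    return {x: slope * x + intercept for x in future_xs}


def crossing_point(xs, ys, threshold):
    """Return the x where the fitted line reaches threshold, or None if spend is not rising."""
    slope, intercept = fit_line(xs, ys)
    if slope <= 0:
        return None
    return (threshold - intercept) / slope


# Hypothetical contract years and annual spend (in thousands of dollars).
years = [2021, 2022, 2023, 2024]
spend = [100, 120, 140, 160]

print(forecast(years, spend, [2025, 2026, 2027]))
print(crossing_point(years, spend, threshold=200))
```

A fractional result from crossing_point can be read as part-way through a year, which is how a projection like "late 2027" can fall out of annual data. Four data points is a thin basis for any forecast, so I treat the output as a conversation starter rather than a budget.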
Next Steps
I’m now convinced that ChatGPT needs to be part of my toolbox. Instead of wasting 30 minutes of my time and then moving to ChatGPT for some tasks, I’ll start there, uninhibited by expectations, and see what magic I can wring out.
Of course ChatGPT still has its deficiencies but we shouldn’t assume those are the norm. I asked it to generate a slide for me and regardless of the efficacy of my prompt, ChatGPT still doesn’t know how to spell words in images.

Nothing is perfect. But it can still be useful!