Catalyst with Shayle Kann - AI for climate: a real world test
Episode Date: June 15, 2023

The list of potential uses for AI in climatetech is growing fast: developing better materials, optimizing solar farms, integrating renewables and microgrids. But many of these are still theoretical. We wanted to find a real-world application that changed the way we make climatetech. So we decided to come up with our own test run.

Back in March Duncan Campbell, vice president at Scale Microgrids, used ChatGPT to code some battery dispatch software and tweeted about his experience. Duncan isn’t a professional software developer, but he still came up with some promising results. Could a non-coder like Duncan use AI to do the work of several climatetech coders?

We invited Duncan to do it again and ramped up the challenge. We recruited Seyed Madaeni, CEO and co-founder of Verse, to create a challenge for Duncan. Seyed is an expert in AI and the software used in electricity markets. He routinely sends “problem statements” to his team of software developers to create new software. This time, he sent a problem statement to Duncan that reflects real-world conditions, one that we might actually assign to real engineers to solve. The challenge? Develop battery dispatch software using ChatGPT.

In this episode, Duncan presents his results to Shayle and Seyed. They talk about things like:
- The different methods of optimizing battery dispatch, from old-school Excel sheets to more sophisticated software written by coders
- Seyed’s process of assigning a problem statement to his engineering team and the simplified version he sent to Duncan
- Duncan’s process of iteratively working with ChatGPT-4 to develop and debug the code
- Why working with ChatGPT is like working with a bunch of really fast, but really inexperienced junior coders

If you want to see the code that Duncan wrote with ChatGPT, click here. Watch the conversation on YouTube.
Recommended Resources:
- Carbon Copy Live: How AI could supercharge climatetech
- The Wall Street Journal: Why AI Is the Next Big Bet for Climate Tech

Catalyst is a co-production of Post Script Media and Canary Media.

Support for Catalyst comes from Climate Positive, a podcast by HASI, that features candid conversations with the leaders, innovators, and changemakers who are at the forefront of the transition to a sustainable economy. Listen and subscribe wherever you get your podcasts.

Catalyst is supported by Scale Microgrids, the distributed energy company dedicated to transforming the way modern energy infrastructure is designed, constructed, and financed. Distributed generation can be complex. Scale makes it easy. Learn more: scalemicrogrids.com.
Transcript
From the studios of Post Script Media and Canary Media.
I'm Shayle Kann, and this is Catalyst.
I just noticed as you were describing what you were doing,
you're saying, we got it working for multiple days.
We did this, we did that.
I think when you say we, you're referring to you and ChatGPT.
Is that right?
Did you feel like you were partnered with the model building this together?
Oh yeah, for sure.
It felt very similar to trading Slack
messages back and forth with an analyst on my team. Now, it wasn't as fun, right? But,
you know, it's the person who's working on this with me. You know what's boring? Talking about
how AI will disrupt energy. You know what's not boring? Actually doing it.
Hey everyone, Daniel Woldorff here. I'm the producer of the show. A quick note: in this
episode, Shayle talks with Duncan Campbell of Scale Microgrids. Scale Microgrids is a sponsor of
Catalyst, but this episode is editorial in nature and not connected to that sponsorship in any way.
Okay, on with the show. I'm Shayle Kann. I invest in revolutionary climate technologies at Energy
Impact Partners. Welcome.
Okay, so let me preface with this.
I'm totally convinced that this new wave of AI technology driven by large language models will change the world.
It already has changed the world, really, so I realize that's pretty easy to say.
But I want to say it nonetheless because of the second thing, which is that I have generally been a bit more skeptical than most, I think,
that it would fundamentally change, at least, the things that I spend my professional time on,
which is to say climate tech and energy.
Sure, I think it is obvious that some back office tasks will be totally automated through these tools,
and I've already used it to come up with some surprisingly good ideas for decarbonizing tough-to-abate sectors.
There are applications.
And sure, there are tons of additional theoretical applications that are fun to think about,
things like, say, materials discovery.
But I've been waiting for someone to show me a real-world application that is available today,
that I could imagine would lead to a real transformation in energy.
So I got sick of waiting around and figured, why not just test it out myself?
And fortunately, Duncan Campbell was a step ahead of me.
A couple months ago, on a whim, I think, Duncan played around with ChatGPT to try to write some
Python code to dispatch an imaginary battery using a few key conditions.
It's actually a real-world challenge that real companies have been built around:
how do you charge and discharge a battery, optimizing it against real-world conditions in a wholesale
electricity market? Now, Duncan works at Scale Microgrids, so he does know a thing or two
about batteries and about energy, but importantly, he is not a software engineer and he does not write
code. And yet, he wrote some code. It was simple, and it didn't really work, but it was something.
So I figured, let's take this to the next level. So we recruited Seyed Madaeni
to help. Seyed does write code, and more importantly, he is one of the world's leading experts
specifically in writing software to dispatch batteries. He led that charge at former EIP portfolio
company Advanced Microgrid Solutions, which was then acquired by Fluence, where he continued
leading that charge on their algorithmic battery dispatch before starting a new company recently called
Verse. So, Seyed knows battery optimization code. And we had Seyed come up with a better challenge
for Duncan, one that reflects real-world conditions, one that we might actually assign real
software engineers to solve. So let's see what happened. Here's Seyed and Duncan.
Seyed, welcome. Thank you, and thanks for having me. Really appreciate you setting this up for us.
And Duncan, thank you so much for being our sacrificial lamb. Totally, I'm going to out myself as
needing AI to do anything. Or being really good at the interface of the future. One of those two,
probably, and we're going to figure out which one in the next hour or so.
So we have a lot to talk about here.
Seyed, I kind of want to start with you.
So we basically asked you to come up with a challenge that we could give Duncan to use
these new large language models to do a battery optimization.
We asked you to come up with a sort of real-world scenario, an actual
challenge that you might, in a previous or current life, have assigned a software engineer
or team of software engineers to tackle.
So can you walk us through the challenge you came up with and sort of the thinking behind it?
Sure, absolutely.
Let's start with the fact that there are wholesale markets.
And the wholesale market that we've focused on today is CAISO.
So if you are a market participant in CAISO, you could be a utility, you could be a trading shop,
you could be an IPP, you have to bid assets into the CAISO market.
And traditionally, bidding in conventional assets like thermal plants has been relatively straightforward,
but energy storage creates a whole host of challenges for market bidding, because at the end of the day,
energy storage is a use-limited asset.
As an industry, we've had experience with hydro, but that's a different problem.
Your opportunity for trading power with a hydro plant really ranges from season to season.
You're trying to make trade-offs between using the water now or six months from now.
But with storage, it's a bit different.
You only have two or three or four hours of energy in the tank.
So the problem that I came up with, it's very practical.
It happens day-to-day on a trading floor.
Just imagine you're running a trading shop.
You're dealing with wholesale markets.
And you have a new asset, an energy storage asset.
And the goal here is to competitively bid that energy
storage asset in the wholesale market. For our listeners out there who don't know what bidding in a
market is, it's essentially putting a price on the power that you're willing to sell in these wholesale
markets. And it's not just power. It's a lot of different things. It's energy and ancillary services,
in some cases capacity. So it's a pretty complicated problem. But if I want to simplify
what we're doing here, just imagine the director of trading tapping on your shoulder
and saying, we have this new energy storage asset, it's 25 megawatts, it can run for four hours,
think about how we're going to bid it in the wholesale market. So that's the problem.
Now, in the real world, the way that works is you turn to your software team or your trading team
and ask them, can you bid this asset in the wholesale market for me? So the end game is bidding,
but you want to break it down into simple problems. And the first step of it is, how
the heck are you going to use this asset? So how do we even come up with an optimal schedule for this
asset? And from there, we can think about market bidding and trading and whatnot. So the problem I've
defined is: take a 25 megawatt battery that's good for four hours. It's located at a trading hub in
CAISO called NP15. In reality, it'll be on a node, but we've simplified it to NP15. Then
come up with the optimal schedule for the next day, for the day-ahead market.
We're going to put aside all the complexities around market bidding and trading,
and the idea is to take this energy storage asset and come up with a good use case for it for tomorrow's market.
How do I use it for energy and ancillary services?
And to be clear,
you laid out a list of acceptance criteria for this code.
Basically what needs to happen is that the system, the code, needs to be able to pull day-ahead energy and ancillary service prices from CAISO.
It needs to be able to incorporate various factors: load patterns, historical data, maybe weather data.
It needs to process that information alongside the parameters of the battery, which you've set out.
You said 25 megawatts, four hours.
You also gave us a cycle limit, round-trip efficiency, minimum state of charge, all the key
parameters of a battery. And then it needs to have an algorithm to optimize the schedule for
charging and discharging that battery for the next day, print that schedule, and forecast the
revenue. Is that basically right? Correct. With one caveat: I asked it to forecast market prices for
tomorrow, and let's see if it can do it. But essentially, the past is history, and tomorrow is a
mystery. So a big portion of what quantitative shops do is try to predict the randomness of tomorrow's
market. So I've kind of asked it to do it. So we'll see what Duncan and ChatGPT can come up
with, but that's really the ask. So let's contextualize it then. In a world without ChatGPT,
as you said, somebody, the head of trading, will tap some folks on the team on the shoulder and say,
I have this new asset and I need you to do this exact task. How hard a
task is this? How would you normally go about doing it? Who is required in order to do it, and
how time-consuming a task would it be? Yeah, I think it's a very time-consuming task.
Traditionally, a lot of trading shops have used, you know, status quo Excel spreadsheets,
and really are lagging behind in utilizing technology. And there are a lot of great companies out there
who actually are doing it systematically. I'm proud of my team at Fluence. They're actually
doing this. And it's a hard problem.
Tesla folks are doing it. A lot of different companies are actually paving the way for market bidding and market trading. So I would say it's a very complicated task.
Can you just give a high-level, even order-of-magnitude estimate of the person-hours, maybe, or the time that you would estimate it takes?
You know, if you would assign this to your team, how long would you have given them to turn it back around?
I think there's always an answer. It depends on what you're looking for. Because
you can come up with something over the course of 12 hours, but trust me, it's going to leave a lot
of money and value on the table. Or you can actually systematically procure software or build software,
which is, in my view, a year-long endeavor. Yeah, if I can jump in, I think there's a big distinction
between using the tool you've already built and doing like a model run of it or actually like
building the analysis from scratch or using a third party's tool of some kind. Right. And so we're
building the analysis from scratch and building the tool to implement that analysis,
or that was the attempt. So, Duncan, let me turn it to you for a second then. When you got the
prompt from Seyed, let me ask you this, I guess, to start: in the absence of a large language
model, would you know how to do this? Probably not, no. So the team at Scale I run, people on the team
can do this. Me? No, I can, you know, barely string together a
few scripts here and there, but I'm mostly a legacy Excel guy. Compounded on that, I work in the
behind-the-meter world, which has very different price signals. A lot of the same techniques probably
being employed, but we're not like forecasting LMPs, right? We're forecasting load for demand
charge management, things like that. So what was your reaction when you got the prompt? You
have the benefit of sort of understanding what the challenge is and the benefit of having played around
a little bit with ChatGPT to generate code historically.
So did you look at this and think, oh, like, this is going to be a breeze?
Or did you look at it and think, oh, God?
I think right in the middle.
Like, honestly, it, you know, right, it involves forecasting prices based on a bunch of time series data.
It involves setting up the problem and the constraints and the variables and then applying some
kind of optimizer to it and then, you know, parsing all the results and producing outputs that
are useful, right? That all, it felt doable if, for example, I had a couple analysts working
with me who are good at this stuff, right? The question is just like, can we remove that and still
see some level of success here? Right. Okay, so before we dive into your process then,
Seyed, anything we haven't covered on the challenge that you think is important to note?
No, I think we're good. And Godspeed, Duncan. Let's see what you got.
All right. So you've sent us, Duncan, literally hundreds of pages of printed-out conversation between you and ChatGPT that was the result of this challenge. Let's start at the beginning. So what was your first attempt?
Yeah. So I've played with ChatGPT a bit for other stuff. So I kind of had an intuition of the way that would work, but I tried to sort of ignore that. And my first thought was,
well, the coolest outcome here would be if I just copied and pasted Seyed's prompt, stuck it in ChatGPT, and got the right answer, right? Just like one and done, right?
It's going to make it a very short podcast if that works.
So that was my first attempt. I literally just copied it, pasted it. And the result was actually pretty interesting. Like, I think it broke down the problem very well. It responded with, okay, first you're going to need to, you know, deal with the forecasting, price forecasting element. Like, we recommend using,
like regression models, here's a Python package that does that well. Then it said like,
okay, next you have to set up the sort of dispatch problem and math and variables
and constraints and optimization. Like here's what we might suggest doing. But it didn't like
write out the code and make it all work, right? It just, it presented a guide basically of like,
here's the four parts of the problem and maybe a place to start. And I think you quickly realize
like if you just reply like, okay, now make the code using that guide, it gets very complicated.
Like it's very hard to kind of like top down build it because there's a bunch of little errors
everywhere. It's like hard to track what's going on. And I pretty quickly reverted to like bottom up
building it, like building one element, like slowly working through it, getting to something that
works, layering on the next thing, etc. And that's been my
general experience with ChatGPT.
It's very hard to just assume it knows everything and can execute.
You kind of have to like teach it what to do.
Yeah, that's been one of my big takeaways so far too, is you want it to solve a big problem
for you and it can potentially, but you kind of have to walk it along the path to get there.
You can't just ask the big question.
If you ask the big question, something is going to go horribly wrong.
I'm curious on this first one, what Duncan called the buckshot approach.
Did you see anything that was interesting in ChatGPT's response, or do you think it's the wrong way to prompt?
No, I think, Duncan, you said it well.
It's all, you know, comes down to how you're prompting it and how you're guiding it.
When I looked at the code, I quickly realized that there's still a human involved here.
This can't be really a one-stop shop.
And Duncan provided some guidance and perspective here of how to break down the problem.
And that to me means there still need to be humans involved.
And the other day I saw this quote online from a software engineer that was saying,
hey guys, as software engineers we're safe, because users still need to define what they're looking for.
And that's really the case here.
Yeah, it's an interesting, I mean, I guess we'll see at the end.
of this, how safe exactly software engineers are. Yeah, exactly. But it does seem clear that you can't,
you can't just give it this super high level. You literally, I mean, you wrote us,
Seyed, like a one-page challenge here. You can't just copy and paste that challenge and get the
answer that you want. Correct. And keep in mind that the challenge that I introduced still doesn't
even tap into the bulk of the complexity, which is market bidding. We're putting all of that aside.
We're just thinking about like a simple schedule. Right. Yeah, I think the takeaway from the
buckshot approach is basically like ChatGPT is not like a 10-year industry veteran with deep domain
expertise, right? It's a decent sort of junior software engineer, and its benefit is that it's
very fast, right? But you can't just sort of let it take the reins and run with the problem,
assuming it has all the industry context to know where to go with that. Well, let's find out
whether it's actually a decent junior level software engineer,
because we haven't yet proven that it can do anything.
So you, okay, so you went down the buckshot approach for a minute,
realized that that's the wrong strategy.
So talk about how you attacked it next.
Yeah.
So then I thought, like, let's sort of do the inverse of that, right?
Let's like really narrow the scope of the initial problem I'm giving it.
And in addition to that, let's build up the concepts from the fundamentals,
actually describe how things should work in words. This is a language model, right? So let's like
teach it with language about, you know, the constraints, about how dispatch works, about you can't
dispatch, you know, at a power level greater than your maximum power level, all these kinds of
things, like walking it through the, how you would explain the math, basically, and trying to keep
the scope of what it was doing kind of tight. And yeah, it seemed to produce better, but also
kind of like more understandable results that you could actually kind of then like dig into and
start to debug, et cetera.
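To make the problem being described here concrete, the following is a minimal sketch of a single-day dispatch optimizer. It is not the code Duncan produced with ChatGPT. It assumes hourly prices, a battery that moves one full hour of energy at a time, an assumed round-trip efficiency of 0.85 applied entirely on discharge, an empty battery at the start of the day, and no cycle limit. The names `Battery` and `optimize_day` are hypothetical, and it uses dynamic programming over state of charge rather than any particular optimization package.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Battery:
    power_mw: float = 25.0         # from the challenge
    duration_h: float = 4.0        # from the challenge
    round_trip_eff: float = 0.85   # assumed value, not stated in the episode
    initial_soc_mwh: float = 0.0   # assumed: the day starts empty

    @property
    def capacity_mwh(self) -> float:
        return self.power_mw * self.duration_h


def optimize_day(prices, battery=Battery()):
    """Return (revenue, schedule) for a list of hourly prices in $/MWh.

    schedule[t] is grid-side MW: positive = discharge, negative = charge.
    State of charge moves in blocks of one hour at full power, so the power
    limit and the energy limits hold by construction, and each hour is
    exactly one of charge, discharge, or idle.
    """
    step_mwh = battery.power_mw * 1.0                        # one hour at full power
    n_levels = int(round(battery.capacity_mwh / step_mwh))   # 4 blocks for 25 MW / 4 h
    start = int(round(battery.initial_soc_mwh / step_mwh))
    hours = len(prices)

    # value[t][s]: best revenue from hour t onward when holding s blocks
    value = [[0.0] * (n_levels + 1) for _ in range(hours + 1)]
    choice = [[0] * (n_levels + 1) for _ in range(hours)]

    for t in range(hours - 1, -1, -1):
        for s in range(n_levels + 1):
            best, move = value[t + 1][s], 0                  # idle
            if s < n_levels:                                 # charge one block
                cand = -prices[t] * step_mwh + value[t + 1][s + 1]
                if cand > best:
                    best, move = cand, +1
            if s > 0:                                        # discharge one block
                sold = step_mwh * battery.round_trip_eff     # losses taken here
                cand = prices[t] * sold + value[t + 1][s - 1]
                if cand > best:
                    best, move = cand, -1
            value[t][s], choice[t][s] = best, move

    # Walk forward from the starting state to recover the schedule.
    schedule, s = [], start
    for t in range(hours):
        move = choice[t][s]
        if move == +1:
            schedule.append(-battery.power_mw)
        elif move == -1:
            schedule.append(battery.power_mw * battery.round_trip_eff)
        else:
            schedule.append(0.0)
        s += move
    return value[0][start], schedule


if __name__ == "__main__":
    # Made-up prices for illustration only: cheap overnight, expensive evening.
    made_up_prices = [30, 25, 20, 20, 25, 40, 60, 55, 35, 20, 10, 5,
                      5, 10, 25, 50, 80, 120, 150, 110, 70, 50, 40, 35]
    revenue, schedule = optimize_day(made_up_prices)
    print(round(revenue, 2), schedule)
```

Because every rule is written out explicitly, anything left unstated (a cycle limit, a minimum state of charge) is simply not enforced, which is the same behavior described in the conversation.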
Do you just talk through like as you're teaching it, are you requesting bits of Python code
each time to reflect that?
Are you just saying, here's a thing that I think you need to know and it would say,
yes, I understand.
And then you say, you build upon that.
Like what was the actual process like?
Yeah, I mean, in this second approach, yeah, the first message was not just little bits.
It was like one, two, three, four, five, six, seven, eight paragraphs, kind of like describing the bulk of
what's going on here. And I actually tried to give some examples too with some like basic just arithmetic
to demonstrate what's at work and let it interpret how to turn arithmetic into algebra. Yeah, I tried to really
sort of describe how it all should work up front. One other thought I've had about the world. I'm
curious your take on this too, Seyed: are software engineers safe, or are software
engineers just going to have to become like essay writers? Like, is prose the new coding?
Yeah. No, I've been thinking through this. And the more I think about it,
at the end of the day, we're flying a plane, and we're souping up the autopilot system. But I think
we still need pilots. And, you know, a pilot with an autopilot system is a different kind of a pilot.
So it's mostly about, I wouldn't say essay writing, but prompting and code review and QA.
But we also need to consider, I just want to bring this up front, that we're thinking about a simple problem here, and we're thinking about a simple script.
But in reality, the machines that we've built in terms of computer software and programs are very large.
We haven't included that.
Even if ChatGPT is producing something that's flawless, embedding that into your entire software ecosystem is still a bit of a
challenge. And I don't personally feel comfortable uploading my entire software architecture to
ChatGPT and asking it to analyze and incorporate this piece of code. So I think when it comes to
developing software, the architecture, the components are a big piece of the work. And we're just
zooming in on a very small problem. Okay. So, Duncan, you do this second attempt that is
building up the story a little bit more for the model. It is defining the parameters a little bit
more clearly. What are you getting as a result of that as you're going along and how is it adapting?
Yeah. So that first result I get from this approach, you know, I give it my like eight-paragraph
sort of description of optimizing battery scheduling, is definitely more
specific than what I got before, right? It actually, you know, picks certain Python packages to use,
shows you how to import them and name them. It's defining the variables. It's actually producing,
whether or not it works, a script that is sort of fully developed, right? I think you relatively
quickly, you know, just like in the buckshot approach, it's like, okay, cool, I'm going to
copy and paste the script into a Jupyter notebook and see what happens, right? And
you get errors, right? It's not like ready to roll. And some of the errors are like obvious.
Others are challenging to deal with. But anyway, yeah, this approach at least produces something
that is like on the path to usable as opposed to just like a sketch of what you should do.
So looking at that version of the code then, Seyed, is what you're seeing like a mess as you're
reading the code? Or is it like, oh, this is a script
that needs some debugging, but like the bones make sense.
Yeah, I think the code is pretty clear, and it's, you know, almost close to reality.
But I also want to go back to the fact that it's clear that somebody prompted it to produce this piece first,
which is a big head start.
But when I look at the code, pretty clear.
Can you just explain that a little bit more?
What is clear about it that somebody prompted to develop this piece first?
What do you mean by that?
Yeah, so essentially, if you're, let's say if you're building a building, well, you've got to think about the foundation first.
Then the next step of the process is, you know, putting steel in the ground.
So you just don't go and work on cosmetics day one, right?
Same thing for coding.
You really need a cohesive theme of how you're defining your parameters, your variables, the packages you're downloading.
the data scripts that you're using.
So it's followed an actually very clear process in terms of what the objectives are.
So, you know, I want to say kudos to Duncan or kudos to ChatGPT or kudos to both.
But at the end of the day, it's pretty clear.
Okay.
And so Duncan, you mentioned that, okay, so there's a bunch of errors, though, in this code.
It's sort of the bones work, but the errors are there.
Can you just give an example of the type of thing that's like broken in this version
of the code? Yeah. So, right, if we're taking the approach of I am going to fully inform
ChatGPT of everything and it's just going to write the script for me, anything I miss,
it missed, right? So, for example, I forgot to inform it that a battery cannot simultaneously charge
and discharge. So it did not apply that constraint, right? Which is an important point, right? Like,
you have to say things that, like, seem obvious to us humans, but, but not necessarily obvious to a
model. And I'm sure a programmer could make the same mistake and then realize it, you know, run some
results and see the results and go, oh, yeah, that's a constraint I have to apply here.
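The missing constraint described here tends to show up when charge and discharge are modeled as two separate series, since nothing stops both from being non-zero in the same hour unless that rule is stated. A minimal sketch of the kind of after-the-fact check that would surface it follows; the function name `simultaneous_hours` and the sample numbers are hypothetical, not taken from Duncan's script.

```python
def simultaneous_hours(charge_mw, discharge_mw, tol=1e-6):
    """Return the hour indices where a schedule both charges and discharges."""
    if len(charge_mw) != len(discharge_mw):
        raise ValueError("charge and discharge series must be the same length")
    return [
        hour
        for hour, (c, d) in enumerate(zip(charge_mw, discharge_mw))
        if c > tol and d > tol
    ]


# Made-up schedule: hour 2 charges and discharges 10 MW at the same time.
charge = [25.0, 25.0, 10.0, 0.0]
discharge = [0.0, 0.0, 10.0, 25.0]
print(simultaneous_hours(charge, discharge))  # prints [2]
```

Inside an optimizer, the same rule is commonly enforced up front with one binary variable per hour that allows either charging or discharging but not both, which turns the problem into a mixed-integer program.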
If I can add something real quick on that point, you're right. Batteries can't charge and
discharge at the same time. And it's rather obvious to us. But the same applies if we bring a software
engineer out of, as an example, Uber, where they've never heard about energy problems and
can't distinguish between megawatts and megawatt-hours.
It's not that they're not smart.
It's just that they haven't been exposed to the problem.
They're going to miss that too.
And so I kind of see it as the same.
A very similar or another sort of debugging exercise I had to do is I told it that it had
certain cycling limitations, right?
Generally, there's some amount of usage of the battery that's sort of like permissible,
either by the warranty or just your own view on how it will degrade.
And so at this point, we had applied a one cycle per day limitation.
And I said, yeah, a cycle is when you fully charge it and fully discharge it.
Turns out mathematically defining that is a little funky.
And it really didn't understand what I meant by that.
And its initial response basically had the battery doing half a cycle per day
because of the way it interpreted how I informed what a cycle was.
And I actually didn't even realize it until I later added code to print charts that easily showed like the SOC over a day.
And I went, oh, wait, there's no way that's optimal.
And kind of thought about it a little bit and had to go back and say, ah, wait, let's figure out how to define cycle.
And it actually took a couple attempts because it's very easy to say, but a little funny in actual mathematical terms.
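The episode does not show how the cycle limit was first written, so the following is only one plausible illustration of how a reasonable-sounding definition can end up allowing half a cycle per day. It assumes a 100 MWh battery (25 MW for four hours) and ignores losses for clarity; the function names are hypothetical.

```python
CAPACITY_MWH = 100.0  # 25 MW x 4 h, from the challenge


def cycles_by_discharge(discharge_mwh, capacity_mwh=CAPACITY_MWH):
    """Full-cycle equivalents: energy discharged divided by capacity."""
    return sum(discharge_mwh) / capacity_mwh


def cycles_by_throughput(charge_mwh, discharge_mwh, capacity_mwh=CAPACITY_MWH):
    """Throughput-based count: charge plus discharge over twice the capacity."""
    return (sum(charge_mwh) + sum(discharge_mwh)) / (2.0 * capacity_mwh)


# One full charge followed by one full discharge.
charge = [25.0] * 4 + [0.0] * 4
discharge = [0.0] * 4 + [25.0] * 4

full = cycles_by_discharge(discharge)
also_full = cycles_by_throughput(charge, discharge)

# A mis-specified count: dividing total throughput by capacity (not 2x capacity)
# scores the same day as two cycles, so a limit of "1" permits only half of it.
misdefined = (sum(charge) + sum(discharge)) / CAPACITY_MWH

print(full, also_full, misdefined)  # prints 1.0 1.0 2.0
```

Either of the first two definitions counts the day as one cycle. The third looks similar on paper but would hold an optimizer to 50 MWh in and 50 MWh out, which matches the half-cycle symptom that only became visible once the state of charge was plotted.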
That's human-centric QA.
That's what a product person does.
I mean, because at the end of the day, the code isn't producing any bugs.
It's producing results.
Just bad ones.
The results, though, have to make intuitive sense,
and that's why somebody still needs to be looking at the end game.
Yeah, and that similarly strikes me as a problem,
defining a cycle, that would be
a challenge for any software engineer,
as it would for the model.
So I think what we're finding is, like, the model doesn't solve,
the model can't solve things that your software,
your Uber software engineer wouldn't know how to solve.
The model's not smarter than the software engineer,
but it is building code that works as prompted via,
you know, text from a person who doesn't know how to code, really.
Absolutely.
Yeah, and I actually tried to, I tried to assume it was smart.
So, like, when I noticed this was an issue, not an error, but a bug,
I just described to it.
I was like, hey, it's only doing half a cycle,
and, like, clearly that's not
optimal, like we must be defining a cycle wrong, and just tried to let it update the code.
And it took me down like a very bizarre path that didn't make any sense.
And I had to kind of be like, okay, let's go back to three versions ago.
Like here's where I think the issue is.
And basically like tell it the answer, right?
It still wrote out the new updated code that worked with the proper syntax and all of that.
But like I kind of had to tell it exactly what to do.
because otherwise it was just like going down a very just like unimportant path that was not going to lead to success.
And I found that repeated, like, often.
In debugging, if you kind of just keep prompting it, like, fix this, fix this.
Here's the error.
Like here's the message I'm getting back in error.
And you just kind of like assume it's going to be smart.
You can wind up in like rabbit holes that just don't make any sense at all.
Yeah.
Okay.
So at this point, you're debugging.
some semi-working code that's focused, as I understand it, still on basically optimizing
battery charge and discharge for a single day. And then the next challenge is basically to say,
okay, well, we're not trying to just operate this battery for a single 24-hour period.
We're trying to optimize this battery every day. So now you have to make it work over and over and
over again. And this actually is a good example of how, regardless of how good your ChatGPT software
engineer or human software engineer is, I might have misinterpreted Seyed's prompt a little bit.
Because now, I think he might have been looking for something that you run every day to
produce a new schedule. I was more in like pre-construction project analyst mode where I'm like,
well, what's this going to look like for a year? Like, what are my revenues going to be for this battery
to justify its capex? So there's still, like, even the communication and relationship between,
in this example, my boss and me has to be good.
Yeah, exactly.
Everything can break down downstream of that if there's not a good understanding of the problem.
But I can tell you this.
If you can solve it for a year over and over again, solving it for a day repeatedly isn't
much of a challenge.
So I guess the underlying nature of the problem is the same.
If the battery had enough, you know, energy to kind of cycle for a couple of days,
if we're thinking about long-duration storage, distinguishing between a full-year analysis versus a daily analysis is important.
But if your battery has four hours, it really doesn't matter.
So yeah, the immediate problem you run into when you want to do more than a day and you're just a guy with ChatGPT is, you know, I can just fill out 24 hours of made-up LMPs, right?
But if I wanted to do three months, where am I getting that data?
LMPs are locational marginal prices for anybody who's not deep in wholesale power market world.
Or not even three months.
How do you get it for next day?
So it wasn't clear to me.
I don't think you actually came up with the forecasting, right?
You're just using historical data, which is, you know, a bit of a problem.
Never dealt with forecasts.
Yeah, just didn't get there.
I think it totally would be doable.
But yeah, just never got there.
And curious, when you actually asked, did you ask, can you forecast?
And what was the response?
In the buckshot approach, the original approach, I did ask it to forecast.
And it talked about, you know, regression and stuff.
Yeah.
And, you know, I talked about how the forecast should incorporate historical load,
historical temperature, like solar irradiance, like this and that.
But we just never really got there in this second approach where I was trying to like break the problem down into pieces.
I got as far as connecting it to a service called Grid Status,
which can give you a bunch of historical data in an easy way via API,
and got as far as running the optimizer on a bunch of historical data,
but then not sticking the forecast step in between where,
rather than running it on historical data,
you're using the historical data and other data to forecast,
and then running it on your forecast.
Got it. Just didn't get there.
So, yeah, I mean, I would characterize that as a, you know, major area of development and improvement that's needed.
But still, I mean, just connecting to the right APIs and coalescing the data and creating that repository to do backcasting and running out on historicals, it's still a great step forward.
It strikes me that, you know, if you dedicated more time and tried to do the forecasting bit, it feels fairly straightforward to me, given what we know about what ChatGPT can do, right? You would have to define it. Like, I don't think you could just say, forecast this given these parameters. Or you could, but that would be the buckshot approach to forecasting, and it'd probably be kind of a mess. I think you'd have to be more specific, right? It suggested regression. Like, you could say, okay, please run this multivariate regression using these as input parameters, you know, and give me a forecast for day-ahead LMPs, and then plug that into the model. My guess is, I mean, tell me if either of you feel different, it probably could do that. I don't know if it would have the world's best forecast, though, right? Like, unless you're defining it on an algorithm you've somehow developed outside, it's just going to give you a simplified version of a forecast based on however you tell it to do the forecast.
Yeah, I mean, I agree with you. But at the end of the day, the quality of a forecast drives the quality of your decisions. And if the idea is maximize my position in the market, you don't care about forecasts and optimization. The end result is maximize my position in the market.
And if you're using like a wonky forecast, then, you know, pardon me, but garbage in, garbage out.
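The forecast step discussed here (fit a regression on historical inputs, then hand the predicted prices to the optimizer) can be sketched minimally as below. This is an illustration under loud assumptions, not something Duncan or ChatGPT built: it uses a single input (load) rather than a multivariate regression, the function names `fit_line` and `forecast_prices` are hypothetical, and every number is made up.

```python
# Illustrative only: a one-variable least-squares fit of price against load.
# All numbers are made up; a real forecaster would use many more inputs
# (temperature, solar irradiance, and so on).

def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def forecast_prices(slope, intercept, load_forecast):
    """Apply the fitted line to a forecast of load to get forecast prices."""
    return [slope * load + intercept for load in load_forecast]


# Hypothetical history: system load (GW) and day-ahead LMP ($/MWh).
hist_load = [20.0, 25.0, 30.0, 35.0]
hist_lmp = [45.0, 55.0, 65.0, 75.0]

slope, intercept = fit_line(hist_load, hist_lmp)

# Hypothetical load forecast for tomorrow; the result would feed the optimizer
# in place of historical prices.
tomorrow_load = [22.0, 28.0, 33.0]
tomorrow_lmp = forecast_prices(slope, intercept, tomorrow_load)
```

A forecaster this simple would probably be exactly the wonky forecast being warned about; the sketch only shows where the step sits between the historical data and the optimizer.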
Yeah.
I think it definitely could have built a forecasting, you know, approach.
Part of this too is I am less familiar with that.
Like I'm pretty familiar with optimization problems, but like regression forecast, like, that's just not a thing I know much about.
So I didn't know whether, should we try to build this up from scratch, you know, connect it to all these different data sources, use some kind of, you know, Python regression library.
Or is there like just some good open source energy price forecasting out there?
Like I didn't know how to kind of point the weapon, you know?
Like, so I opted for skipping that and just moving on to more fun stuff.
And that's the part where large language models' sisters come into play,
and those are like deep learning models or reinforcement learning, where you're trying to predict uncertainty and randomness in the future.
But I don't think that LLMs are equipped for, you know, building a whole neural net from scratch and training it and doing some form of supervised or unsupervised learning for forecasting.
So that's another area of, you know, improvement.
But this thing has been a six-month-old endeavor, ChatGPT.
So we're just scratching the surface.
Right.
And I knew I wouldn't even know where to like get the right data, right?
Like I could get as far as like I knew I could DM Max from Grid Status to get an API key.
But like where should I get the weather data to build a good forecaster?
Where should I get the like I just, you know what I mean?
All that stuff is I'm not familiar with.
So yeah.
Yeah.
It just seemed actually more challenging than building the optimizer.
Right.
So okay.
So at this point in the endeavor, like what do you have?
What is this thing?
What is the code that ChatGPT has given you able to do?
Yeah, so after that, we got any period of time working beyond 24 hours,
just whatever amount of prices you fed it,
whether those be historical, forecasted, whatever, it would do it for that long.
Yeah, we got it hooked up to Grid Status so that it could just pull those prices.
And in theory, you could pull for other nodes very easily too.
I kind of hard-coded in this particular trading hub, but you could put any of them in.
Oh, then once we finally got the optimizer, like really working, you know, every little edge case that was going wrong was solved.
You know, another big challenge is what do you do with it, right?
Just the fact that the optimizer worked is one thing, but actually parsing all of that, that schedule you've created, and creating outputs that are useful, and charts, is its own whole finicky journey.
So then I started digging into all of that, and that is challenging. And that actually took longer because that's all just about knowing Plotly
and all of its functions and all the syntax to use. And I don't know any of that. So that was its own
long journey of annoyance.
Can I take a side dalliance for one second? I just noticed as you were
describing what you were doing, you're saying, we got it working for multiple days. We did this,
we did that. I think when you say we, you are referring to you and ChatGPT.
Is that right?
Did you feel like you were like partnered with the model building this together?
Oh yeah, for sure.
It felt very similar to like trading Slack messages back and forth with like an analyst on my team.
Now, it wasn't as fun, right?
But like it, you know, but it's, yeah, it's the person who's working on this with me.
It's just super interesting.
You know, it's just ChatGPT.
Yeah.
What you're describing at the end there is basically that part of the prompt. Part of what
Seyed required, one of the acceptance criteria, was, I'll quote it, "the system shall generate and print
the optimized schedule and associated forecast market settlements."
So part of the mandate here was not to just be able to build an optimizer, but have it
basically tell you here is what you should do the next day and print that all out for you.
And you were adding on to that having it actually chart out charge and discharge over the
course of a day, which I think was smart to do because it sounds
like you found some bugs by just seeing those charts.
Yeah, it was really, the chart was just for debugging purposes, because you can intuitively
figure out there are things going wrong based on looking at this, such as, like I was saying,
it was only doing half a cycle a day before when I knew it probably could be doing more.
Or even when we fixed that, it was like, wait, why is it doing 87% of a cycle?
Oh, because we didn't incorporate charge and discharge efficiency into the cycle definition
perfectly. So yeah, the charts were kind of like for that purpose. But yeah, you had to make a nice
table that, you know, whatever shows like you're charging and discharging and total revenue.
And then you realize you have to, right, since the period of time this analysis is going for,
whether it's one day or three months or a year, is variable like what's the right table to show?
So you have to kind of build in like different output scenarios depending on the amount of time being
analyzed.
So just that was the most, for me, that was the most frustrating part.
Just like getting all this like stuff to like work to make it look nice and be functional.
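The two output ideas described here, a table of charging cost, discharging revenue and net revenue, and a layout that changes with the length of the analysis, might be sketched as follows. This is a hedged illustration, not the notebook's code: `pick_chunk_hours` and `summarize` are hypothetical names, the thresholds are arbitrary, and the prices and schedule are made up.

```python
def pick_chunk_hours(n_hours):
    """Choose a reporting chunk size from the horizon length (arbitrary thresholds)."""
    if n_hours <= 24:
        return 1          # hourly rows for a single day
    if n_hours <= 24 * 31:
        return 24         # daily rows up to about a month
    return 24 * 7         # weekly rows beyond that


def summarize(prices, charge_mwh, discharge_mwh, chunk_hours):
    """Build one summary row per chunk of hourly intervals."""
    rows = []
    for start in range(0, len(prices), chunk_hours):
        end = min(start + chunk_hours, len(prices))
        cost = sum(prices[t] * charge_mwh[t] for t in range(start, end))
        revenue = sum(prices[t] * discharge_mwh[t] for t in range(start, end))
        rows.append({
            "start_hour": start,
            "end_hour": end,
            "charging_cost": cost,
            "discharging_revenue": revenue,
            "net_revenue": revenue - cost,
        })
    return rows


# Made-up two-day example: cheap mornings ($20/MWh), expensive evenings ($80/MWh).
hours = 48
prices = [20.0 if h % 24 < 12 else 80.0 for h in range(hours)]
charge_mwh = [10.0 if h % 24 == 3 else 0.0 for h in range(hours)]
discharge_mwh = [9.0 if h % 24 == 18 else 0.0 for h in range(hours)]

rows = summarize(prices, charge_mwh, discharge_mwh, pick_chunk_hours(hours))
for row in rows:
    print(row)
```

With a 48-hour horizon the sketch falls into the daily layout, so it prints two rows, one per day.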
But long story short, you were able to do it.
Basically, at the end of the day, what we've got here now, I think this is the end of your journey, at least so far, is a working optimizer that does schedule and print a dispatch algorithm for a battery a day ahead, given at least historical market prices, not necessarily forecasted market prices. Is that basically right?
Yeah, and importantly, it ignores ancillary markets as well. It's just doing energy.
Right. So this is a key point. Seyed, in your initial prompt that you laid out, the system has to accurately forecast both energy prices and ancillary services prices, ancillary services being a more complicated market than energy. And Duncan, you didn't get to the ancillary services stuff.
Yeah, not only did I not get to forecasting it, I didn't get to incorporating it into the optimizer either.
And why not?
There's a little bit of just time and energy there, similar to how I didn't get around to forecasting. But similar to not getting around to forecasting, it was also because I didn't have a perfect understanding of how that really works. And therefore, I knew it would be quite difficult to prompt it all the way, right? In this kind of bottom-up, explain-everything approach, I didn't have a good sense of how you really marry those two markets, you know, day-ahead energy versus ancillaries, knowing that ancillary services are very high-frequency types of things, and how that comports with pricing and dispatch.
You know, I'm doing an hourly interval model here.
Like frequency regulation is much more, you know, high fidelity than that.
How do I really deal with that?
I just, I didn't have a good sense of how to approach the problem.
And as I guess is apparent from other things that have come up, I just decided to skip it.
Seyed, did you know when you were setting up the prompt that energy would be much easier than
ancillary services would be?
Yeah, energy is absolutely much easier.
And yes, ancillary services, they run on a second-by-second basis, or there might be just reserve capacity.
But at the end of the day, they are traded on an hourly block or 15-minute block in real time.
So I would say, if I think about the initial problem, which takes a year for a souped-up team to build and productize, I originally reduced the complexity to 50% by ignoring all of the bidding stuff.
And from the 50% that was remaining,
the ancillary services are probably like 40% of the complexity.
So essentially, take the problem that a user ultimately needs
and reduce its complexity by 90%.
And this is what's being generated.
But that being said, I'm super excited about it.
And at the right time, we'll talk about it.
Maybe we'll look at the code together.
But from a complexity perspective,
it's pretty like scaled down.
So I think that was the time.
I think we should look at the code, the final code.
And Seyed, I think you should give us, I'll say we.
We deserve, we being Duncan, ChatGPT, and me for sitting here,
a grade on performance relative to the prompt.
And then I want to talk a little bit about,
you sort of already started doing this, Seyed, laying out how big a deal this is
in terms of time saved and capabilities that
it offers. But let's look at the code first. Let's see it.
Okay, let's do it. I haven't been in school for about nine years, so I'm a little nervous right now.
I don't think ChatGPT has been there either.
Here we go. So just for those who are listening and not watching, what are we looking at here?
So this is a Python notebook. I happen to use Google Colaboratory, I think it's called, as opposed to
like Jupyter notebooks, which is pretty common. But either way, it's just a little web-based
environment to edit and run code.
Okay.
And walk us through it.
Sure.
So basically, it's broken down into four parts.
Initially, we have a part called define battery parameters, which is a nice little
form for you to enter, you know, how big is the battery?
What's its, you know, power, the maximum power output, you know, charging efficiency,
discharging efficiency, cycle limits.
You know, basically, what are your inputs, right?
Your user inputs, right?
So that's first.
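As a rough picture of what that first section holds, the inputs listed above could be grouped like this. This is a minimal sketch, not the notebook's actual code: the class name, field names, validation rules and example values are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BatteryParams:
    """User inputs for the battery (hypothetical field names)."""
    energy_capacity_mwh: float    # how big the battery is
    max_power_mw: float           # maximum charge/discharge power
    charge_efficiency: float      # fraction of grid energy that reaches the battery
    discharge_efficiency: float   # fraction of battery energy that reaches the grid
    max_cycles_per_day: float     # cycle limit

    def __post_init__(self):
        # Basic sanity checks so a typo fails here rather than deep in the optimizer.
        if self.energy_capacity_mwh <= 0 or self.max_power_mw <= 0:
            raise ValueError("capacity and power must be positive")
        for eff in (self.charge_efficiency, self.discharge_efficiency):
            if not 0 < eff <= 1:
                raise ValueError("efficiencies must be in (0, 1]")
        if self.max_cycles_per_day <= 0:
            raise ValueError("cycle limit must be positive")

    @property
    def duration_hours(self):
        """Hours of discharge at full power, ignoring efficiency."""
        return self.energy_capacity_mwh / self.max_power_mw


# Made-up example: a four-hour battery.
battery = BatteryParams(
    energy_capacity_mwh=100.0,
    max_power_mw=25.0,
    charge_efficiency=0.95,
    discharge_efficiency=0.95,
    max_cycles_per_day=1.0,
)
print(battery.duration_hours)
```

Validating inputs at the point of entry is one design choice among several; a form in a notebook could just as well leave the values as plain variables.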
Next, it does the day ahead energy price retrieval.
So this is where it talks to Grid Status, says, I want NP-15 trading hub, this many months, etc.
We could have made a form here so that you could change the trading hub, change the date range, etc.
I just never really got there.
Third is we have the dispatch optimization then, right?
We have all of our battery parameters,
we have all of our prices.
Let's define the problem, the math, and the constraints,
and run the optimizer against it.
And then last we had just parsing all of those results and creating something that can be displayed.
We can show the actual code if you want.
Yeah, let's do it.
Is that useful?
Okay.
Nothing proprietary here.
First section is obviously super simple, right?
We're just defining these battery parameters as variables, giving them inputs, and then
this sort of thing right here is just what makes it a form. Super easy.
Here, you won't believe it, just this simple task, the magic of fat fingers and, you know, typing stuff incorrectly and later on
the debugging, that's a whole process that a human will deal with, and we've just eliminated
all of that with automation here. So I just wanted to call that out. Sometimes simple problems
are hard to deal with.
And you're saying in this case it did it right?
Yeah, it's pretty clear, because it was prompted right.
Yeah. And it does kind of just simple, clever things, like SOC underscore
max. That's not a variable name I came up with. I, in my detailed description at one point, said,
like, every battery will have a maximum state of charge that is defined by blah, blah, blah,
and it decided SOC underscore max was a good variable name for that, which is like...
Exactly what you would want it to be. Yeah, totally. Yeah, you can totally feel that. That's how a human
would label their variables.
Yeah.
Then we got into the, yeah, this day ahead price retrieval.
You know, I had to go on the Grid Status website or on their GitHub and like find,
I think what I actually did is found an example and just pasted the example and said,
okay, make it work for what I want, which was pretty clever.
It seemed to do a good job there.
I also want to give a shout-out to Max at Grid Status for providing this, because I don't know, if he
didn't do Grid Status,
how would you do this? It's not easy, especially connecting to OASIS in CAISO.
Sorry to my CAISO colleagues, but it's just a nightmare. So kudos to Max.
Yeah, think about it. I'd have to download all this stuff from CAISO, or I could try to connect,
but I knew that was a dead end for me. So I would have probably downloaded all this stuff,
saved the CSV, like uploaded the CSV, which has its own sort of little procedure and
syntax and stuff. This made it vastly, vastly
simpler to just sort of pull some data in a structured format.
So if I want to just maybe comment section by section, and let's just use a 10
out of 10, kind of a 10-point system, I would definitely give this 10 out of 10 for retrieving the
data using this code.
So we didn't do the 10 out of 10 on that first section.
So I think this will be good because we'll give it a total score.
So on the system parameters also 10 out of 10?
10 out of 10.
I would do 10 out of 10.
I guess maybe there's a caveat to the first section, if you scroll up, Duncan, as we are defining the parameters.
The only limitation there is it's just ignoring ancillary services.
So if I was the manager of a team and I looked at it, I would say, oh, this is great, but what happened to ancillary services?
So if I look at it ignoring ancillary services, if the task was just focus on energy, a 10 out of 10.
But the fact that it's missing ancillary services, that's just a bit of a blow.
If I want to consider that, then I would say, okay, it's two or three out of 10.
But the style of the code and the clarity is definitely very superior.
Okay, so we're sort of 10 out of 10 or 2 out of 10, depending on how you look at it, on the first one,
10 out of 10 on the day ahead price retrieval.
Let's go on to section 3.
Let me just ask one really quick thing,
Seyed, is this well commented?
Like, it obviously works and maybe it's written in a, you know,
mechanically good way.
Is it like well structured and commented and clear and all of that?
I would give it the 90th percentile, clarity- and comment-wise,
because you don't want to be overly loaded on comments.
This is not a Confluence page at the end of the day,
this is code. The product manager's job is to really do a lot of the documentation. But in terms of
the commenting structure, clarity of the code, definitely a 10 out of 10. So far so good.
So far so good. One thing that was annoying in this stage here was it would continually forget
to install the right packages. And so if you just copy paste the new code, you'd have to remember like,
oh, wait, what did it install? So I think I ended up with some vestigial packages at some point
that I didn't need.
And like that, managing all that was annoying.
Yeah, right, right.
And we'll get back to that, too, I think, in the context of like, would you and how would
you actually use this in the context of a broader, like a real world system that you're
trying to build.
But let's get through the rest of the code first.
All right.
Cool.
So the third section dispatch optimization here.
So, yeah, this is kind of the, I guess, like mechanically, the meat of it, right?
and there's a little more going on here than the previous sections, right?
I mean, we can go in any direction here, I guess.
Where do you want to start?
Yeah, I would say, like, my initial reaction would be, well, holy shit.
Like, this person not only did backend stuff, but it's writing optimization code,
which typically is done by two different individuals.
Software engineers do a lot of the backend, data engineering, post-processing of results,
putting it in the architecture.
And that's why you go and hire optimization engineers,
people with OR backgrounds and math models,
who can write optimization-level software.
Now, I'm looking at this, and I'm realizing,
who's this unicorn that not only does backend stuff,
but it's also doing optimization stuff.
And literally, those are very hard people to find.
So this is where I get excited.
Now, realizing we're not doing ancillary services,
the problem, if it's just charge and discharge, you can probably do it in Excel.
But defining optimization objectives, decision variables, constraints, how you're solving it,
importing the right libraries,
it's the job of an optimization engineer.
So I'm super excited for this.
And I think part of why, yeah, the optimization engineer is scarce, right,
is because they kind of need to understand, like, the battery system, like writing the objective
function, just what's actually happening here. You kind of have to know what's going on.
And it seemed to do a good job sort of learning that from my just very human description of how it all
works. So if I want to comment on this real quick, the way this optimization model is written,
it's textbook, classic academic style of doing things. You create
parameters, thinking about number of days, number of hours. It's defining total cycle limits.
The only thing I would change here is distinguishing between parameters and variables.
Typically, the customary format is parameters are all capital letters and decision variables
are all lowercase. So it hasn't done that, and that may create a little bit of confusion.
So I'm kind of slightly disappointed there, but who cares?
So then it goes to variables.
It's defining linear variables with a lower bound of zero and a maximum at, you know, the charge and discharge limit.
It's properly defined a string for your state of charge.
And then it's going into defining a problem.
The other downside here: it's defining an LP problem. LP is a linear program.
But as you are growing, as you are doing the full-blown thing, you always want to think
about binary decisions: what state is this battery in?
And that's going to be an integer program, which it hasn't done.
But again, if the problem is just ignore ancillaries, this is good enough.
Then the next step of the process, defining your objective function: you know, a linear sum, it's subtracting charge and discharge decisions, it's summing it up correctly.
Then your constraints, these are all perfectly linear constraints, nothing nonlinear here.
I like how it's defining the state of charge constraint.
Like, your actions today, they're going to impact the next interval.
So it's done that correctly.
It's, you know, bounding the charge and discharge.
It's incorporating the cycle limit.
And then it's solving the problem.
So definitely a 10 out of 10.
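For readers without the notebook in front of them, the model being walked through has roughly the following shape. This is a generic energy-only formulation written under assumptions, not a transcription of Duncan's code. Every symbol is introduced here for illustration: $p_t$ is the price in interval $t$, $c_t$ and $d_t$ are grid-side charge and discharge power, $s_t$ is state of charge with $s_0$ given, $\Delta t$ is the interval length, $\eta_c$ and $\eta_d$ are the charge and discharge efficiencies, $P^{\max}$ and $S^{\max}$ are the power and state-of-charge limits, and $N^{\text{cyc}}$ is the number of cycles allowed over the horizon. The cycle constraint uses one possible convention (energy withdrawn from the battery).

```latex
\begin{aligned}
\max_{c_t,\,d_t,\,s_t}\quad & \sum_{t=1}^{T} p_t\,(d_t - c_t)\,\Delta t \\
\text{s.t.}\quad
& s_t = s_{t-1} + \eta_c\,c_t\,\Delta t - \frac{d_t\,\Delta t}{\eta_d}, && t = 1,\dots,T \\
& 0 \le c_t \le P^{\max},\quad 0 \le d_t \le P^{\max}, && t = 1,\dots,T \\
& 0 \le s_t \le S^{\max}, && t = 1,\dots,T \\
& \sum_{t=1}^{T} \frac{d_t\,\Delta t}{\eta_d} \le N^{\text{cyc}}\,S^{\max}
\end{aligned}
```

Everything here is linear in the decision variables, which is why a plain LP solver suffices. Adding a binary variable per interval to represent the battery's state, as suggested for the full problem, would turn it into a mixed-integer program.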
One thing I found in the constraints there,
I didn't do a good job describing what charge and discharge efficiency
really means, right? Like the energy that ends up in the battery versus what was trying to get into the
battery, or the inverse. And so there were a bunch of little examples,
whether it's defining how SOC is calculated or what a cycle is, where I had to continually
double check, like, is efficiency being treated the right way here? Sort of like what the boundary
conditions of the battery are and what certain things are referring to.
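The kind of double check described here can be made concrete with a small simulation. The sketch below is an assumption-laden illustration, not the notebook's code: it treats charge and discharge as grid-side quantities, uses hypothetical function names, and runs on made-up numbers.

```python
def simulate_soc(charge_mw, discharge_mw, soc_start_mwh,
                 charge_eff, discharge_eff, hours_per_interval=1.0):
    """Track state of charge when charge/discharge are measured at the grid."""
    soc = soc_start_mwh
    trajectory = []
    for c, d in zip(charge_mw, discharge_mw):
        # Less energy reaches the battery than was drawn from the grid.
        soc += c * hours_per_interval * charge_eff
        # More energy leaves the battery than is delivered to the grid.
        soc -= d * hours_per_interval / discharge_eff
        trajectory.append(soc)
    return trajectory


def cycles(discharge_mw, capacity_mwh, discharge_eff=1.0, hours_per_interval=1.0):
    """Cycle count as discharged energy over capacity.

    Pass the real discharge efficiency to count energy leaving the battery;
    leave it at 1.0 to count energy delivered to the grid.
    """
    energy = sum(d * hours_per_interval / discharge_eff for d in discharge_mw)
    return energy / capacity_mwh


# Made-up example: an 18 MWh battery with 90% efficiency each way.
charge = [10.0, 10.0, 0.0, 0.0]
discharge = [0.0, 0.0, 8.1, 8.1]
soc = simulate_soc(charge, discharge, 0.0, 0.9, 0.9)

battery_side_cycles = cycles(discharge, 18.0, discharge_eff=0.9)
grid_side_cycles = cycles(discharge, 18.0)
print(soc, battery_side_cycles, grid_side_cycles)
```

The same schedule fills and empties the battery once, yet counts as a full cycle measured at the battery and only 0.9 of a cycle measured at the grid, which is the sort of discrepancy a cycle-limit constraint can silently inherit.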
Yeah, 100%.
Wow.
All right.
So we're at, you know, if you ignore ancillary services, which we've decided to do generally
at this point, three 10 out of 10s, which is pretty amazing, out of four sections.
Absolutely.
And not only that, I'm still like amazed, okay, this person is not only an optimization engineer,
but it's also a software engineer.
Right.
I want to come back to that when we talk about time and effort savings.
But let's, so what's the last section, Duncan?
Let's get through that last one first.
Yeah.
So basically we just got left with like a data frame that just has, you know, for every
interval, every time period, just what's charge, what's discharge, and what's the state
of charge associated with that, right?
So then we have to do something with all that, right?
And so it then has to calculate a bunch of metrics, you know, think about what should we be
showing the user here?
And yeah, like I said earlier, this was actually the hairiest
part. There were all these funny little debugging issues with like, oh, you know, that function is
expecting it as a list and you imported it as a data frame, like this and that, like all these
little things, and you get caught in these loops of, you know, you give it the error code and
it takes you down some path that doesn't work and you have to back up. So this was actually
the most frustrating part for me. But Seyed, what would you say, how did this turn out?
Well, I didn't go through the experience that you went through, but I can tell you from the end results, it's pretty standard.
The way you're parsing data, the way you're defining your KPIs for showing.
And to be perfectly clear, I didn't ask for a fancy chart, but you've actually done that as well.
So that's like a plus to me.
At the end of the day, this part of the code and the beginning part of code is not rocket science, but it's a hairy problem.
It's time-consuming. You definitely need an FTE to deal with these things, which apparently is being fast-tracked. So I'm extremely happy.
Yeah, you needed an FTE. So let's step back then for a second. Duncan, let me ask you this question. I don't know if you tracked this, but even just a rough estimate: how much time in total did this take you?
Um, that's a good question. If I had to guess, wow, I really wish I tracked it. That would have been smart. If I had to guess, I'd say all this, from prompting to where we are right now, probably like four hours of continuous work, something like that.
Okay. So we'll put big error bars around that, and it's not going to matter at all. So four hours of continuous work. Seyed, the way that you framed it
before was basically, this is a task that might have taken a small crack team of software engineers,
who need to have some front-end experience, it seems like, because you're
generating some charts and visualizations, plus an optimization engineer. So sort of multiple different
skill sets. It would have taken them a year. Cut that down, though, because you simplified it. We didn't
try to get through really all the hairy stuff, and we didn't get to ancillary services. So cut off
90% of that, that's still 1.2 months of work for a small crack team of software engineers,
and Duncan did it in four hours. Am I overstating it?
Yeah, I would say, again, the big caveat is you can do all of
this in Excel, because the problem has been simplified. But once you want to put some
software foundation under it, to generate what Duncan has presented today, I would say you need two engineers, more like
a back-end software engineer and an optimization engineer. And depending on how well they get along
with each other, because that's another piece, the emotional intelligence of working as a team,
we shouldn't ignore that, I would say best-case scenario, two weeks; average, a month;
worst-case scenario, a month and a half of getting to where you are today.
So that's fairly astounding, right? Not only is it significantly less time, but Duncan, not to discount your
clear, massive skill set here, you don't know how to write code. So getting to this point
that quickly, that feels like a leap in our technological capabilities. Is that
fair to state?
100%. The way I look at this is, you know, when I look at this code,
I'm not looking at it, okay, am I going to use it?
Because that was the intention, but I'm not looking at it as a, am I going to use it in production tomorrow?
The challenge that I just described is fairly typical when you want to hire people.
So if this was a coding challenge for a new hire, I would say, we definitely need this person.
Because not only they know optimization engineering, they also know software engineering.
So I would look at my HR team and say, all right, if the base compensation for a software engineer, I'm just making up a number, is $100, I would tell my HR team, as long as the culture fit is great, let's go up to 120, 130 to make this happen.
Because this seems to be a very talented individual.
And from there, when you work as a team, then you can do a lot of great things.
But that's how I would look at the problem.
When in reality, the software engineer costs $20 a month.
I mean, that's what I pay for GPT-4.
It's $20 a month.
Oh, I see.
It's literally $20 a month.
Yeah.
That's not relative to the $100 imaginary example that Seyed is using.
It's relative to an actual salary.
Yeah, no, no, yeah.
Let's say if you're paying, like, I don't want to get into compensation stuff,
but let's say if you're paying $150K, I would tell my HR team, let's bump it up to $180.
This person is worth it.
Right.
And Duncan's doing it for $20 a month and four hours.
So that, I mean, obviously, like, those numbers speak for themselves.
That's supremely impressive.
Seyed, though, I guess we should talk about the limitations here, because you're sort of defining
it now here as like, okay, if this were a coding challenge for a potential hire.
But let's talk about the real question, which is, how far are we from, how far was Duncan
here from building something that you would use in a production environment, that you would
actually use to charge and discharge a battery?
Again, assuming, imagine that we had included ancillary services and the other things that you would need.
I mean, if you had ancillary services, then it would have been different.
I would assume it's hard to get there.
I still don't think you can do it in four hours.
Maybe I'm wrong.
Maybe I'm, you know, we may get there.
But it depends on how you're using it in production.
When you think about market trading or, you know, processes that require automation,
frequency of use, real-time decision-making,
there's a lot to make software code production grade.
And this is far from that reality.
So I would say, like, we are miles away from using this in production.
But if you think about an analyst just trying to make, you know,
informed decisions, as a, you know, souped-up calculator, it's there.
You can use it.
Yeah, I agree.
I think, honestly, where this sort of end product would be quite good is for the work
I do, which is really about analyzing projects before they exist, right? Just thinking about what a project might be capable of and what its revenues will be. Because that's all about just sort of scripts and stuff, right? It's not about deployed functional software.
Yeah, exactly. It's very important when you're developing software to understand who your users are. Is the user requiring real-time automation, robust production, you know, real-time support, or is it mostly an analyst at a desk?
If the user is the latter, kudos, great job.
I will vouch for it.
One other thing I just wanted to throw out there thinking about, you know,
if this took four hours, et cetera, is one other big advantage I found is that, right,
every time you go back to, you know, chat.openai.com and go into this chat,
it has a perfect state of where it was previously, right?
So I said four or five hours of continuous work.
But in reality, it was a bunch of 45-minute segments.
And I didn't have to, like, spin up my brain each time on, like, what was going on.
It just was exactly where we left off, which is a trait humans don't really have.
You kind of have to, like, get back into it.
Right? Software engineers talk about this all the time.
They want the five hours of uninterrupted work.
Like, do not schedule a meeting.
Like, it'll throw them out of their flow.
ChatGPT's in permaflow, right?
You can't get it out of flow.
And that's super valuable.
I like that. ChatGPT is in a state of permaflow.
All right, let's wrap up by stepping back.
So this was...
Wait, should we run it?
Should we press run and get the, get the ta-da moment?
See the chart pop up, et cetera?
Let's do it.
All right. Hopefully it works.
Nothing works in real time.
This is like trying to run a product demo in front of an audience.
Grabbed our battery parameters.
Now it's getting the prices from GridStatus.
Still getting the prices.
All right, here we go.
It got them.
Did the dispatch optimization.
Here we go.
We have a nice little table showing, for each period of time
it decided to chunk these up into, you know, how many cycles there were, discharging revenue, charging costs, and then the net revenue associated with that.
There's one mistake in here, you'll notice.
Start date, February 26th, 2023, end date, February 18th, it reversed them.
And I don't know what that means exactly, if there's like a deeper error or that's just a presentation
issue.
You lost money that day, too, I will note. You made net revenue every single day,
except for one.
That could be like just how it's chunking together these dates. You know what I mean?
Like it could like, yeah, that might not suggest suboptimal behavior.
It might just be the way it's aggregating intervals, you know.
Correct.
There's a notion in optimization that the world doesn't end in hour ending 24 if you're
defining the problem at a bigger state.
So it might just be a parsing of data issue.
But that aside, right, it tells you when you're charging, how much you're charging,
tells you how many cycles you're going through.
It tells you how much you're paying to charge, how much you're making to discharge,
it tells you the net revenue that you get from all of that.
And I played around with this a bit and you'll find if you, you know, if you look at
the first date and the last date and how many days are implied in that, and then how many
cycles total, you'll see it did in fact follow our, we imposed a 300-cycle annual limit on it.
And it sort of like dealt with that pro rata and, you know, maximized cycle usage.
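As a rough illustration of the pro rata idea described here, the sketch below scales an annual cycle budget down to a shorter backtest window. It is not the code from the episode; the function names, the 365-day year, and the 73-day window in the example are assumptions chosen for illustration.

```python
DAYS_PER_YEAR = 365  # assumption: leap years are ignored


def prorated_cycle_cap(annual_limit: float, days_in_window: int) -> float:
    """Return the share of an annual cycle budget that belongs to a shorter window."""
    if days_in_window < 0:
        raise ValueError("days_in_window must be non-negative")
    return annual_limit * days_in_window / DAYS_PER_YEAR


def respects_cap(cycles_used: float, annual_limit: float, days_in_window: int) -> bool:
    """Check whether the cycles used in the window stay within the prorated budget."""
    # Small tolerance so floating-point noise does not flag a schedule sitting exactly at the cap.
    return cycles_used <= prorated_cycle_cap(annual_limit, days_in_window) + 1e-9


if __name__ == "__main__":
    # Hypothetical 73-day window under the 300-cycle annual limit mentioned in the episode.
    print(prorated_cycle_cap(300, 73))
```

A check like this is how you could confirm from the results table that total cycles divided by the number of days lines up with the annual limit.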
What would be kind of interesting would be to think about throughput rather than cycles.
So how many megawatt hours is it charging and discharging, and dividing the net revenue it generated by that to kind of come to a like, what's the average value of a megawatt hour through this battery?
That could be kind of cool.
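The throughput metric floated here can be written down in a few lines. This is a minimal sketch, assuming throughput means charged plus discharged megawatt-hours; the function name and the numbers in the example are hypothetical, not results from the run.

```python
def value_per_mwh(net_revenue: float, charged_mwh: float, discharged_mwh: float) -> float:
    """Average net revenue per megawatt-hour of energy moved through the battery."""
    # Assumption: throughput counts energy in both directions.
    throughput_mwh = charged_mwh + discharged_mwh
    if throughput_mwh <= 0:
        raise ValueError("no energy moved through the battery")
    return net_revenue / throughput_mwh


if __name__ == "__main__":
    # Hypothetical figures: $1,000 net revenue on 50 MWh charged and 50 MWh discharged.
    print(value_per_mwh(1000.0, 50.0, 50.0))
```

Counting only discharged energy would be an equally defensible definition; it would roughly double the resulting figure.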
But then, yeah, it also then, you know, posts a nice little chart that shows you.
I had it just pick the day with the greatest net revenue production and then the day before and the day after,
so you could kind of see the context of what's happening.
And you see in red what the LMPs are,
and then in blue how the state of charge is reacting to those.
And so you could see in this, you know, in the beginning,
it charges when prices are, you know, in a valley.
Rather than discharging during this local peak,
it sits tight and waits for a later peak,
discharges a bit and then sits tight
for the peak to pop back up a little bit and fully empties itself.
So presumably this was a good strategy for these three days.
I'm sending a note right now, please hire this person ASAP.
Yeah, it actually looks really clever.
I would have thought that, like, knowing that this is charging and discharging based on energy prices in CAISO,
I would have thought it's like a, you know, it's an inverse duck curve, basically.
I would have thought you're just obviously like charging when solar generation is high,
discharging when solar generation is low.
And it's kind of like that, but it's a little more complicated because it's, like you said,
these multiple peaks, super interesting.
Yeah, it gets into like the whole opportunity cost of discharging, right?
You might want to wait, right?
And you might not have a good chance to recharge at low prices if you go after the local peak
rather than some peak, you know, a few hours later.
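The opportunity cost point can be made concrete with a toy example. The sketch below is not the code from the episode: it uses backward dynamic programming rather than linear programming, and it assumes perfect price foresight, a lossless battery that moves one unit of energy per hour, and no cycle limit. The four prices are made up.

```python
from typing import List, Sequence, Tuple


def optimal_dispatch(prices: Sequence[float], capacity: int = 1,
                     start_soc: int = 0) -> Tuple[float, List[int]]:
    """Return (best revenue, hourly actions); +1 charges one unit, -1 discharges one unit."""
    n = len(prices)
    # value[t][s]: best revenue obtainable from hour t onward with s units stored.
    value = [[0.0] * (capacity + 1) for _ in range(n + 1)]
    best_action = [[0] * (capacity + 1) for _ in range(n)]

    # Work backward from the last hour so each decision sees its future value.
    for t in range(n - 1, -1, -1):
        for s in range(capacity + 1):
            best = None
            for action in (-1, 0, 1):
                next_soc = s + action
                if 0 <= next_soc <= capacity:
                    # Charging costs the price; discharging earns it.
                    total = -action * prices[t] + value[t + 1][next_soc]
                    if best is None or total > best:
                        best = total
                        best_action[t][s] = action
            value[t][s] = best

    # Replay the stored decisions forward to recover the schedule.
    soc = start_soc
    schedule = []
    for t in range(n):
        action = best_action[t][soc]
        schedule.append(action)
        soc += action
    return value[0][start_soc], schedule


if __name__ == "__main__":
    # Hypothetical prices: a valley, a local peak, a poor recharge window, a later peak.
    revenue, schedule = optimal_dispatch([20, 50, 60, 90])
    print(revenue, schedule)
```

With these made-up prices, selling at the first peak (50) and rebuying at 60 to sell again at 90 nets 60, while holding for the later peak nets 70, so the optimizer sits tight through the local peak.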
Amazing.
Well done, Duncan.
This is, I mean, it's impressive.
So, okay, so let's step back then.
You know, this was, we ran this experiment in part to just sort of like test out a microcosm of a bigger question, I think, which is the question everyone in every industry is asking, which is like what, how big a deal are these large language models and the broader sort of generative AI world in my sector?
And I think there's sort of two versions of that. One is like, how big a deal is it today? But then there's also the art
of the possible. As Seyed said before, we're whatever, like nine months into this, and we've
already gotten this far. So given this experience, I guess I'll ask Duncan first and then
Seyed, like, did you come away thinking, holy shit, this is going to change everything? Or did you
come away thinking, okay, like there's clear application here, but no fundamental capabilities
are going to be introduced by this type of thing?
I guess, yeah, my impression from this experience.
And I've kind of been playing with it with other stuff too.
I built a similar EV fleet managed charging optimization model using ChatGPT.
And like it's all stuff that I know someone who's good at this could do.
But I don't have enough of those people on my team.
I don't have enough resources to just like look into something kind of quickly.
And so for me, the takeaway was kind of,
it's as if you got access to a team of like five good young engineers that are fast.
I don't think it's going to like solve new novel problems for you that like no one's ever
really figured out before.
But it's just like a great resource basically.
And I think for a lot of people that can be a big multiplier, particularly people who are like
very quantitative and systems thinkers but not programmers.
Right? Basically me. But there's a lot of people like that, right? And I think for them it's very powerful,
because you can describe how something should work and the actual math behind it. And you know
the concepts, and you know what linear programming is. You know, you went to engineering school, right?
But you're not a software engineer. For that person, I think it's pretty powerful. And it turns
out there's a lot of them in the energy industry. The energy industry is full of people who are like
great at Excel and really good at systems thinking but are not programmers.
So I think you could see a lot of people empowered by this.
Seyed, what was your takeaway?
Besides wanting to hire this person.
Yeah, besides wanting to hire, I agree with you, Duncan.
But if I want to step back 50,000 feet, you know, sometimes as a society, we have a hypothesis
and we want to push it saying that this hypothesis is going to be disruptive.
I mean, I think blockchain was kind of like that.
We thought blockchain is going to be in our day-to-day lives,
and a lot of capital went into it with the hope that this is going to change our lives.
And on the flip side, you see major disruptions like the internet.
I mean, I think we were all pretty young back in those days, but the hypothesis wasn't so much that this is going to change your life.
It actually did, and then we capitalized on it.
I think large language models are the same.
A couple of months ago, we didn't even know what ChatGPT was.
And now we're using it in a lot of aspects of our day-to-day lives.
I mean, we're just talking about dispatch optimization.
I use it for many, many different things.
And this is a major disruption.
And I think there's an opportunity for us as an industry, particularly in the energy industry, to embrace it and use it.
It's not here to replace us.
It's here to make our lives easier.
And we need to pivot away from what we were doing yesterday and embrace this technology for change and better value propositions.
And I think we are just scratching the surface.
So overall, I'm super excited about this direction.
I would definitely agree with that, too.
I don't see this replacing software engineers.
Just as it empowered me, I very much see a great software engineer just using this to save a ton of time, right?
Telling it exactly what to do with a level of specificity about programming that I could never muster,
because I don't know anything about that.
But just basically one good software engineer becoming like a team, basically, of many,
but via one person.
I think, I guess my takeaway is like,
not dissimilar from you guys,
but I think today what it clearly is,
at least in this context, right,
is massive efficiency improvement,
but not fundamentally allowing novel applications
or products or anything like that, right?
But a wildly more efficient way to develop something.
But as Seyed said, we're scratching the surface, right?
And is that still going to be true when millions of people have put a lot of really smart
thinking into how to develop these capabilities years from now?
Is it possible that it will be able to do things that we can't do today or, you know,
generate new capabilities?
I think probably it will.
So what's cool about it, to Seyed's point, is like there is an application already today.
There's no question, right?
We've proven it now, but we're not
the first ones to prove it. There are many applications already today and lots of other
components of lots of businesses. But I will say going into this, I mean, we didn't, we truly
did not know, is this going to work? Right. Like, what I will say, my reaction when
Seyed sent the prompt was like, okay, like, I want this to work because I want there to be something
to talk about. And I thought you're just going to end up with like a really buggy set of code that
like didn't actually run and, you know, didn't quite do it.
We were going to have to kind of parse through it to figure out what it was able to do
and what it wasn't.
What you were able to build, uh, worked.
It had limitations, obviously.
And I don't think a layperson could have done it, right?
Duncan, you are, you know much more than the average person about batteries, about
power markets, right?
We purposefully did not select like my mother to build this.
Nonetheless, extraordinarily impressive to me.
So I think this is awesome.
I really appreciate you guys both doing it.
I think we should come back in a year and try again with something harder.
Because I think it'll be really interesting to see what type of a challenge we could try to tackle one year from now.
You guys up for it?
Yeah, 100%.
But I think in a year, an instance of ChatGPT will be representing me in your podcast.
It's just going to evolve so much.
But yeah, definitely up for it.
One thing I think is like, I think it's worth thinking about that, right?
We used a large language model to write code on something, right?
In this case, a very sort of like mathematical type problem.
And like that's one thing large language models can do.
But there are also other, like it's not just a script writer, right?
There are things that it may be like uniquely capable of doing with regard to language, right?
And so I want to lead or maybe close with what I think is potentially the most compelling energy use case for large language models.
I'd love to hear what you guys think about what might be compelling as well.
But a persistent issue in the electricity industry for as long as I've been doing this job is utility tariffs.
Every utility has some 700-page document that outlines massively complex nuances of how, you know,
users will pay for electricity. And we're talking about electricity becoming like the fundamental
commodity on Earth. These documents are incredibly valuable and none of them are similar. Zero
similarities. So you have like whole companies that have humans read them and like turn them into
machine-readable databases and stuff. Very interesting to consider throwing an LLM just at every
utility tariff and actually parsing it into an understandable resource. And, you know,
every utility calls a demand charge something slightly different. It in theory could learn that
and then just figure out what demand charges should be called and make the big database of them,
right? This is like, to me, one of those kind of like grail problems for the distributed
energy world, at least, that this seems actually really well suited to do. And I've heard a bunch of
people sort of like talking about it. So just putting the bat signal out there. Anyone who wants
to do this for the good of humanity, please talk to me.
I think our mutual friend Kiran from Arcadia is probably the right one to talk to about this,
or one of the right ones to talk to about this.
But 100%.
Utility tariff documents.
I mean, you can even go broader, right?
Like, also true of all these, like, public utility commission documents and, like, everything,
everything that happens between utilities and regulators is also in, like, 100-page
PDFs that don't look similar to each other.
And there's all sorts of interesting things that you could do if you just had access
to all that data in one place.
Spot on. If it requires reading and comprehension and analysis, for example, we're using it for
energy procurement. I mean, there's a lot of bespoke processes that get involved there and a lot of
documentation. So it's all about how to embrace this opportunity and how to be agile to make
sure that this technology is put to good use. And I think we're all solving the same problems.
All right. Save the date. One year from today, the ChatGPT formerly known as Seyed
will join us to come up with a new challenge.
Thank you so much, guys.
This was awesome.
Thank you.
Thanks for having us.
Cool.
Thanks, Duncan.
This is fun.
Thanks, OpenAI, I guess.
Yeah.
Thanks, Sam Altman.
Duncan Campbell is a vice president at Scale Microgrid Solutions.
Seyed Madaeni is the CEO and co-founder of Verse.
Well, if you're interested enough and wonky enough, you can actually see the code that Duncan created.
Look in the show notes if you're on Spotify or if you go to canarymedia.com. Otherwise,
you can run the code yourself, play around with it. We'd love to hear what you think of it.
This show is a co-production of Post Script Media and Canary Media. Post Script is always supported by
Prelude Ventures, a venture capital firm that partners with entrepreneurs to address climate change
across a range of sectors, including advanced energy, food and ag, transportation and logistics,
advanced materials and manufacturing, and advanced computing. This episode was produced by Daniel Waldorf,
mixing by Roy Campanella and Sean Marquand, theme song by Sean Marquand.
I'm Shayle Kann, and this is Catalyst.
