Orchestrate all the Things - Good enough weather prediction at scale as a big data, internet of things and data science problem. Featuring DTN CTO Lars Ewe
Episode Date: June 30, 2022
What's a good enough weather prediction? That's a question most people probably don't give much thought to, as the answer seems obvious -- an accurate one. But then again, most people are not CTOs at DTN. Lars Ewe is, and his answer may be different than most people's. With 180 meteorologists on staff providing weather predictions worldwide, DTN is the largest weather company you've probably never heard of. Weather forecasting, too, is all about data and models these days. But balancing accuracy and viability is a fine act, especially at global scale and when the stakes are high. Article published on ZDNet
Transcript
Welcome to the Orchestrate All The Things podcast. I'm George Anadiotis and we'll be connecting the dots together.
What's a good enough weather prediction? That's a question most people probably don't give much thought to, as the answer seems obvious, an accurate one.
But then again, most people are not CTOs at DTN. Lars Ewe is, and his answer may be different than most people's. With 180 meteorologists on staff providing weather predictions worldwide,
DTN is the largest weather company you've probably never heard of.
I hope you will enjoy the podcast.
If you like my work, you can follow Linked Data Orchestration on Twitter, LinkedIn, and Facebook.
Thank you for having me, George.
So my name is Lars Ewe. I'm the CTO over at DTN.
I joined the company about two and a half years ago.
Prior to that, I was running product and engineering over at Anaconda,
which you're probably familiar with,
which was also a very interesting job or opportunity.
And prior to that, I was over at Click Security, which actually, no, eVariant was before
that, I'm sorry. eVariant, which is a healthcare company, was acquired by Healthgrades. In short,
a lot of these jobs, what they have in common is I personally feel very strongly about data,
data science, and just the ability to provide insights and help companies run their
business more operationally efficiently and effectively by leveraging state-of-the-art
technology to actually produce better outcomes.
So DTN is, in that regard, sort of a logical evolution to my career.
And I'm very excited to be here today and then talk to you a little bit about what we
do here at DTN.
Great.
Thank you for the introduction.
And well, to also share a bit of background on my side as
to, well, the topic and what triggered, let's say, the conversation today.
I was pitched a story by PR that had to do with the way that DTN engages in weather prediction.
And I have to admit that, you know, initially, I wasn't
familiar with DTN at all.
But then, you know, just looking around a little bit about what it does,
I was somehow impressed by the fact that DTN is involved in weather prediction, because previously
I was under the impression that this is a task reserved for, you know, state or government
agencies, basically. Even though, you know, looking a little bit at what DTN does in
general, and I'm sure you can enlighten us a little bit more on that, it started making more
sense, basically, because it seems that having weather prediction, reliable weather prediction
is something that can be useful in those other activities that DTN is involved in.
So that's the next thing I wanted to ask you, basically. So in what ways does DTN do weather prediction, and why is it useful, and for whom?
Absolutely. And if I may, maybe I'll start the answer with a very quick high-level statement on DTN.
As you said, not everyone's familiar with the company. So we are a global technology, data, and analytics company, and mostly active in
agriculture, energy, and other weather-sensitive industries. And our main goal is to provide
operational intelligence or actionable near real-time insights, if you want to call it that
way, to our customers to help them better operate their business. So that said, you know, weather is a key ingredient in some of the data that we need in order
to provide actionable insights.
And you're right, there's obviously quite a few entities out there, both government
and private, that are producing various weather forecasts and weather forecasting technologies. We are actually, depending on who you ask
and how you count, one of the largest,
if not the largest, privately owned weather companies in the world.
We have more than 180 meteorologists on staff.
We have hundreds of engineers,
not just dedicated to weather forecasting,
but to the insights of the operational intelligence products
that I
mentioned. And so we have a significant investment in weather and weather forecasting technologies.
Now, you ask why? At the end of the day, we need global, consistent, and reliable weather
forecasting technology. And so a lot of the agencies, whether it's government agencies or
other sources of forecasting data out there
are either not global or they have weaknesses in certain areas. They're also usually very
restrictive in the resolution. So another thing that plays into the discussion is the quality
of resolution in certain areas, right? And so what we have done is we are leveraging pretty
much all of the above, all of these publicly available
and sometimes licensed proprietary data inputs.
We're using all of those.
We also augment that with our own data inputs.
We own and operate thousands of weather stations worldwide.
But most importantly, we have created a state-of-the-art ensemble engine that takes all of these various
data sources and, through the means of machine learning and other technologies,
tries to improve overall predictive models globally with different and varying resolution,
using things like MPAS, but also, you know, by having various regional models that we run and adding them to the ensemble,
so we can actually have the right data and the right resolution,
the right quality for the right areas where we operate in
and where we provide insights, if that makes sense, George.
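To make the ensemble idea a bit more concrete, here is a minimal, hypothetical Python sketch of blending several model forecasts for a single grid point using per-model weights. The model names, values, and weights are invented for illustration and are not DTN's actual models or methodology.

```python
import numpy as np

# Hypothetical temperature forecasts (°C) for one grid point and lead time,
# from several underlying models; names and values are illustrative only.
model_forecasts = {
    "global_model_a": 21.4,
    "global_model_b": 22.1,
    "regional_model": 20.8,
}

# Per-model weights, e.g. learned from recent verification against observations.
# In a real system these would vary by region, variable, and lead time.
model_weights = {
    "global_model_a": 0.30,
    "global_model_b": 0.25,
    "regional_model": 0.45,
}

def ensemble_forecast(forecasts, weights):
    """Weighted blend of member forecasts; weights are normalized to sum to 1."""
    total = sum(weights[m] for m in forecasts)
    return sum(forecasts[m] * weights[m] / total for m in forecasts)

print(f"Blended forecast: {ensemble_forecast(model_forecasts, model_weights):.1f} °C")
```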
Okay, I see.
So first, before we get into the more technical details of the infrastructure that you use
and the methods and the data that
this infrastructure creates and how you integrate them all and how you use them to generate those
predictions, I wanted to ask you if you can drill down a little bit on the specific products that
DTN has around weather. If I'm not mistaken, I think I saw that
the weather prediction
is offered as a service in itself,
but I think it also is used
to power other services, right?
That is absolutely right.
And in fact,
the weather predictive services
and APIs that we provide
to our customers
provide in many cases
the core foundation
of what we do.
We do more than that, by the way.
But really, where it gets even more exciting in our minds is the actual actionable insights,
the operational intelligence we provide on top of that.
So examples would be, we have a product called Storm Impact Analysis that actually helps
utilities better predict outages based on
storm systems passing through. And by doing so, allowing them to plan better upfront,
plan staffing around that, meaning repairs, and also, you know, avoiding some of the costs,
the regulatory costs and penalties that can come with outages, by having better understanding and insight into all of this. Or, to give you one more example, shipping would be
another example, where we actually help some of the largest shipping companies in the world
guide and compute best routes for their ships, both from a safety perspective but also from a
fuel efficiency perspective, right? So
oftentimes there's more than one benefit to some of the solutions that we provide. And aviation
would be another example, by the way. But at the heart of all of this is really the idea of
taking weather forecast technology and data that we have available and then merging it with
customer-specific data that is relevant to their operations,
whether that's the location of ships or the hull shape of the ship, things of that nature,
or whether it is outage information and the health of the infrastructure grid, the electric
grid, for example, to then merge the data through machine learning models to help them
provide these insights that are usually probabilistic in nature,
which is also an interesting challenge by itself, as you can imagine.
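As a purely illustrative sketch of what merging forecast data with customer-specific data might look like, the hedged Python example below joins weather features (wind gusts, precipitation) with an asset feature (age) and trains a classifier that outputs an outage probability. The features, data, and model choice are assumptions for illustration, not the Storm Impact Analysis product itself.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier

# Illustrative training data: each row joins forecast features for a grid asset
# with asset-specific features; the label says whether an outage occurred.
# Columns: wind_gust_kmh, precip_mm, asset_age_years
X_train = np.array([
    [95, 30, 25],
    [40,  5, 10],
    [110, 45, 30],
    [55, 10,  5],
    [80, 20, 20],
    [35,  2, 15],
])
y_train = np.array([1, 0, 1, 0, 1, 0])  # 1 = outage observed

model = GradientBoostingClassifier(random_state=0)
model.fit(X_train, y_train)

# Probabilistic output for an upcoming storm at two assets.
X_new = np.array([
    [100, 35, 28],   # older asset, strong gusts
    [45,   8,  6],   # newer asset, mild weather
])
for features, p in zip(X_new, model.predict_proba(X_new)[:, 1]):
    print(f"features={features.tolist()} -> outage probability ~ {p:.2f}")
```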
Okay, I see. So in the scenario where weather prediction is offered as a standalone service, let's say, then I guess no further customization is needed, except for the exact location that the client is interested in receiving weather prediction for.
But in the scenario where that is used to power other services, such as the ones you mentioned,
then I guess it's tailored specifically to the needs and most importantly, the data of the customer.
So it's not one common, let's say, service for everyone.
It has to be tailored per customer.
That is correct.
At least, that's the highest quality of service we can provide.
The more we can work with customers and integrate their data through standard common APIs that
we provide, integration APIs and things of that nature, the higher the quality of service.
If they provide less data, we can still provide more generic services in these areas.
But yes, you're absolutely right.
The quality of the data that goes into the models and informs the models certainly
also helps with the quality of the outcomes of the models.
Okay.
Okay.
And as regards the comparison, let's say, of the kind of weather prediction that you do
compared to standard weather predictions offered by government services, let's say,
where would you identify the main differences? Is it in granularity? Is it in time frame or in scope or in something else?
It really varies and depends, right? As you know, they all vary. And even the comparison
baseline that you gave me, they vary in their own scopes and precision and so on. But
yes, we obviously, as you would imagine, try to provide the right level of precision for the right areas that we operate in and also for the use cases.
So that's one thing.
We also, by creating an ensemble technology, have the ability to actually weigh the various models and, through machine learning over time, you know, use the right weight factor for
the models that are more accurate in certain scenarios,
given the scenario that you're trying to solve for.
So really it's a combination of,
in some cases we run very specific WRF models
with high resolution in certain areas close to,
say harbors where we need to have very precise weather information
to safely guide ships in that area.
In some cases, we actually use, you know, available data, but through ensemble technology,
we try to compute and rely on, or weigh, the better models over time that are more precise for the scenario and the time frames, right? We also are looking at decades and decades worth of historical data that we have access
to that obviously informs both our engine and our models, as you can imagine.
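One simple way historical verification data could feed into model weighting, assuming an inverse-error scheme purely for illustration (this is not a description of DTN's engine), looks like this:

```python
# Hypothetical recent verification: mean absolute error (°C) per model,
# computed against observations for a given region and lead time.
recent_mae = {
    "global_model_a": 1.8,
    "global_model_b": 2.4,
    "regional_model": 1.2,
}

# Inverse-error weighting: the smaller the recent error, the larger the weight.
inv = {m: 1.0 / err for m, err in recent_mae.items()}
total = sum(inv.values())
weights = {m: v / total for m, v in inv.items()}

for model, w in weights.items():
    print(f"{model}: weight {w:.2f}")
```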
Okay.
So let's then come to the actual infrastructure that you use.
You know, disclaimer here, I'm by no means an expert in meteorology or weather prediction,
but I have interviewed a couple of people in the past who are. So through them, I learned a couple
of things. And as far as I can tell, it seems like satellite imagery is a very important data source
in terms of weather prediction. But I think it's not the only one, right? So there is
also input from sensors that gives you metrics such as humidity or rainfall or, you know,
temperature, obviously, and other types of measurements that can be useful for the models.
And, well, potentially other sources that I'm missing as well. So I'm guessing you probably use these two.
So does the company own its own satellite fleet
or do you get imagery from other people's satellites
and what other kind of data sources do you use?
Yeah, no, absolutely.
You're right.
There are many different observational data points that we use in forming our models and, therefore, the forecasts.
And, you know, it's maybe not appropriate to go into all of our exact partnerships and licensing deals that we have, but we do.
We use data from various different sources, whether it is satellite data and imagery data, whether that is radar, whether that is weather stations. We even use oceanographic scatterometers,
you know, weather balloons, measurements coming from airplanes. There's a large array of different
observations that are fed into the engine. And as you can imagine, just by the sheer vast number of data,
we're talking big data here.
So in many regards, weather forecasting in this day and age is really a big data problem.
And it's also, to some extent, an Internet of Things data integration problem
where you're trying to get access to all of this data
and intelligently and smartly integrate the data and store the
data for further processing.
Yes, indeed.
And I imagine this is probably one of the most demanding tasks you have in your role
because of what you just mentioned, basically.
So not just the sheer quantity, the volume of the data, but also probably the
speed at which this data is coming in.
So you need to consume it really, really quickly and also homogenize it and integrate lots
of different data from different data sources in one common repository so that it can be
useful for your models, right? And I'm guessing this is
probably the part where AWS comes in because again in the initial storyline that was given,
it was sort of framed around how you work with AWS to achieve that.
Yeah, no, absolutely, George. And maybe, if I may, before we go into the AWS details, let me say a little bit about Lars, but more importantly, about this task.
It'd be interesting maybe for your audience to know, I myself am not a meteorologist either.
I learned enough over the last two and a half years that I can be dangerous enough to have this conversation with you.
We have, as I said, world-class meteorologists on staff and scientists.
That's not me. But in order to build these kinds of solutions, many different skills have come
together, right? Whether it's the meteorologists and the science that we're talking about,
whether it's the big data processing aspects, whether it's machine learning and data science
know-how, DevOps.
I mean, there's so many different things that all have to work hand in hand in order to
provide a solution.
And that is really where I see my background.
That's also where I see the company positioned, right?
We, you know, we excel at what we do by having on staff talent in all of these
areas and by combining
that know-how to build these solutions, if that makes sense.
That really is something I wanted to mention.
So back to your original question around AWS and how that plays in, there's a case study
available, if you just Google for AWS and DTN weather forecasting, that goes into great detail
for your audience if they're interested.
But to make a long story short, in the summer of 2020,
I believe, if memory serves me right, round about then,
we actually decided to work very closely with them.
We've always been an AWS shop.
So a lot of our cloud compute happens in the AWS cloud.
But we wanted to approach them because we came to realize that a lot of the high performance compute capabilities
that we need for these kinds of tasks weren't really served the best way and also not the most
cost-effective way by the Amazon instances available at the time. And so by working very
closely with them, we actually, you know,
I mean, obviously most of the expertise, or a good portion of the expertise, comes from Amazon.
So I don't want to misportray this and say DTN invented this, that would not be fair,
but I will say DTN was a partner in developing this with Amazon. They developed the EC2 Hpc6a
instance type, which is really a high-performance compute node that is much more cost-effective,
higher throughput, but also, more importantly,
we work that in conjunction with AWS ParallelCluster,
which you're probably familiar with,
as well as Elastic Fabric Adapter,
the EFA protocol, and FSx for Lustre.
So there's various technologies that have to come together
to actually allow for us to have parallel compute in a cloud environment with the interchange between the nodes, fast access to the underlying storage, the data storage, both read and write, to actually be able to get these throughputs.
And in fact, we were able to more than double our capability of running these models from a capacity and timing perspective.
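For context, AWS ParallelCluster typically fronts a Slurm scheduler, EFA provides low-latency networking between nodes, and FSx for Lustre provides shared high-throughput storage. The hedged sketch below shows the general shape of submitting an MPI-based model run to such a cluster; the partition name, paths, node counts, and executable are hypothetical, not DTN's actual setup.

```python
import subprocess
import textwrap

# Hypothetical Slurm batch script for an MPI-based forecast model run on an
# AWS ParallelCluster head node. Partition name, node counts, file-system
# paths, and the executable are illustrative assumptions only.
job_script = textwrap.dedent("""\
    #!/bin/bash
    #SBATCH --job-name=forecast-run
    #SBATCH --partition=hpc
    #SBATCH --nodes=16
    #SBATCH --ntasks-per-node=96
    #SBATCH --exclusive

    # /fsx is a common mount point for an FSx for Lustre file system.
    cd /fsx/forecast/run_latest
    srun ./model.exe namelist.input
""")

with open("forecast_job.sh", "w") as f:
    f.write(job_script)

# Submit the job; sbatch prints the assigned job ID on success.
result = subprocess.run(["sbatch", "forecast_job.sh"], capture_output=True, text=True)
print(result.stdout.strip() or result.stderr.strip())
```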
And that leads us to the other aspect, right? Historically, weather forecasting, or not just
historically, it's still an extremely compute-intensive task, as you can imagine. And so
not all that long ago, a lot of weather forecast systems were limited to, you know, a very
fixed schedule, say two times a day:
two times a day, they would update the forecast and recompute.
Well, that leaves a whole lot of gaps.
And so we are now able,
thanks to all of these modifications, to do a couple of things.
One is we're no longer on a fixed schedule for our forecast engine
computes. We're now data-driven, which is event-driven,
which is really the goal, right?
The ultimate goal is to recompute when you have significant changes in the
underlying model and the underlying parameters and data that justify a
recompute. So we're now at that point where we can do that event-driven,
which is very exciting, but more importantly, we can do so in
an hour. And in many cases,
we're starting to now see under one hour
of update cycles on average, right? Again, it's data-driven.
That's massive. You know,
the last number that I've jotted down here, because even I can't remember all the data always,
is that at the current state, we can create a one-hour forecast for the entire global grid
in under one minute with the latest AWS instances and the technologies that we helped develop.
That is remarkable.
And I think it gives you hopefully a good sense for the innovation that's taking place in that space.
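To illustrate the shift from a fixed schedule to event-driven recomputes, here is a minimal, hypothetical sketch of the triggering logic: a recompute is requested only when freshly arrived observations diverge from the forecast currently in effect by more than some threshold. The variables and thresholds are invented for illustration.

```python
# Minimal sketch of an event-driven recompute trigger (illustrative only).
# Instead of recomputing on a fixed schedule, recompute when fresh observations
# diverge enough from the forecast currently in effect.

DIVERGENCE_THRESHOLDS = {
    "temperature_c": 2.0,   # degrees Celsius
    "wind_speed_ms": 5.0,   # metres per second
    "pressure_hpa": 4.0,    # hectopascals
}

def should_recompute(current_forecast, new_observations):
    """Return True if any variable's observed value drifts past its threshold."""
    for var, threshold in DIVERGENCE_THRESHOLDS.items():
        if var in current_forecast and var in new_observations:
            if abs(current_forecast[var] - new_observations[var]) > threshold:
                return True
    return False

forecast_in_effect = {"temperature_c": 18.0, "wind_speed_ms": 6.0, "pressure_hpa": 1012.0}
incoming_obs = {"temperature_c": 18.5, "wind_speed_ms": 12.5, "pressure_hpa": 1009.0}

if should_recompute(forecast_in_effect, incoming_obs):
    print("Significant divergence detected: trigger forecast recompute.")
else:
    print("Forecast still within tolerance: no recompute needed.")
```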
Yeah, it is.
It is remarkable indeed.
So, yeah, and thanks for sharing some of the background.
So if I got it right, basically you sort of worked together with AWS to co-develop new
instances to better serve your needs.
And I've seen it also happen in other use cases where
basically when you have sort of edge cases, let's say, that are driving innovation in terms of,
whether it's data throughput or computational needs. And I guess in your case, it was probably
both. So you sort of worked together with people at Amazon to develop new instances, right,
to better serve your needs.
Yeah, I want to be very clear about the roles.
Obviously, when it comes to compute infrastructure,
no one is going to tell Amazon how to do that.
That's where their expertise lies.
But we certainly brought to the table
and to the partnership our expertise
around the models and around the data
and around the requirements that we had,
both from a just compute perspective,
but also how to manage that kind of setup, et cetera. And
so a wonderful partnership that was highlighted at their main event.
But again, you know, if anyone wants more detail,
easiest is to just Google up the case study, it's easy to find.
Okay. And so the other thing I was wondering is, I guess
this is probably something you get asked by executives in your own company a
lot, is whether this all still makes sense financially, in a way. Because, yes, I mean, I get
how it's very important to be data-driven and event-driven and all of that.
But actually recalculating everything on a one-hour average interval, as you mentioned, also incurs a significant computational and therefore financial cost for you.
Does it still make sense to do that?
Yeah, no, it's a fantastic question. It's
the side of the CTO job that isn't always the most enjoyable, right? We all enjoy the technology,
we all enjoy innovation and producing new cool things, but you're absolutely spot on. We also
have to make sure that the P&L makes sense and that there's a commercial component here that satisfies the needs. And it does. So again, just to be clear, there's complexity here that's
probably beyond this podcast, but we recompute, you know, intelligently, based on what needs
recomputation, right? We also have, again, varying degrees of precision on the grid based on regions in the world,
customer base and requirements that we have, right?
So I want to be very clear,
because of what you just said,
there is intelligence built into the solution
and thoughtfulness as to the cost that occurs
when recomputes happen.
So very good point.
Again, part of the relationship with Amazon
was not just about better
compute power and all the things we mentioned, but also to bring the costs down to make it more
cost effective, which plays into that. The other thing that's really interesting for what it's
worth is, as many companies out there, we also had, and speaking about expertise, we had more than
five, depending on how you count, forecast engines in the company through M&A,
right? Through acquisitions of other companies,
you bring in talent and you bring in technologies. And so, you know,
that was wonderful because that allowed us,
that gave us a lot of IP and a lot of know-how, right?
And we had learned through these previous exercises, what works,
what didn't work, the pros and the cons.
But by consolidating all of that and building one, truly one global forecast engine, we brought costs down significantly.
So there's cost savings also just through integration and building and reducing some of the redundant runs that you had before across the business.
Last but not least, I should mention that one of the things we take great pride in,
this engine is not just atmospheric, it's also oceanographic.
So really both marine and atmospheric is covered.
And a lot of innovation is taking place as we speak and interplay between those two environments.
That is still, you know, something we're still learning to better
understand as scientists, and therefore, as engines, we're still learning how to better
model those interdependencies. Great, actually, that's precisely what I wanted to ask you next,
about how exactly this engine, this model, works in itself. Again, based on my previous exposure to the topic,
I've kind of gathered that nowadays the way people do it
is they sort of take a hybrid approach.
So yes, learning from historical data
and lots of machine learning, as you pointed out,
but it turns out that domain knowledge
is also very important.
It gives you some, well, it gives you lots of nice properties to have in a system like that,
like explainability and the option to integrate knowledge
that is formal, let's say, in equations and this type of thing
that is the formal side of meteorology that has been developed over the years.
And I'm kind of assuming that this is the same,
it's the same approach that you're taking.
Yes. And you are not talking to the right person
when it comes to the deep scientific background on this.
That's not me, as I said earlier.
We have some of the industry leading experts on staff doing exactly that.
But you're right. It's very much a hybrid, right?
There's a lot of different aspects that flow into the solution.
As I said, there is a significant staffing proportion behind this, with various different backgrounds. What's interesting is, I think, with the ability of us as an industry to take on more and more data, to compute and parse data more effectively at larger scale,
we have seen the ability to further and further improve our forecast capabilities because of that. There's another challenge here that, again, I'm not the best person to talk to on all the exact details,
but, you know, the nature of weather forecasting historically has been deterministic, but often with
failure, right? I'm trying to tell you that it's going to be this temperature, this humidity and
these wind parameters and whatever else at a certain time in a certain location, and as we all
know, sometimes it hits it on the spot, sometimes not quite so much.
So the trick becomes more and more
to make that actually a probabilistic forecast
because that's really what matters
from an end user perspective.
One of the big aspects
that I always stress
with these kinds of discussions
is that weather in and of itself,
while interesting, intriguing,
isn't really the problem they're
trying to solve for, right?
They're really trying to make decisions upon that, right?
So given certain weather conditions, should I or should I not evacuate this offshore drilling facility?
Should I or should I not prevent this sports event from taking place?
Should I or should I not reroute my ship or my airplane?
And as you can imagine,
the probabilistic nature is what comes to mind, right?
It'd be nice to know how probable certain outcomes are
so you can further factor that into your risk equation,
so to speak.
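A common, generic way to turn an ensemble of deterministic runs into a decision-ready probability is to count how many ensemble members exceed an operational threshold. This is a textbook-style illustration rather than DTN's method; the member values and the 25-knot limit are invented.

```python
# Generic sketch: derive a probabilistic statement from ensemble members.
# Member values and the operational threshold are invented for illustration.

wind_gust_members_kt = [18, 22, 27, 31, 24, 29, 26, 21, 33, 25]  # one value per member run
OPERATIONAL_LIMIT_KT = 25  # e.g. a limit above which an operation is postponed

exceeding = sum(1 for gust in wind_gust_members_kt if gust > OPERATIONAL_LIMIT_KT)
probability = exceeding / len(wind_gust_members_kt)

print(f"Probability of gusts above {OPERATIONAL_LIMIT_KT} kt: {probability:.0%}")
# A downstream decision model can then weigh this probability against the cost
# of acting (e.g. evacuating a platform) versus the cost of not acting.
```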
That's another area where we're doing a lot of research
right now in this ensemble technology by,
and again, I'm not the expert to be honest here,
but by providing a different way of how these different forecast parameters
and also the different models that we have,
how they actually all get ensembled together.
And by doing so, providing a more probabilistic outcome reporting mechanism
is something we're working hard at.
And hopefully we'll have more to report over time. Yeah, and you know, if I may add, from the consumer, let's say,
point of view, I've noticed that over the last few years it seems that the weather forecasts that you
get as a consumer also incorporate the probabilistic element as well.
So they don't tell you anymore, like, okay, tomorrow there is going to be a thunderstorm,
for example.
They give a more probabilistic prediction, like there's a high probability
of a thunderstorm, something like that.
So I guess the broader audience is also getting more acquainted
with this type of prediction. And you already kind of
mentioned some of the things you're working on, let's say, for the immediate future.
And hearing you mention things like ensemble models and integrating different engines,
it also sort of hinted at one of the closing questions I had for
you, which was, I imagine that in order for those models to be, well, good enough, basically, they
have to be quite specific in terms of the region that they're modeling. So you probably have a different model, let's say, for Africa
or Europe and even specific areas within each continent. So I wonder if those models are
somehow integrated with each other and whether there is a sort of meta model, let's say,
that you potentially have to build in order to have better granularity
in the overall prediction.
And if this is something that you're working on, and to give it a little bit more grounding,
it sort of made me think of the digital twin approach and something that I've seen NVIDIA,
not just me, but actually something that NVIDIA has published recently
with the so-called Earth-2.
So they're basically building a model of the entire Earth.
And I wonder if this is a direction
that you are taking as well.
We absolutely are interested in learning
and also are dabbling in some of these areas as well.
As I said, there's work being done at DTN,
active work to better understand marine
and atmospheric interactions,
which is the same principle, right?
How do some of these systems interact with one another?
You're absolutely 100% right.
We run different models in different regions
with different resolutions.
We also, as part of this, are making more and more use of MPAS,
which if you're not familiar is a model
that is actually variable,
where you can have a global model,
but you can configure it with different variable precision
in different regions.
So if you're in the middle of the desert
with little to no population
and little to no commercial relevance,
potentially you might have
a lower resolution there, whereas, you know, you get the picture.
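As a toy illustration of the variable-resolution idea behind MPAS (the regions and grid spacings below are invented, not DTN's configuration), one can think of it as assigning finer grid spacing to commercially relevant areas and coarser spacing elsewhere:

```python
# Toy illustration of variable-resolution configuration (invented values).
# A real MPAS mesh varies cell size smoothly; this just captures the idea that
# commercially relevant areas get finer spacing than, say, open desert or ocean.
resolution_km_by_region = {
    "global_background": 30,   # coarse default everywhere
    "north_sea_shipping_lanes": 6,
    "gulf_coast_ports": 3,
    "remote_desert": 30,
}

for region, dx in sorted(resolution_km_by_region.items(), key=lambda kv: kv[1]):
    print(f"{region:28s} -> {dx:>2d} km grid spacing")
```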
So we're doing that.
And again, the trick with the ensemble is to actually try to learn
from all of these models and pick and choose,
in an automated fashion, the right models
with the right weights in the right circumstances.
And there's a feedback loop there; it's self-training, right? It's constantly evolving to provide
better outcomes. But you said another word that I have to comment on, because it's an
internal discussion that makes me laugh a little. You said good enough, and the reason I laugh
a little, George, is because, as you can probably imagine,
that's sort of the ongoing dialogue between the business side,
as you mentioned earlier,
there's a business side to this
and the scientific side.
And one thing to always keep in mind here
is that there's a lot of downstream consumption
where the real true value lies in our minds, right?
Again, should you or should you not
take certain action as a business
as a result of this?
And it so turns out that oftentimes you want to be very careful in how you balance your investment levels, so to speak,
because the weather is just the input parameter for the next downstream model.
Right. And sometimes that extra half degree of precision may or may not even make a difference for the next model down.
Right. And sometimes it does. And so that's, you know,
that's why this whole discussion is a non-trivial discussion,
to be honest with you, because there's a lot of discussion that needs to
take place in the configuration of these systems and the evolution of the
systems relative to the use cases you're trying to satisfy and the actual
business problems you're trying to solve, which again, isn't so much just the weather forecast,
but is what does that weather forecast mean
to your operations as a business licensing
our solutions and our technology?
Yeah, yeah.
I'm sure that's a very fine line to walk
because, well, obviously, you know,
scientists and the people who build those models would be happy working to even gain marginal improvements.
But that doesn't always make sense in the broader, in the bigger picture, let's say.
So I guess this is where people like you come in and sort of try to balance things. Yeah, and to give our staff some credit,
I think the dialogue that we had
over the last two and a half years
around this topic has been phenomenal.
And I think more and more,
I see the science teams
and the engineering teams appreciating
the commercial aspect of this, right?
And in fact, I think it wasn't that long ago
where I was in a meeting and someone said
but wait, wait, wait, what about good enough, shouldn't that be good enough? And, as you can
imagine, I had to laugh a little. I was like, yes, now we're all getting on the same
page. So it is, it's a very key ingredient for any business, doesn't matter whether it's
weather or anything else, to make sure that it's financially and commercially viable, and
that you actually produce something
where you don't over optimize on one component or sub component.
That's true for, I guess that's true for pretty much anything
we do as a society, but certainly applies here as well.
And yes, well, everybody has their role to play.
And I guess yours is kind of, again, looking at the big picture and stepping in when you have to.
And George, one thing that just came to mind that I'd like to add, if I may, is with respect to digital twins: I think they're actually extremely relevant in many different ways.
They're relevant in the way that you just alluded to, but we're also looking into
and investing heavily in digital twin technology around, say, for example, ships, and how a
digital twin of a ship behaves under certain weather conditions, wave patterns, etc., right? So
that we can better compute and predict the route, but also the timing, right?
You know, part of what is equally important with shipping, for example,
is to know exactly how much delay or when the ship can arrive
at a certain location.
Well, that is dependent on the type of ship,
the hull shape, the quality it is in, meaning, you know,
maintenance aspects and things of that nature.
So digital twins play into many of these solutions
in many different ways, if that makes sense.
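As a loose illustration of the ship digital twin idea, the sketch below degrades a ship's calm-water speed as forecast wave height grows and uses that to estimate arrival time; the speed-loss formula and all parameters are invented assumptions, not DTN's model.

```python
# Loose illustration of a ship "digital twin" used for ETA estimation.
# The speed-loss relationship and all parameters are invented for this sketch.

def expected_speed_kn(calm_water_speed_kn, wave_height_m, hull_factor):
    """Reduce calm-water speed as significant wave height grows."""
    speed_loss = hull_factor * wave_height_m ** 2
    return max(calm_water_speed_kn - speed_loss, 0.0)

def eta_hours(distance_nm, calm_water_speed_kn, wave_height_m, hull_factor):
    speed = expected_speed_kn(calm_water_speed_kn, wave_height_m, hull_factor)
    return distance_nm / speed if speed > 0 else float("inf")

# Two hypothetical ships on the same 1,200 nm leg, different hull characteristics.
for name, hull_factor in [("slender_hull", 0.15), ("full_hull", 0.35)]:
    hours = eta_hours(distance_nm=1200, calm_water_speed_kn=18,
                      wave_height_m=4.0, hull_factor=hull_factor)
    print(f"{name}: about {hours:.1f} hours in 4 m seas")
```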
Yeah, it makes sense that, you know,
since you have a variety of services
that you offer for a variety of scenarios,
then it would make sense to model some key concepts in that,
like ships or planes or, I don't know, ports and so on.
So, yes, it does make sense.
Great. Thanks.
Thanks a lot for your time and for the conversation.
It seems like you're doing some interesting work over there.
And, yeah, I guess one of the key things that comes out of this conversation
is that, well, weather really is a sort of substrate service, let's say, that is consumed by a lot of your other services.
And by extension, I guess that sort of mirrors how things work in the real world.
So you have to know what the environment is like to be able to plan and respond accordingly.
I hope you enjoyed the podcast. If you like my work, you can follow Linked Data
Orchestration on Twitter, LinkedIn and Facebook.