Orchestrate all the Things - Rendered.ai unveils Platform as a Service for creating synthetic data to train AI models. Featuring CEO / Founder Nathan Kundtz
Episode Date: February 3, 2022. As more organizations are turning to synthetic data to feed their data-hungry machine learning algorithms, Rendered.ai wants to help. Article published on ZDNet ...
Transcript
As more organizations are turning to synthetic data to feed their data-hungry machine learning algorithms,
Rendered.ai wants to help.
So I'm a physicist by training. I did my PhD work at Duke University
and then took that work to Intellectual Ventures, where I actually incubated my first company, Kymeta.
So if you look up Kymeta, we did satellite communications, flat-panel satellite communications technology.
And there I built really the world's first flat panel tech for tracking satellites with no moving parts that was widely commercially deployed.
Built those products, delivered them all over the planet, built a communications network with
Intelsat to support them.
That company reached sort of a stage where it made sense to hand it over to a growth
management team.
Didn't need a physicist at the helm anymore.
I was interested in doing some other stuff.
So I went and did some venture work and then really started hearing more and more from the people I knew in the satellite industry about the challenges that they were having with data.
So I dug into that and really tried to understand, you know, what were the limitations?
What were the ways that they could be addressed? And where did I see that going?
And ended up writing a white paper about it, which I shared with some of my friends in the industry.
A few of those ultimately said, yeah, let's build a company to do that.
And we keyed in on this concept of synthetic data.
And so we started off really trying to build some tools that could help the people that we knew in the satellite industry, particularly in the remote sensing world, where we're taking imagery of cities being built and patterns of life, crops, forestry,
et cetera, from space and trying to get a better understanding of them.
So we founded the company in 2019 and have been off to the races since then.
Okay, great.
Thanks for the introduction.
And in a way, it makes sense.
I mean, even though I'm by no means an expert in synthetic data,
my understanding is that visual data, like photography, for example,
is kind of typical.
It's a typical use of data in that domain. And I guess we can come back to
that later because I think if my understanding is correct, having read as much as I've been able to
about what you do, I think that you don't just do visual data at this point, but we'll return to
that. Yeah. Yeah. That's right. We don't just do visual and I'm happy to touch on sort of the metrics around what it takes to actually collect that data.
So why don't I just for two seconds, like we're on the topic, I'll comment.
Sure.
So we definitely go beyond visual and visual can mean a few different things.
So visual can be sort of RGB camera, kind of what you think of, like I've got my camera on my phone.
But there are other things that we translate to visual that we also do, things like X-ray.
Right? Not a visual sensor in the sense that you're not looking at visible light, but translated to visual.
And then we also do radar, synthetic aperture radar, microscopy, and a lot of other different sensing modalities that ultimately are translated, often using computer vision tools.
The platform can be used for completely non-visual, for tabular data or for audio data, and also for video data.
So we've built it to be fairly agnostic. I will say though, a lot of the tools
we have kind of supporting on the platform are targeting those computer vision use cases first.
But computer vision means a lot more to us than RGB photogrammetry.
Okay. Yeah. Thanks for the clarification. And yeah, we can return to that a bit later with more details.
For the moment, I wanted to follow up on some key facts that you mentioned about the company.
So you said that you started in 2019 with a few co-founders.
How many, by the way?
There were four of us altogether.
Okay.
So yeah, I was wondering if you could share
like, you know, some key facts about where you are today. For example, I think
I saw you have received Series A funding, which, if my memory serves me right, was six million.
Yeah, okay, it was seed funding, sorry. Yeah. Seed funding of $6 million to date, and with some really terrific firms.
So I don't need to list them here unless you want me to.
It's fine.
It's fine.
I mean, we can find that information offline as well.
I was more wondering about things such as your headcount or use cases or customers or there's always, or at least
sometimes there's some confidentiality clauses around disclosing names and so on.
But if you could mention the areas that you're active in, if nothing else.
Yeah, absolutely.
So there's about 20 of us in total now.
That's a mix of contractors and full-time employees.
In terms of customers, the one that we are able to speak the most about is the partnership with Orbital Insight and the work we've done with NGA. Orbital Insight is really the premier firm for doing object detection from space. There are a number that do it; they were sort of the earliest and remain best in class. And so we've been working with them and demonstrating some really significant performance improvements in detection algorithms, particularly when the things that they're trying to find are rare.
And just a couple of metrics on that that can help give some context. To make images relevant for computer vision use, with real images we don't just need the image itself, right? We need to go in and sort of have a person say what's in the image and draw boxes, usually quite literally, around it. That process is called annotation. To annotate a 200-kilometer swath in RGB photogrammetry can cost upwards of $65,000, and that doesn't necessarily include a lot of different examples of certain objects. So when we talk about rare, you know, it doesn't have to be something you never see. It just has to be something that you see in only one out of ten of those.
And suddenly the costs of doing a data labeling campaign become really enormous.
And so we were able to show that for some objects, where they only had, you know, a hundred or a few thousand example images,
really two to three X performance improvements
in the detection algorithms we were building.
So that's in the remote sensing space.
We work with a number of customers in that arena
across both RGB and synthetic aperture radar.
We also have been working with customers
in other remote sensing work, like drone-based imagery. And then we've been finding a lot of demand in security imagery, and then in things like industrial automation and robotics. And then the place that I think is really interesting, where we've been getting the beginnings of demand and which I think will become important for synthetic data, is actually in medical imagery.
So we've done some projects there.
And I think that that market is nascent, but it's an exciting thing to think about being
able to really understand more about how we can automate detection of things in the human
body by using synthetic data.
Okay.
So as I mentioned, synthetic data for training machine learning algorithms is something that's definitely kind of new to me.
I'm not entirely sure how new it is as an approach.
I just recently was speaking with another founder from a startup in your domain.
And he said that his background story was more or less
he started dealing with that space as a postgrad, basically.
And he was doing lots of deep learning visual algorithms
at the time.
And that's what got him hooked, basically.
And his exact words, and I'm quoting him here, were that to him it looked like a hack, the fact that it worked, in the beginning.
I don't know. So his take is that what they do is they do simulations, basically,
and I wonder if this is something you apply as well, and whether this is
in some way related to what I saw being referred to as the physics-based approach.
Yeah, that's exactly right. So when we talk about physics, so I'm a physicist by training, so I'm coming into the CV world from there. So it's not a shock to me that we can learn new things about the world by simulating physics. I think a huge portion of the field of physics is built on exactly this.
So it didn't seem like a hack to me. It seems sort of obvious, in fact, that there's a lot
of information. In fact, in some ways, what we're doing is translating from one form of how we can
understand the world, which is by writing down the physical equations that govern it,
to another form of understanding the world, which is essentially providing examples of that, which is how we train artificial intelligence. So that makes a ton of sense.
When we talk about physics-based, there's sort of two things that we mean, especially at Rendered,
when we say physics-based. The first is that we're contrasting
it with other ways to generate data. So the principal alternative in the world of sort of
data generation tools is the use of what are called GANs, or generative adversarial networks.
So with a GAN, we essentially provide like a lot of images and then teach an algorithm to make more like what we already have.
And the trouble with GANs is that you're not really introducing any new information. Does
that make sense? It's sort of you can make more examples around the information that you already
have, but you're not actually building in anything new. So with physics-based, we don't
do that. We actually are running simulations,
and a variety of different types of simulations.
Now, the other distinction that's maybe helpful
is some of the folks that are working in synthetic data
really focus on RGB only.
And they're essentially using video game engines
to run these simulations.
And that's still, I wouldn't call that,
excuse me, that is still very much a physics-based approach. There's a lot of physics in those
engines. However, it's one specific type. It's not sort of broad-based physics, and it doesn't
lend itself to really the wide range of things that people need these tools in. And so as we've been building our platform,
we've been very conscious of the fact that we needed to make it extensible to a wide variety
of different simulation types, and then build partnerships with the companies that really had
depth of expertise in those arenas. And so that's what we've been doing: not just working with video game engine codes, but really deep physics knowledge from the companies that are the best in that world.
Okay. I can see how game engines could be used to create synthetic data for autonomous vehicles, for example. But I'm not sure how you could possibly use them to create medical imaging,
as you are doing.
So I kind of imagine that your approach must be somewhat different.
That's right.
And even when you get into like, if you look at autonomous vehicles,
well, it turns out autonomous vehicles rely on radar.
And there's some not great radar simulators built into, you know, game engines.
And the other thing is that those tools are just built with a different intention in mind.
So those, the game engine tools, have typically been built with sort of real-time video as a driver for essentially approximations
that they make. And that leads to, you know, if you play a video game, your eye can definitely tell
that it's a video game. You know, we're not to the point where it's indistinguishable from reality.
And sometimes that can have an important effect on algorithms. And so it is important to be able to sort of dial that knob
and change essentially the level of fidelity of the simulation.
And so the game engines are not always the best tool for that.
Okay.
So maybe it's time to revisit the spectrum of data that you deal with.
So you already referred to visual data.
And by the way, I guess if, I mean, you just mentioned why you don't solely rely on game engines,
but I wonder if it's in the mix that you use to generate data.
It's definitely in the mix that we use.
And in fact, if I can just tie that
into one of the other questions that you asked,
because it's helpful here.
So it's not uncommon that the simulation
is not going to capture 100% of the fidelity
of the real world.
In fact, almost by definition, it doesn't.
You know, when we build simulation codes,
they are always an approximation, at some level, of the real world. I spent a lot of time designing microwave frequency components before starting Rendered, and there we would talk about, in simulations, that you're not simulating the real world, you're simulating the mesh that you can create of the real world, right? And that's very different.
And so you have this need to do two things.
One is overcome gaps with respect to reality
so that we're not introducing artifacts that can confuse AI.
And so when you asked, you know,
what do we mean by, you know, post-processing effects?
That's really what we're
talking about, is helping to overcome that so-called uncanny valley and improve realism.
And there's techniques that we can use to take a simulation or an output from a simulation
and then work to remove the artifacts that make it less usable in the context of artificial
intelligence.
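As a rough illustration of that kind of post-processing, here is a minimal sketch, not Rendered.ai's actual pipeline, of adding sensor-style blur and noise to a rendered frame so that a detector doesn't latch onto tell-tale rendering artifacts. The function name and parameter values are made up for the example.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def degrade_render(img, blur_sigma=1.0, noise_std=0.02, seed=None):
    """Push a too-clean rendered image toward real-sensor statistics.

    Expects img as float32 in [0, 1] with shape (H, W, C). The specific
    operations (a mild blur plus signal-dependent noise) are illustrative;
    a production pipeline would be tuned to the target sensor.
    """
    rng = np.random.default_rng(seed)
    out = np.empty_like(img)
    # Soften the unnaturally crisp edges that renderers tend to produce.
    for c in range(img.shape[2]):
        out[..., c] = gaussian_filter(img[..., c], sigma=blur_sigma)
    # Add signal-dependent noise as a crude stand-in for sensor shot noise.
    out = out + rng.normal(0.0, noise_std, size=out.shape) * np.sqrt(np.clip(out, 1e-6, 1.0))
    return np.clip(out, 0.0, 1.0).astype(img.dtype)
```

Applied to each rendered frame before it is written out, a step like this is one generic way to narrow the gap between simulation output and what a real sensor produces.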
The other thing that is sort of an obvious question to somebody as they enter into that space is how real is it and how real does it have to be?
So that makes sense.
I have this image.
Maybe my eye can tell the difference.
Maybe it can't.
But I want to be able to ask in a quantitative way,
is this going to be useful for AI training? And so we provide also
some analytics tools to help answer that question. And when we talk about engineered synthetic data,
that's really what we sort of mean there, is that it's not just data that we created off of some
video game engine. It's not just data for the sake of having more data. It's data that you can begin
to engineer around, that you can build data sets around and connect with likely outcomes. Okay, so how then would you go about generating
data that's not visual, like tabular data or audio data, which you referred to earlier?
Yeah, so the way that our platform works is, so there's two pieces to it. There's sort of a developer framework, and then there's a compute orchestration and librarianship environment. And essentially anything that you can script with Python, you could put into that developer framework. And I'm happy to sort of show it to you, if that'd be helpful, kind of do a little demo, so you can get a concept of how and maybe why
somebody might use other forms of data in that environment. Sure. I mean, I've already seen,
I haven't exactly done a demo, but I've seen, you know, a few, I've seen your documentation
and I've seen a few screenshots. So I think I have at least a vague idea of how it works.
And it looks to me like the centerpiece, let's say, of it all
is probably what you call the graph,
which kind of lays out the structure of what happens,
what depends on what,
and the series in which events take place, right?
That's exactly right.
So what the graph is defining is not just like a piece of data,
not one image or one table, but a stochastic approach to generating them. You can use that
graph to continually generate additional data within some domain that's defined.
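As an illustration of that idea, here is a deliberately tiny sketch of a graph as a stochastic generator. The node names and structure are hypothetical, not the platform's actual graph format; the point is only that the graph defines a distribution of scenes, not one fixed image or table.

```python
import random

# Hypothetical stand-in for a channel graph: value nodes sample from
# distributions, and a downstream node consumes them.
graph = {
    "object_count": {"sample": lambda: random.randint(3, 12)},
    "object_scale": {"sample": lambda: random.uniform(0.8, 1.2)},
    "lighting_lux": {"sample": lambda: random.choice([200, 500, 1000])},
    "scene":        {"inputs": ["object_count", "object_scale", "lighting_lux"]},
}

def run_graph(graph, n_samples):
    """Evaluate the graph n_samples times, yielding one scene description per run."""
    for _ in range(n_samples):
        values = {name: node["sample"]() for name, node in graph.items() if "sample" in node}
        yield {inp: values[inp] for inp in graph["scene"]["inputs"]}

for scene in run_graph(graph, 3):
    print(scene)  # each run yields a different, but in-domain, configuration
```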
The way that we think about breaking up this problem, so I already mentioned
anything you can script up in principle,
you can upload into our platform.
And we provide a lot of example codes
and support around certain types of simulation
and doing more and more of that.
But there's a person who's writing those scripts
that's sort of defining, you know,
what is going to be possible from different graphs.
And we consider that role to be the synthetic data engineer.
So it's sort of a new role.
And we do have tools to support them.
So we have a developer environment for the synthetic data engineer.
The use of the graph and sort of going from that application to data
is where we see the role of the computer vision engineer.
So what we have found is that computer vision engineers
aren't looking for a just-add-data solution. What they really need is this ability to iterate on data sets, to be able to ask questions, to engineer them. So the synthetic data engineer can be defining sort of a variety of different things that could be done in the world, and then the computer vision engineer can ingest that through the graph and say, okay, but these are the things that I want to see in this particular data set.
In addition to the scripting, I think it looks like you also have a sort of visual interface to help engineers create those scripts under the hood, let's say, because, well, kind of typing it all by hand would not be exactly easy.
That's exactly right. So typing it all by hand is not easy and it's helpful to visualize.
And then a sophisticated environment could have hundreds or more of these nodes, and so it's really helpful to be able to sort of zoom in and out and take a look at that. So we created that visual interface.
Now, our power users actually do a combination of programming, using the visual interface, sort of that no-code environment, and then, once it's created, you can actually download those graphs and edit them via script. And so we have people that will sort of build it visually first and then go in and adjust things and loop through and sort of automate different experiments.
And then they can upload those graphs via API and create additional datasets.
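A rough sketch of that loop, edit a downloaded graph in code, re-upload it, and request a new dataset, might look like the following. The endpoints, field names, and node names here are hypothetical placeholders, not the documented Rendered.ai API.

```python
import copy
import requests

# Hypothetical endpoints and payload shapes, for illustration only.
API = "https://api.example-synthetic-platform.com"
HEADERS = {"Authorization": "Bearer <token>"}

# Download a graph that was built in the visual, no-code editor.
graph = requests.get(f"{API}/graphs/toybox-baseline", headers=HEADERS).json()

# Sweep one node parameter across an experiment, uploading a variant each time.
for count in (5, 10, 20):
    variant = copy.deepcopy(graph)
    variant["nodes"]["object_count"]["value"] = count  # hypothetical node name
    created = requests.post(f"{API}/graphs", headers=HEADERS,
                            json={"name": f"toybox-{count}", "graph": variant}).json()
    # Kick off dataset generation from the uploaded variant.
    requests.post(f"{API}/datasets", headers=HEADERS,
                  json={"graph_id": created["id"], "runs": 1000})
```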
Okay.
The example I saw, I think, used a specific picture that you use.
It looks like you use it throughout your documentation.
So it was like a box, I think, filled with a number of objects.
And it looks like you could define different objects and their properties
and the way they relate to each other or the way that they're located,
depending on each other and within the box and so on.
So is the role of the synthetic data engineer
to define those objects and those properties?
And then the computer vision engineer will sort of hone in on that?
Yeah, so a couple of things.
The application that you're talking about, which is essentially toys in boxes, is actually one that we've open sourced.
And so it's a good example of how we are trying to help people enter the world of synthetic data.
So you can actually go and get access to all of the code necessary to build that channel and upload it or edit it and upload it to our platform.
The applications on our platform, we do call them channels.
And so I just used that term, that's what that means.
Sort of in the same way that skills are used for Alexa,
we call them channels.
So yes, the synthetic data engineer is sort of helping
to define what can be in there.
So in this case, sort of adding some 3D models,
defining what sorts of modifiers are available. They're doing that with support, obviously, in the code that we provide. And then the computer vision engineer can define what they want to see happen.
So in that particular channel, you can move things around manually, but a lot of the positioning is actually done by randomly placing those objects above that basket and then running a physical simulation of dropping them into the basket. That's how they end up being positioned. So it's not like you have to manually say, okay, this one here, this one here. It's sort of a process to get to lots of different variety in how they end up being positioned.
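For a sense of how that positioning step could work, here is a generic sketch using the open-source PyBullet engine, not the open-sourced channel's actual code: objects are spawned at random poses above a tray, and a short physics simulation settles them before their final poses are read back.

```python
import random
import pybullet as p
import pybullet_data

# "Place objects above a container, then let physics settle them."
p.connect(p.DIRECT)                       # headless physics, no GUI
p.setAdditionalSearchPath(pybullet_data.getDataPath())
p.setGravity(0, 0, -9.81)
p.loadURDF("plane.urdf")
p.loadURDF("tray/traybox.urdf", basePosition=[0, 0, 0])

bodies = []
for i in range(8):
    # Drop point and orientation are randomized above the tray.
    pos = [random.uniform(-0.15, 0.15), random.uniform(-0.15, 0.15), 0.4 + 0.1 * i]
    orn = p.getQuaternionFromEuler([random.uniform(0, 3.14) for _ in range(3)])
    bodies.append(p.loadURDF("cube_small.urdf", basePosition=pos, baseOrientation=orn))

for _ in range(600):                      # roughly 2.5 s of simulated time at 240 Hz
    p.stepSimulation()

for b in bodies:
    pos, orn = p.getBasePositionAndOrientation(b)
    print(b, pos)                         # settled poses, later used for annotation
p.disconnect()
```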
Okay.
So how would that process apply in a more complex scenario,
like medical imaging, for example?
And I think it's probably more complex because in that scenario,
you have more stringent requirements, basically.
So you can't just randomly place organs within a human body
and you can't just randomly generate spots in an x-ray.
They have to have some semblance of reality, I guess.
Yeah, absolutely.
So let me speak to one project that we've done and are public about and actually wrote a conference paper on, which was looking at sort of morphologies of oocytes, so early-stage human embryos.
In that case, we were looking at a fairly constrained environment. So this is a microscope image of a cell. And a lot of what you're trying to do is sort of understand the morphology of this cell because it's a good indicator of health.
So in that case, we start with sort of an underlying 3D model of the cell.
And then we use things to warp it and we can kind of measure how much we're warping it.
And then we can add, in this case, yes, somewhat randomly, other clusters of protein and other layers around it, and in so doing kind of span the space of what these oocytes really do look like in microscope imagery.
In the case of an X-ray, you know, there are a few things that we're trying to achieve in something
like that.
So one is we can create, we can certainly take an existing, say, 3D model of the human
body and then introduce defects to it.
We can introduce things that would be spots, essentially, as you say.
The next thing that you want to do
is actually create a lot of diversity
in what that looks like.
And so you don't randomly place organs,
but you might want to do things like adjust the sizing
in the organs, adjust somewhat how they're placed.
And so those types of effects are exactly
what the synthetic data engineer is essentially doing,
if that makes sense. They're gonna start usually with an underlying 3D model. We use a language called modifiers. types of effects are exactly what the synthetic data engineer is essentially doing. That makes
sense. They're going to start usually with an underlying 3D model. We use a language called
modifiers. They would be choosing what kinds of modifiers are appropriate for different parts of
this 3D model, whether that's moving, scaling. There's a lot more sophisticated things we can do
sort of adjusting geometries and adding other things on. And then the computer vision engineer gets to decide
which of those are relevant.
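To make the modifier idea concrete, here is a hedged sketch in plain NumPy, not the platform's modifier language, showing scale, translate, and a warp-style modifier applied to a placeholder vertex array standing in for an organ model. All names and values are illustrative.

```python
import numpy as np

def scale(verts, factor):
    # Uniformly resize the model, e.g. to vary organ size across samples.
    return verts * factor

def translate(verts, offset):
    # Shift the model, e.g. to vary placement slightly.
    return verts + np.asarray(offset)

def bump_warp(verts, center, radius, amplitude):
    """Push vertices near `center` outward, a crude stand-in for a lesion-like defect."""
    d = np.linalg.norm(verts - center, axis=1)
    weight = np.exp(-(d / radius) ** 2)
    direction = (verts - center) / np.maximum(d[:, None], 1e-9)
    return verts + amplitude * weight[:, None] * direction

rng = np.random.default_rng(0)
organ = rng.normal(size=(5000, 3))                           # placeholder vertex cloud
organ = scale(organ, rng.uniform(0.9, 1.1))                  # vary organ size
organ = translate(organ, rng.uniform(-0.05, 0.05, size=3))   # nudge placement slightly
organ = bump_warp(organ, center=organ[rng.integers(len(organ))],
                  radius=0.3, amplitude=0.1)
```

Which of these modifiers get applied, and over what ranges, is exactly the kind of choice the two roles split between them.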
Now, you said something that's actually really interesting,
which is like, you can't just go randomly place organs.
And the answer is, yeah, of course,
that's not very realistic to randomly place organs.
However, that might be an experiment
that a computer vision engineer did want to try.
Because there's this question about how much is my algorithm really learning, because it always
sees almost exactly the same thing versus how much am I really detecting the geometry here.
And so even though in your training data set, or maybe in a real-world data set, of course, you wouldn't want to see randomly placed organs, that doesn't mean that that data set wouldn't be useful for some purposes.
And so it's important to sort of think maybe, and this is one of the powers of synthetic data,
we can broaden our boundaries further than what can be accomplished in the real world
and actually run some interesting experiments using that.
Okay. Hearing you talk about the things that the synthetic data engineer and the computer vision engineer would do, it made me think that probably both of them would benefit, or would actually have to sit together with a domain expert to get their feedback on what makes sense, basically.
Yeah, so oftentimes the synthetic data engineer is a domain expert. That's what we have found
so far. You're right in the sense that there's the potential for more than two roles in the world of
synthetic data. And actually, even with that, I would say it's already sort of new to separate those roles and intentionally think about how synthetic data must be a collaborative process.
Like, most simulation tools are built around a single user. They can be collaborative, but they're really directed to being a single user. Most 3D modeling and game tools are built around a single user. One person can actually build these things, although they may collaborate.
Synthetic data is fundamentally multidisciplinary. So clearly the synthetic data engineer and the
computer vision engineer may have PhDs, maybe more than one, but in very different domains.
And it's difficult for someone who's spent their career doing computer vision to sort
of learn all the things that would be necessary to be successful in becoming a synthetic data
engineer.
So making sure that the platform supports that collaboration is really important.
Now, when you talk about collaboration with other types of expertise, we think that that's really crucial too.
And that's part of what creating that developer environment for the synthetic data engineer and others to work together supports.
And I forget if you commented on it in your questions, but part of what we do to support that is we actually have a content management server so that people can be uploading different types of content and then integrating that into these applications.
Okay, great. I was actually going to ask you about, well, your experience working with clients and how you see people, how you see role assignment basically in those clients that you've had exposure to.
And I think you just answered that to a great extent.
I could go a little further.
Sure. I'm also conscious of time, however, and I know that we haven't actually touched on what's supposed to be the centerpiece of this conversation, which is the platform. I mean, we've certainly covered lots of ground in terms of how it works,
but I think we haven't actually covered an end-to-end scenario.
I know, for example, you mentioned collaboration, and I know that this is one of the things that you support. You mentioned different roles.
So would you care to describe like an end-to-end scenario?
So how do I use it if I have a scenario for synthetic data in my hands?
Yeah, absolutely.
So typically the client will come to us, and usually the first thing they're going to say is, I don't have enough data. And then for most of our clients we have some kind of a starting point where we have existing code, maybe for, you know, remote sensing or for medical, right? We've got something, and they'll say, that's interesting, but it's actually not what I need. Right? I'm not looking for cars. I need to find these particular types of trucks
in this imagery. And so what we do is we build a plan with them for how we introduce the content
that they need into this particular simulation domain. The reality is that for most clients,
they don't come with a synthetic data engineer today. And so what we do is a mix of training
them, sort of how they can
introduce content and then supporting them in that process. And I think of it as sort of very similar
to the early days of Salesforce, right? Salesforce has this wide variety of things that can be built
on top of it. And there's a whole ecosystem of people that can do that as contractors. Right now,
we're providing that work, but we're also teaching people how to build on top of the platform. So we'll engage with them to get their content integrated
into the tools, and then what they get is a channel. So now they have an interface that's like the one we show with the graphs, and all of the other support infrastructure that's available to their computer vision users.
And at that point, the computer vision users start to run different experiments. And we typically will work with them closely through that sort of initial success, that initial improvement
in AP score. And after that, what we find is they like to continue and run all sorts of experiments.
And one of the things that I'd say in short is,
it turns out artificial intelligence is not the first kind of software that doesn't have bugs.
And so, you know, they have all sorts of things that they need to check and improve and synthetic
data ends up becoming really important in that process. Okay. Another thing that caught my eye about the description of your platform was the reference to high-performance computing and Amazon, Amazon Web Services in particular.
I think it looks like there's some kind of integration going on there, and I was wondering if you could elaborate on that.
Yeah, absolutely.
So let's just start on the why first. If you want 10,000 examples of deep physics simulation, it turns out that requires a lot of compute. That's not something you're going to run on your desktop. So part of what we provide is this orchestration framework, and that runs on top of Amazon. So what we're doing is essentially providing cost-effective orchestration on AWS.
Okay, so I guess in practical terms, that means that, well, if I need to generate some synthetic data and I'm using your platform, then I only get billed once, basically, by the platform.
I don't have to procure and purchase my own
AWS instances and so on. We actually go even further than that, George. All of our subscriptions
have a built-in unlimited plan for compute. And essentially what we do is we cap the concurrency
of the GPUs you can use on AWS, but otherwise allow our customers to really experiment.
One of the things that we found in this space
is that it's really crucial for people
to be able to try experiments,
to run different things,
to be relatively untethered
when it comes to what they can do.
And so we felt it was important
to bake that compute into the actual plans.
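As a loose analogy for that concurrency-capped model, the sketch below throttles how many jobs run at once rather than metering total usage. The cap value and the job function are hypothetical, purely to illustrate the idea.

```python
from concurrent.futures import ThreadPoolExecutor

MAX_CONCURRENT_GPU_JOBS = 5  # a per-tier cap; the value here is made up

def run_simulation_job(job_id):
    # Placeholder for dispatching one rendering/simulation run to a GPU instance.
    return f"job {job_id} done"

jobs = range(100)  # an "unlimited" experiment, throttled only by concurrency
with ThreadPoolExecutor(max_workers=MAX_CONCURRENT_GPU_JOBS) as pool:
    results = list(pool.map(run_simulation_job, jobs))
print(len(results), "jobs completed")
```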
Okay, that's both quite handy and generous, I would say.
And I wonder how did you manage to make it financially viable for you?
Do you have like some kind of a special partnership with Amazon or?
We have a deep partnership with Amazon, but a lot of
this is looking at typical usage. Think of it like a cell phone plan. You have an unlimited cell phone
plan. If you abuse it, there's ways, of course, to throttle. There's ways to make sure that it's not
creating a problem. But at the same time, you know, you're no longer thinking
about whether or not you, you know, are going to go over your gigabytes. And we manage it in much
the same way. In fact, I come from a communications industry background. So kind of familiar with
doing exactly this type of plan, where I'm not worried about whether you sort of go over on one
particular month. Does that make sense? It's more about, I want you to be
free to experiment. And I really think it's important that people are able to get to value
with synthetic data. And so we've structured our go-to-market business plans accordingly.
Okay. Do you find that resonates with clients? I guess it probably does.
Thinking as a client,
it would definitely resonate with me
if I knew that by signing up,
I don't have to worry about instances
and bills and so on, basically.
Exactly.
And I think what we find is it's less somebody saying,
oh, well, then your cost of compute is cheaper than my
alternative cost of compute. What they really say is, oh, okay, so then it's a complete plan. I don't need to be thinking about, I bought this from you and then every time I run an experiment I'm going to get another bill. And so, you know, that definitely resonates with people.
Okay. And speaking of which,
it's probably a good time to talk a little bit about your business model and your subscription plans.
I think you have like a free tier
that people can use to play around.
And then you also have a couple of other ones.
So we have a free tier
and that comes with a capped total amount of compute, so 10 hours
of EC2 usage. And we offer that in conjunction with this example channel so that people can come
on and play with it. And they can actually download it and edit it if they want to
change the parameters and then run. The second tier is what we call our developer subscription.
And so, you know, George, we were talking before about how, you know,
there's a synthetic data engineer and a lot of clients don't come with a
synthetic data engineer, but there are increasingly people that could help in
that role.
So whether that's domain experts or 3D graphics experts, there's
people that can help to build these applications. And what we find is they don't need to pay for the
large amount of resources required to do bulk simulations for computer vision, but they need
enough that they can build applications and test them. And so we have a plan that's really directed
towards that, this developer plan, and that's a $500 a month plan.
And then for our professional users,
really for the computer vision users who do want to be able to run experiments,
that want to be able to plow through relatively large data sets
and get insights quickly,
we have our enterprise plan that's $5,000 a month.
Okay. And, you know, as we just mentioned, that also includes, well, all the compute you can get.
So there are different kinds of consumables with each of those, in terms of the level of collaboration and things. But the developer environment comes with a max of
five EC2 instances in parallel, and the enterprise one comes with 50.
So you can get up to 50.
And if you need more, then we have other ways to engage with people to get to larger chunks of compute and other resources.
I see.
So if I'm not mistaken, this is sort of like the official, let's say, unveiling of the platform.
And I wonder, how long have you been working on it? Since 2019, when you founded the company?
And now it's at the point where you're ready to officially open it up.
Yeah, I mean, it really has been since 2019.
And I'd say, you know, some of that has been,
there's a lot of parts to make this work.
You know, it's compute orchestration,
it's a visual environment, it's librarianship,
it's a whole set of APIs and SDKs,
post-processing analytics tools.
And so it's a pretty significant undertaking to build a platform like this. And so
it has taken us a while, but it's also something that's in an industry that's new. And so as we've
been doing this, right, we've been working with clients the whole time and really trying to
integrate what we have learned about what is helpful to them
and what's necessary to be successful into the platform that we're actually providing.
Yeah, I was going to say, it looks like you have some good indicators.
So first, it looks like you already have a few paying customers.
You already had a pretty high seed round.
That's actually why I thought it was a Series A, because, well, 6 million could well be a Series A. And yeah, you have
accomplished what seems like a good deal of goals in a couple of years. So what's next having reached this milestone? What are your future plans?
So here's the way that we think things will work, George, because we think synthetic data is early.
And I think that most people are just kind of starting to understand synthetic data, maybe
much like yourself. They're hearing this. There's a whole industry of people who are saying, well,
my projects are limited in terms of data. What could I do with synthetic data? I'd like to understand that more. And the platform is really going to help accelerate them, but it's also crucial that they have some kind of content ready to support what they're going to be doing. And I mentioned, you know, earlier in the conversation, when a client comes to us, usually we'll sort of demonstrate something that's close, and they'll go, okay, that's not quite it,
but I need this. And that is the beginning of the engagement process for us.
So what we're going to be doing now over the course of the next month is actually continuously
releasing different types of content. You'll see on our website, we've got microscopy and
security imagery and all these things. So over the next months, we're going to be starting to release
those as largely open-source tools that people can use to build on top of within our platform. And what we're really trying to understand is how people need to get those initial environments. Right? What sorts of code do they need, what delivery do they need, what training, tools, et cetera, in order to be able to move from that into sort of successful
use of synthetic data. So that's a lot of what the next few months is going to be about for us.
Okay. So to extrapolate, let's say kind of taking the same approach that you currently have with your first open source application, then extending that to other domains as well.
Exactly, exactly.
So we'll be releasing code across a wide variety of domains, and with partners that represent the best-known sources of content
and simulation tools.
And so you should expect to be seeing quite a bit from us
in terms of integrations with well-known companies
in order to help people kind of use those tools
in a synthetic data context.
Okay, sounds like you have lots of work ahead of you, but also a well-thought-out plan.
Synthetic data is early.
I certainly think we can do a lot to help people in this industry. And I think this is one of the most important problems facing AI, if not the most important problem facing AI.
So I'm excited to be able to help out.