Orchestrate all the Things - The state of AI in 2020: democratization, industrialization, and the way to artificial general intelligence. Featuring AI investors Nathan Benaich and Ian Hogarth
Episode Date: October 1, 2020. From fit-for-purpose development to pie-in-the-sky research, this is what AI looks like in 2020. A discussion on all things AI with the authors of the State of AI 2020 Report, Nathan Benaich and Ian Hogarth. Benaich and Hogarth work at the intersection of industry, research, investment and policy, with extensive backgrounds and various currently held positions such as venture capital investor and researcher. This gives them a unique vantage point on all things AI. Their report, now published for the third year in a row, is their way of sharing their insights with the AI ecosystem at large. Article published on ZDNet.
Transcript
Welcome to the Orchestrate All the Things podcast.
I'm George Anadiotis and we'll be connecting the dots together.
Today's episode features a discussion on all things AI
with the authors of the State of AI 2020 report, Nathan Benaich and Ian Hogarth.
Benaich and Hogarth work at the intersection of industry, research, investment and policy
with extensive
background and various currently held positions such as venture capital
investor and researcher. This gives them a unique vantage point on all things AI.
Their report, which is published for the third year in a row, is their way of
sharing their insights with the AI ecosystem at large. I hope you will enjoy the podcast.
If you like my work, you can follow Linked Data Orchestration on Twitter, LinkedIn, and Facebook.
Well, it would be nice to say a few words about you and why you do this.
And I would also ask you to talk a little bit about the meta of the report.
The reason I'm saying that is because, honestly, it was both interesting
and a little bit of a pain to go through,
because it grew in size considerably from last year.
So I'm wondering whether you expanded the team,
which I think you did,
and how long it could possibly have taken you to compile this.
Yeah, well, I mean, there were definitely points
when we were making slides where we were wondering
why we're doing it.
But I think that, you know, it goes back to, you know,
conversations that Nathan and I have had.
And, you know, obviously Nathan and I are both investors,
Nathan through Air Street Capital, his venture capital firm,
and myself as an angel investor.
And one thing I've always loved about interacting with Nathan
is that he takes a very rigorous, almost academic approach
to following what is happening.
And I found that we were talking as much about research papers
and sort of interesting policy documents as we were about startups.
And I think both of us felt that we were in quite a privileged position
because we were interacting with some of the world's leading AI researchers,
investing in some of the sort of the most interesting early stage technology companies.
And, you know, in my case, doing quite a lot of work on the policy side
and interacting with some people in sort of government around AI.
And so I think we kind of felt that we had this unusual vantage point that spans lots of different ways of looking at the field. When we would talk to researchers, they would not be that
familiar with what was happening in, say, startup land, and when you talk to government
people, they're not necessarily that familiar with what's happening in, say, the research world.
And we believe that the more we can join those dots up, the better
things will go for the field, and the more likely we are to make responsible
choices that span research, industrialization, and policy.
So it's really a sort of, I guess,
a piece of open source work that we put out there.
We're not researchers ourselves.
And so it's sort of something that we can contribute back to the wider community of machine learning
to sort of move things forward
and hopefully sort of steer discussion and collaboration
across these different areas.
Thank you.
Yeah, I think that sums it up quite nicely.
And then maybe to just answer your question on who makes it
and whether we have a team.
Yeah, we have discussed like growing the team,
but it's essentially the handcrafted work of Ian and myself.
We take probably, I guess, two months or so to put it together.
But it's the result of quite detailed and systematic exposure to all these different areas over the last 12 months, by virtue of our
investing work, but also by virtue of the newsletter that I write, the Guide to AI. And if anybody
listening to your podcast is looking for a paid internship, then perhaps we can talk.
Well, I'm sure there may be people interested in that, and I can appreciate the amount of work that goes into it.
Well, you know, there's nearly 180 slides,
so just, you know, casually browsing through them
would take you something like three hours, I guess.
That's how long it took me, at least.
Thanks for taking the time.
Yeah.
So it's an interesting coincidence, you know, that your report is coming out now, because as you probably
know, Gartner's latest Hype Cycle for AI was just released yesterday as well.
It's not nearly as extensive as your work, but there are two interesting main points in it which I would
like to get your opinions on, because I find them interesting because I basically disagree
with both.
So the main points that they put forward is what they call the democratization of AI and
the industrialization of AI.
So my disagreement in terms of democratization,
and I think it's a point that you also touched on in your report,
is, well, define democratization, basically,
because it seems to me that it's a very expensive and elite sport
to be doing AI at the top level, at least.
And this is something that I see clearly outlined in your report as well.
And as far as industrialization goes, well, yes, things are improving, there's more streamlining in terms of tools and availability, but in my opinion getting a model
from the lab to production is still more craft than science, let's say.
So what do you think on those?
Yeah, I mean, I think those are also two important topics that we pull
out in the report. So with the notion of democratization, there are a couple of
angles we can take on it.
The first one that we think is super interesting is this notion of AI research being open or closed.
And what does that actually mean?
On slide 11, we look at data that describes arXiv publications. It essentially asks, of the publications that are on arXiv, how many or what proportion of them include the code base that's been developed and used to produce the results that
are published in those papers. And what you find is that, you know, for the last two, three years
or so, that number has been very, very low. So it's only around 15% of papers that actually publish their code.
So to the question of democratization, you know, one of the crucial components of democratization is reproducibility and openness and the fact that, you know, tools are modular and can be
built on one another and exchanged freely. So we think that this topic of
research being less open than you'd think
deserves a bit of discussion and consideration for how it will impact the
field moving forward.
Just building on that, I think it's worth saying that
historically machine learning has been incredibly open: a huge amount of open data sets,
open use of arXiv, et cetera. And I think that any
company or large research organization coming into the field has had to
respect that to some degree, right? So there's only so closed it can get without a major
backlash. But at the same time, I think this data would clearly
indicate that the field is certainly finding ways to be closed when it's convenient.
Yeah. But having said that, elsewhere in the report
we look at perhaps
a subject in open source that relates to
democratization but also industrialization
and that's the topic
of what are the popular GitHub repos
these days
and what we find is that
in Q2 this year some of the more popular
and fastest growing GitHub repos
are machine learning based.
But within that category,
they generally relate to MLOps,
like machine learning operations.
And your audience will probably know this well,
but essentially DevOps applied to machine learning,
which is, when you do have your model in production,
how do you make sure that over time it's still creating value?
And if that value suddenly gets destroyed, for example through changes
in how your users are engaging with your product
and what kind of data they're creating,
what can you do as a developer to fix that
and interpret when issues are actually being flagged?
So, at least by virtue of the communities in the engineering and open source world
that we live in and interact with
and the startups that we invest in,
it's certainly looking like investment
and interest in MLOps, as it relates to the progress
of machine learning outside of R&D or experiments
and into real-world production for real-time systems,
is definitely growing.
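To make the monitoring idea above concrete: a core MLOps task is checking that the data a deployed model sees still looks like the data it was trained on. Below is a minimal sketch of such a drift check; the feature values, sample sizes and alert threshold are hypothetical, and real MLOps tooling tracks many features and metrics and wires alerts into dashboards and retraining pipelines.

```python
# Minimal sketch of a production drift check of the kind MLOps tooling automates.
# Hypothetical feature data and threshold; real systems track many features,
# rolling windows, and model-quality metrics.
import numpy as np
from scipy.stats import ks_2samp

rng = np.random.default_rng(0)
training_feature = rng.normal(loc=0.0, scale=1.0, size=10_000)  # what the model saw at training time
live_feature = rng.normal(loc=0.4, scale=1.2, size=2_000)       # what users are generating now

statistic, p_value = ks_2samp(training_feature, live_feature)   # two-sample Kolmogorov-Smirnov test

ALERT_P_VALUE = 0.01
if p_value < ALERT_P_VALUE:
    print(f"Drift suspected (KS={statistic:.3f}, p={p_value:.2e}): investigate or retrain.")
else:
    print("Live data still looks like the training distribution.")
```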
I think it's probably worth saying that in general,
I would say that we see brilliant startup founders
probably finding it easier to get started
than they would have done a few years ago
in terms of the tooling available to them and the maturity of the infrastructure.
But if you wanted to start a sort of AGI research company today,
the bar is probably higher in terms of the compute requirements,
particularly if you sort of believe in the scaling hypothesis and the kind of the idea that, you know, taking approaches like GPT-3 and continuing to scale them up
is going to be more and more expensive and less and less accessible to sort of newer
entrants without large amounts of capital.
Yeah, that was a point in your report which I found particularly interesting
and one that I think many people, at least people who are not in the inner circle,
let's say, of how AI works, don't fully grasp: the amount of resources,
compute and data, and also people's energy, that goes into training those models, and that makes them in reality not
really accessible to pretty much anyone but a very few select organizations. And I think
one of the other things that piqued my interest in your report was, well, a kind of advice, let's say, or path to success, on how those models can be interesting and useful to others
beyond the organizations that produce them.
And you suggested, if I'm not mistaken, that, well, perhaps an idea would be
to take those pre-trained models and actually fine-tune them to specific domains.
Have you seen cases where people have done that with success?
How easy do you think that is to do?
Yeah.
So I think one of the interesting examples of this idea of, you know,
taking a large or pre-trained model in one field and moving it to
another field to bootstrap performance to a higher level than if you were not to do that,
and one that also plays into one of the dominant themes in the report, is the slide
where we talk about confocal microscopy and basically like using imaging to understand
biology and treating it as a similar kind of task to ImageNet,
where, you know, so much of the improvements in network architecture and network performance
and computer vision was, as we all know, driven by carefully curated data sets from which
models can learn something useful.
And, you know, as biology and healthcare has become an increasingly digital domain with lots of imaging, whether that relates to healthcare conditions or what cells look like when they're diseased or normal, compiling data sets that describe that kind of biology and then using transfer learning from ImageNet into those domains has yielded much better performance than starting from scratch.
So that's the example I would probably highlight.
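The ImageNet-to-microscopy idea Nathan describes is, mechanically, standard transfer learning: start from a network pre-trained on ImageNet, replace its classification head, and fine-tune on the new imaging domain. The sketch below shows the general pattern in PyTorch; the dataset folder, class count and training details are hypothetical and not taken from the report.

```python
# Sketch of ImageNet -> microscopy transfer learning; dataset path and class count
# are hypothetical, and real pipelines add augmentation, validation, scheduling, etc.
import torch
import torch.nn as nn
from torchvision import datasets, models, transforms

NUM_CLASSES = 2  # e.g. diseased vs normal cells

# Start from weights learned on ImageNet and swap in a new classification head.
model = models.resnet18(pretrained=True)
model.fc = nn.Linear(model.fc.in_features, NUM_CLASSES)

# Optionally freeze the pretrained backbone and only train the new head at first.
for name, param in model.named_parameters():
    if not name.startswith("fc"):
        param.requires_grad = False

preprocess = transforms.Compose([
    transforms.Resize(224),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])
dataset = datasets.ImageFolder("microscopy_images/", transform=preprocess)  # hypothetical folder
loader = torch.utils.data.DataLoader(dataset, batch_size=32, shuffle=True)

optimizer = torch.optim.Adam(model.fc.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

model.train()
for images, labels in loader:
    optimizer.zero_grad()
    loss = loss_fn(model(images), labels)
    loss.backward()
    optimizer.step()
```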
And then kind of related to that outside of computer vision,
but more in language models,
we're seeing several examples of, for example, language models being useful
in protein engineering or in understanding DNA
and essentially treating the sequence of amino acids that encode proteins, or DNA, as just another form of
language,
a form of strings that language models can interpret just in the same way
they can interpret, you know, characters that spell out words.
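A toy illustration of that "amino acids as characters" idea: tokenize a protein sequence residue by residue and train an ordinary next-token model on it, exactly as you would with text. The tiny model below is a stand-in for whatever sequence architecture you prefer and does not correspond to any specific published protein model; the example sequence is arbitrary.

```python
# Toy illustration of treating a protein as "just another language":
# each amino acid is a token, exactly like a character in text.
import torch
import torch.nn as nn

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"                  # 20 standard residues
vocab = {aa: i for i, aa in enumerate(AMINO_ACIDS)}

def encode(sequence: str) -> torch.Tensor:
    return torch.tensor([vocab[aa] for aa in sequence])

class TinyProteinLM(nn.Module):
    def __init__(self, vocab_size=len(AMINO_ACIDS), dim=64):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, dim)
        self.rnn = nn.LSTM(dim, dim, batch_first=True)
        self.head = nn.Linear(dim, vocab_size)         # predict the next residue

    def forward(self, tokens):
        hidden, _ = self.rnn(self.embed(tokens))
        return self.head(hidden)

model = TinyProteinLM()
seq = encode("MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQ").unsqueeze(0)  # arbitrary example sequence
logits = model(seq[:, :-1])                            # predict residue t+1 from residues <= t
loss = nn.functional.cross_entropy(logits.reshape(-1, logits.size(-1)), seq[:, 1:].reshape(-1))
print(loss.item())
```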
One thing we don't really flag in the report, George, but I think is kind of an additional
subtlety, is that obviously you can highlight the potential costs of training a
model, but the other thing that organizations with very large amounts of capital can do
is run lots of experiments, and iterate on these kinds of large experiments, without
having to worry too much about the cost of each training run.
So there's a degree to which you can be more experimental with these
large models if you have more capital.
Obviously, it also does slightly bias you towards these almost brute-force
approaches of just applying scale and more capital and more
data to the problem. But I think that if you buy the scaling hypothesis, then that's
a fertile area for
progress that shouldn't be dismissed just because it doesn't have deep intellectual insights at the heart of it.
There's an interesting example that you mentioned in the report of a model that theoretically shouldn't be able to compete against bigger models
with more parameters and more training.
And I'm referring to the model used by a company called PolyAI, and you showed how
this actually performed better compared to models with more
parameters and more resources behind them, such as BERT, for example. So I was wondering if you have
any insights as to why that is. What is it that they did so well
that enabled them to compete with, and actually beat, those models?
Yeah, I think the main point here is research engineering in big technology companies has
been increasingly about publishing more general purpose models.
This idea that one model can rule them all, or conduct many different tasks.
And that's what research teams are fundamentally excited and interested in.
And that's kind of contrasted with comparatively smaller companies that are very domain focused,
largely kind of tackling use cases that are at the periphery of large technology companies.
And in those cases, like, yes, research is important, but actually in those cases to get models to work in production,
you actually probably have to do more engineering than you have to do research.
And almost by definition, like engineering is not interesting to the majority of researchers.
And so in this sort of slide, this work that we described from PolyAI,
they're a dialogue company dealing
with conversations in customer contact centers.
And we're essentially showing that the task that they have
of detecting intent and understanding
what somebody on the phone is trying to accomplish
by calling,
for example, a restaurant is solved in a much better way by treating this problem as a,
what they call a contextual re-ranking problem, which is given a kind of menu of potential
options that a caller is trying to possibly accomplish based on our understanding of that domain, we can design a more appropriate model
that can better learn customer intent from data
than just trying to take this kind of general purpose model,
in this case BERT,
that can do okay on various conversational applications,
but just doesn't have kind of like the engineering guardrails
or the engineering nuances that
can make it robust in the real-world domain.
And the kind of interesting TL;DR from this is that the model they published is
significantly smaller by parameter count than BERT.
And while that's less headline-catching in the research field, it is actually more relevant for
production applications, because it can learn from, and be effective with, less data,
and it can also be trained with a lower computational footprint, which means that it's
more accessible to companies and less costly to scale. So I think this is an interesting
example of the contrasting priorities between technology companies that focus on cutting-edge general-purpose research
and startups that care more about, when machine learning hits the road,
how do you make it robust and useful?
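A rough sketch of the re-ranking framing Nathan describes: instead of asking an open-ended model what the caller wants, you score the caller's utterance against a hand-designed menu of domain intents and pick the best match. Here a trivial TF-IDF encoder stands in for the learned sentence encoder a company like PolyAI would actually use, and the intent menu and utterance are made up for illustration.

```python
# Rough sketch of intent detection as contextual re-ranking: score a caller's
# utterance against a fixed menu of domain intents and pick the best.
# TF-IDF is a crude stand-in for a learned sentence encoder.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

intent_menu = {
    "book_table":     "I would like to book a table for dinner",
    "opening_hours":  "What time do you open and close",
    "cancel_booking": "I need to cancel my reservation",
    "dietary_info":   "Do you have vegetarian or gluten free options",
}

caller_utterance = "hi, could I reserve a table for four on Friday evening?"

vectorizer = TfidfVectorizer().fit(list(intent_menu.values()) + [caller_utterance])
intent_vectors = vectorizer.transform(intent_menu.values())
utterance_vector = vectorizer.transform([caller_utterance])

scores = cosine_similarity(utterance_vector, intent_vectors)[0]
ranked = sorted(zip(intent_menu.keys(), scores), key=lambda kv: kv[1], reverse=True)
print(ranked)  # the top-ranked intent then drives the dialogue
```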
Yeah, that's interesting. In a way, this kind of brings us to the topic of bias, let's say,
in whether you should use it and in what way, and to the broader topic of, well, language
models and how they're built and how they work. And to the best of my knowledge, the most prominent critic of the approach taken by
models such as GPT-3 and its predecessors is Gary Marcus, who has made a point of
highlighting their deficiencies. I'm sure you're aware of this critique, and it basically
comes down to what you consider bias, whether
it's good, how you should insert it in the model, and all of those things. So I wonder
what your take is on that.
Yeah, so my interpretation of his critique is really
that GPT-3 is an amazing language model that can take a prompt and output a sequence of text
that is legible and comprehensible
and in many cases relevant to what the prompt was.
But there are numerous examples where it veers off course,
either in a way that expresses bias
or that is just gibberish and not relevant.
And elsewhere in the report, we show a new kind of benchmark published by Berkeley, which
exposes some of these issues across various kinds of academic tasks.
I think the interesting kind of extension towards what GPT-3 could do, and this kind of
relates to our discussion around PolyAI,
is this aspect of injecting control,
like some kind of toggles on the model
that allow it to have some guardrails at least,
or at least kind of tune what kind of outputs
it can create from a given input.
And there are different ways
that you might be able to do this.
And I think in the past we talked about knowledge bases
and knowledge graphs,
or perhaps even some kind of learned intent variable
that can be used to inject this kind of control
over this more general purpose sequence generator.
So, viewed through that angle,
I think his concern is certainly valid,
to some degree, and I think it points to the kind of next generation of what generative models
like GPT-3 could move towards, if the goal is to have them be useful in production
environments.
Yeah, I think Gary Marcus is almost a professional critic of
organizations like DeepMind and OpenAI, and I think it's very healthy to have those critical
perspectives when there is such a reckless hype cycle around some of this stuff.
But at the same time, I do think that
OpenAI has one of the more thoughtful approaches
to policy around this stuff.
They seem to take
their responsibilities seriously;
for example, the approach they took
to releasing various models in the past, and the work they've done on malicious uses of AI.
And I think in general, Nathan and I hold their policy team, and Jack Clark in particular, in very high regard.
So I think OpenAI is trying. What's maybe concerning about their
approach and the scaling hypothesis is that it feels like they're saying, if you throw
more data and more compute and build larger and larger models, at some point you move
beyond this kind of sequence prediction into some kind of emergent intelligence. And
obviously some people don't really agree with that theory of how we achieve AGI.
But let's say they're right and the critics are wrong:
then we might have a very smart but not very well-adjusted AGI
on our hands, as evidenced by some of these early instances
of bias as you scale these models.
So I think it's incumbent on organizations like OpenAI,
if they are going to pursue this approach,
to tell us all how they're going to do it safely,
because it's not obvious yet from their research agenda, or obvious to me, how you marry
AI safety with this kind of 'throw more data and compute at the problem and AGI will emerge'
approach.
This points again towards another interesting part of your report, where you mention that a number of practitioners feel that progress in mature areas of machine learning is stagnant. The question is whether there's a brute-force approach whereby you can just
throw more compute and more data
at the problem and at some point
you'll just solve it this way,
or whether you need to add something else
to the mix, which is, I think, what people
like Gary Marcus are advocating
for. So
what's your stance
on this dichotomy?
I'll start there.
I mean, I think that causality is
arguably at the heart of much of human progress, right?
From an epistemological perspective,
causal reasoning has given us the
scientific method; it's at the heart, I think, of our best world models.
And so I'm personally incredibly excited about the work that people like Judea Pearl have pioneered
here, and I'm excited about us figuring out how to bring more causality into machine learning.
My sense is that it feels like the biggest potential disruption to the general trend
of larger and larger correlation-driven models, because I
think if you can crack causality, you can start to build a pretty powerful
scaffolding of knowledge upon knowledge, and have machines start to really contribute to our own
knowledge bases and scientific processes. So I think it's very exciting, and
I think there's a reason that some of the smartest people in machine learning are spending
their weekends and evenings working on it. But I think it's still in its
infancy as an area of attention for the
commercial community in machine learning.
I think we really only found one, you know, one or two examples of it being used kind of in the wild,
one by Faculty, a London-based machine learning company, and one by Benevolent AI in our report this year.
But I think, you know, we're excited to see more of this.
And I do think it could be a kind of a pretty powerful dislocation
of kind of business as usual in machine learning
and sort of correlation-based curve-fitting approaches.
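To illustrate the correlation-versus-causation distinction Ian is pointing at, here is a toy structural causal model in plain NumPy, in the spirit of Judea Pearl's do-operator: a confounder inflates the observed correlation between a treatment and an outcome, and only intervening recovers the true effect. The numbers are purely illustrative and not from the report.

```python
# Toy structural causal model: a confounder Z drives both treatment X and outcome Y,
# so the observed X-Y correlation overstates X's causal effect. Intervening (do(X=x))
# breaks the Z -> X link and recovers the true effect. Purely illustrative numbers.
import numpy as np

rng = np.random.default_rng(0)
N = 200_000

def simulate(do_x=None):
    z = rng.normal(size=N)                        # confounder
    x = z + rng.normal(size=N) if do_x is None else np.full(N, do_x)
    y = 2.0 * x + 3.0 * z + rng.normal(size=N)    # true causal effect of X on Y is 2.0
    return x, y

# Observational: naive regression slope of Y on X picks up the confounding via Z.
x_obs, y_obs = simulate()
naive_slope = np.cov(x_obs, y_obs)[0, 1] / np.var(x_obs)
print(f"observational slope ~ {naive_slope:.2f}")    # ~3.5, not 2.0

# Interventional: set X by fiat and compare outcomes, which recovers the causal effect.
_, y_do0 = simulate(do_x=0.0)
_, y_do1 = simulate(do_x=1.0)
print(f"effect of do(X=1) vs do(X=0) ~ {y_do1.mean() - y_do0.mean():.2f}")  # ~2.0
```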
Okay. Thank you.
Nathan, do you want to add something to that?
Yeah, I'll just briefly opine on the point that practitioners feel that progress is stagnant.
I think
it generally relates to
the feeling of researchers that
there's not much that lies beyond deep
learning and
these generative models,
and, as we've been discussing,
this kind of hyperscale mentality.
People just, I think,
anecdotally feel a bit sick
of the 'throw more compute at the problem
and it's going to fix everything under the sun' approach.
So I think they, especially in research,
they're looking for something
that's a bit more like intellectually inspiring
or intellectually novel.
But I think outside of research,
I would say that the application area
is far from stagnant.
And actually even in domains
that haven't seen much impact of A, computer science
and B, machine learning,
such as healthcare and biology, we still highlight a couple of examples, actually, of startups
that are at the cutting edge of R&D put into production for problems in biology.
I think a couple that we would highlight are, for example, one problem of drug screening:
figuring out, okay, if I have a software product that can generate
lots of potential drugs that could work against the disease protein that I'm interested
in targeting, how do I know, out of thousands or probably hundreds of thousands of possible
drugs, which one will work? And assuming I can figure out which one might work, how do I know if
I can actually make it? And so there's a couple of startups working on this, and here we
profile some work from PostEra, based in London and the US. They use machine learning
and some of these more modern transformer architectures
to teach chemists, or suggest to chemists, what route, what kind of recipe mix, one would use to
make a molecule of interest. And this kind of work is (a) state-of-the-art and (b)
really interesting and an important piece of the overall problem of drug discovery.
And then some other work that we profile is from InVivo AI in Canada.
Here, this startup is using graph neural networks
to learn representations of chemical molecules
and say, okay, can we predict
whether this molecule is soluble,
whether it binds to a specific target, whether it's toxic,
and do all of that from just the chemical Lewis diagram
that we learn in Chemistry 101 or organic chemistry.
So I think this is like state-of-the-art as well
and is actually adding a lot of potential value to industry.
And there's some examples of progress that's far from stagnant
in kind of more emergent industry domains.
That's great.
You touched on two topics that I actually wanted to follow up with.
So let's start with the first one, graph neural networks.
I've seen very big interest in those and I can understand the
reasons. You also touch upon them in the report. It's a kind of change of paradigm in
what data you can process. With typical neural networks you can process two-dimensional data,
while with graph neural networks you can process three-dimensional data, or, well, anything beyond two dimensions. So that's quite a breakthrough, I would say, and it
lets people take advantage of connections, basically, and extra information in their data.
I've seen lots of interesting work lately on that, and lots of interest as well. I'm sure you keep track of that.
So what would you pick as the highlights in this subdomain?
Yeah, this relates to some discussions I have
with some friends who are from the pure machine learning domain
and see a lot of excitement in biology
and just want to understand how to think about problems in biology from a machine learning standpoint.
And I think it kind of comes down to one topic, which is what is the right representation
of biological data that actually expresses all the complexity and the physics and chemistry
and sort of like living nuances of a biological system into like a compact,
easy to describe mathematical representation that a machine learning model can do something with.
And, you know, as you described, a lot of the existing models will sort of treat problems as vectors or 2D representations. And it's sometimes hard to conceptualize biological systems
as just like a matrix array or a vector or something like that.
And so it could very well be that we're just not, you know,
exploiting all of the implicit information that kind of resides
in a biological system in the form of a vector.
So I think that's why the graphical representations are at least an interesting kind of next step,
because they just feel so intuitive as a tool to represent something that is intuitively connected,
as a chemical molecule would be, because it's made of connected atoms and bonds and things like that. So we've certainly seen examples in molecule property prediction and chemical synthesis planning,
but also in trying to identify novel small molecules by essentially treating small molecules as a combination of small Lego-like building blocks. And so you
can use advances in DNA sequencing, where you can attach a little tag to a small
Lego building block of a chemical, then you can mix all of these chemicals into a tube
with your target, and then you can essentially see what building blocks have assembled together
and bind to your target of interest.
And then that's your candidate small molecule that seems to work.
And then you can use these graph neural networks to try and learn what
commonalities these building blocks have that make them really good binders of
your target of interest.
And this is some work that's been published with some startups in Boston and Google Research
that's essentially adding this machine learning layer to a very kind of standard
and well-understood chemical screening approach
and generating several-fold improvement on the baseline.
So I think that's also super exciting.
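For readers unfamiliar with graph neural networks, here is a bare-bones sketch of the idea Nathan describes: atoms are nodes, bonds are edges, message-passing layers let atoms exchange information with their neighbours, and a graph-level readout predicts a property such as solubility or binding. The featurization and the tiny molecule below are invented; real work would use a chemistry toolkit such as RDKit and a dedicated GNN library.

```python
# Bare-bones message passing over a molecular graph: atoms are nodes, bonds are edges,
# and a graph-level readout predicts one property per molecule. Illustrative only.
import torch
import torch.nn as nn

class MessagePassingLayer(nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.update = nn.Linear(2 * dim, dim)

    def forward(self, node_features, adjacency):
        # Each node sums its neighbours' features (the "messages") ...
        messages = adjacency @ node_features
        # ... and updates its own representation from (self, messages).
        return torch.relu(self.update(torch.cat([node_features, messages], dim=-1)))

class MoleculePropertyModel(nn.Module):
    def __init__(self, atom_feature_dim=8, dim=32, layers=3):
        super().__init__()
        self.embed = nn.Linear(atom_feature_dim, dim)
        self.mp_layers = nn.ModuleList([MessagePassingLayer(dim) for _ in range(layers)])
        self.readout = nn.Linear(dim, 1)  # e.g. predicted solubility

    def forward(self, atom_features, adjacency):
        h = self.embed(atom_features)
        for layer in self.mp_layers:
            h = layer(h, adjacency)
        return self.readout(h.mean(dim=0))  # average over atoms -> one value per molecule

# A made-up 4-atom molecule: random atom features and a symmetric bond matrix.
atoms = torch.randn(4, 8)
bonds = torch.tensor([[0, 1, 0, 0],
                      [1, 0, 1, 1],
                      [0, 1, 0, 0],
                      [0, 1, 0, 0]], dtype=torch.float32)
print(MoleculePropertyModel()(atoms, bonds))
```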
I saw a very interesting analysis by Chaitanya Joshi, who was essentially arguing that
graph neural networks and the transformer architecture and attention-based methods
have the same underlying logic, where you can
think of sentences as essentially fully connected word graphs. And I think that
one thing we noticed a lot during the report this year is the way that the transformer architecture
is creeping into lots of unusual use cases we wouldn't have predicted it to be used for.
And then secondly, that scaling it up is obviously having more impact
in terms of the performance of these larger language models.
So I think that maybe the meta point around both graph neural networks
and these attention-based
methods in general is that they seem to represent a general enough approach that
there's going to be progress just by continuing to hammer very hard on that
nail for the next few years. And one of the ways I'm challenging myself is just to take a minute
and assume that actually we might just see a lot more progress
just by doing the same thing with more aggression for a bit.
And so I would assume that some of the gains that are being found in these GNNs sort of cross-pollinate with the work that's
happening with language models and transformers.
And that approach continues to be a very fertile area for sort of super general kind of AGI-esque
research.
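Joshi's point that a sentence is a fully connected word graph can be made literal in a few lines: one self-attention step computes a dense, learned "adjacency matrix" over all tokens and then aggregates over it, which is message passing on a complete graph. The snippet below is the generic attention formula with toy dimensions and random weights, not any particular model.

```python
# One self-attention step, read as message passing over a complete word graph.
import torch
import torch.nn.functional as F

num_tokens, dim = 5, 16                      # a 5-word "sentence"
x = torch.randn(num_tokens, dim)             # token embeddings = node features

W_q, W_k, W_v = (torch.randn(dim, dim) for _ in range(3))
q, k, v = x @ W_q, x @ W_k, x @ W_v

# Every token scores every other token: a dense, learned "adjacency matrix".
attention = F.softmax(q @ k.T / dim ** 0.5, dim=-1)   # shape (5, 5), rows sum to 1
print(attention)

# Aggregation: each token's new representation is a weighted sum over all tokens,
# i.e. message passing where the graph is fully connected.
updated = attention @ v
print(updated.shape)                         # (5, 16)
```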
Okay, thanks. And the other topic was AI in biology and healthcare. It looks kind of obvious, but you may confirm it or
not, that probably the advent of COVID had something to do with the activity there, even though it was
possibly going on already. But yeah, if you would like to say a few words about what you have seen in that domain.
Yeah.
So I think the main kind of applications in AI for biology are one in like drug discovery and then the other one in kind of clinical medicine.
And so in clinical medicine,
I think there's a couple of really exciting developments
that I think are very powerful signals for where the field's going and the
state of its maturity. The main one is actually that, for the first time ever,
the US Medicare and Medicaid system, which pays for procedures in the US, has actually approved
a medical imaging product for stroke that's created by Viz.ai. And so despite a lot of
FDA approvals for deep learning-based medical imaging, whether that's for a stroke or mammographies or broken bones,
this is the only one that so far has actually gotten reimbursement.
And many in the field feel that reimbursement is the critical moment because that's the
economic incentive for doctors to prescribe because they get paid back.
And so we think that's a major event.
Still a lot of work to be done, of course, to scale this
and to make sure that more patients are eligible for that reimbursement,
but still major nonetheless.
The second area is in drug discovery.
And there, what's worth highlighting is that a business that was originally founded
in Scotland, and largely under the radar of most technology press hype, has been the first
to develop a drug through machine learning methods that has now entered phase one clinical studies
in Japan for the treatment of OCD. And that business is Exscientia.
And that same company has also managed to confirm a licensing deal for assets that were
discovered through a quarter-of-a-billion-dollar project with Sanofi that started just
two years ago and has now essentially been executed, which kind of proves out that large
pharma companies are actually getting value from working with, you know, AI-first drug discovery companies.
So we kind of think these are the two major industrial moments, but, you know,
beneath the surface, there's tons of activity across those segments overall.
Yeah, I always go back to Carlota Perez's
framework for thinking about how financial capital interacts with
technological progress, and I think we've certainly started to have the
speculation phase, right, where lots and lots of capital is flowing into this
intersection of machine learning and biology.
There are going to be some really amazing companies that come out of it, and I think
we will start to see a real deployment phase kick in. And then also,
I'm sure there's a Theranos hidden in there somewhere as well, something that's
going to be revealed to be a total fraud. But we feel pretty excited
about the potential here. And I think that in particular,
we think that companies like LabGenius, in which
we're both an investor, and XINC are examples
of companies that are likely to be doing some really quite profound
work over the coming years.
Okay, I think we're almost out of time.
So maybe one last question and we can wrap up.
And I guess it has to be AI ethics.
I mean, we touched upon the use of AI in biology,
and that's actually not the only domain
you reference in the report;
you also speak about COVID,
for example, and how AI is used in image recognition systems and all those things.
So you put forth a number of interesting observations and suggestions in the report, but the main
question for me would be, well, how can we make them real? I
mean, how can these things even be enforced? Sorry, what I mean is: you
make some interesting observations in terms of AI ethics,
a set of guidelines or rules that should be observed. The question is,
what
is a real way to
enforce whatever guidelines
we decide are appropriate?
I think we are
at the start of some quite interesting approaches
to regulation.
Nathan,
do you want to jump to the slide about the UK use of facial recognition?
Yeah. So this is an interesting example where a UK citizen basically claimed
his human rights were breached whilst Christmas shopping, and the ruling was kind of interesting in that he was
ruled against, but there was also this duty placed on the
police to make sure that discrimination was proactively eliminated from the technology being used.
So legally, the police are now on the hook for getting
rid of bias before they can use this software. And so it creates
a much higher bar to deploying this software, and it creates almost
a legal opportunity for
anyone who experiences bias at the hands of an algorithm to have a
foundation for suing the government or a private actor deploying this technology.
So I think this is one interesting approach, where you essentially say
that effectively the software has to be demonstrated
to make
extreme efforts to remove bias and be ethical in that regard.
Obviously there are many, many aspects of ethics, bias being one,
and I think that places a much greater burden on the entity deploying the software. The other
approach that I think is interesting is some degree of API-driven auditability.
I think it was in Washington State where they have made it so that
any facial recognition system has to have an API that would
allow an independent third party to assess it for performance and bias
across different categories of identity. So I think there's a couple of
interesting approaches emerging where law enforcement
are figuring out how to police the use of this,
and how to, sorry, not law enforcement,
how regulators are figuring out how to incentivise ethical behaviour,
either by introducing third parties in a novel way,
like this API-driven approach,
or just by saying that the use of this is held to standards that then open
the users up to lawsuits if they don't meet those standards.
So I think we're starting to see
some sort of regulatory innovation in this area.
And I think that now that this is so prime time,
you're going to see even more emerge on the regulatory side, to try to restrain the use of these algorithms
to ethical approaches.
Okay. Well, thanks.
I guess we're
a bit over time now, so
if you don't have anything
to add to that, Nathan,
we're probably going to have to
wrap up here. Yeah, sounds good.
Thanks a lot for
some really good questions and some cool
discussion. And if anyone's listening
and wants an internship, just get in touch.
I'll be sure to highlight that, at least in the podcast.
I'm not promising anything for the write-up,
but for the podcast, you get that.
I hope you enjoyed the podcast.
If you like my work, you can follow Linked Data Orchestration
on Twitter, LinkedIn, and Facebook.