The Changelog: Software Development, Open Source - Open source meets climate science (Interview)
Episode Date: January 31, 2020
Anders Damsgaard is a climate science researcher working on cryosphere processes at the Department of Geophysics at Stanford University. He joined the show to talk with us about the intersection of open source and climate science. Specifically, we discuss a set of shell tools he created called The Scholarref Tools which allow you to perform most of the tasks required to gather the references needed during the writing phase of an academic paper. We also discuss climate science, physics, self hosting Git, and why Anders isn't present on any "social" networks.
Transcript
Bandwidth for ChangeLog is provided by Fastly.
Learn more at Fastly.com.
We move fast and fix things here at ChangeLog because of Rollbar.
Check them out at Rollbar.com.
And we're hosted on Linode cloud servers.
Head to Linode.com slash ChangeLog.
Do not underestimate the power of the independent open cloud for developers.
Yes, I'm talking about Linode.
Linode is our cloud of choice
and it's the home of changelog.com.
What we love most about Linode
is their independence
and their commitment to open cloud.
Open cloud means being unencumbered
by outside investment
and maximizing value for the community,
not shareholders.
And that's exactly what Linode represents.
No vendor lock-in, open at every layer. If you want to learn more, head to linode.com slash open. Again, linode.com slash open.
What's up? Welcome back, everyone. You are listening to the ChangeLog, a podcast
featuring the hackers, the leaders, and the innovators in the world of software development.
I'm Adam Stachowiak, Editor-in-Chief here at ChangeLog.
Today, we're talking with Anders Damsgaard, a researcher at the Department of Geophysics at Stanford University.
He's working on cryosphere processes.
Our conversation with Anders focused on the intersection of open source and climate science, specifically a set
of shell tools created by Anders called Scholarref that allow you to perform most of the tasks
required when gathering references you need during the writing phase of an academic paper.
We also talk about climate science, physics, and why Anders isn't present on any social networks.
So shout out to Brian Zellup for requesting this episode. Definitely wouldn't have come across
Anders or his work had it not been for Brian submitting this show request on changelog.com
slash request. Thanks, Brian. Hope you enjoy this conversation.
We have Anders here, a researcher who's working on cryosphere processes such as glacier sliding, sediment mechanics, and sea ice deformation, many of which are words I don't even understand, let alone the research.
So, Anders, thanks for joining us.
Thank you very much.
[...] you scratching your researcher
itch and dealing with pesky internet websites and difficulty of gathering academic papers.
Let's start off with your academics in general. Tell us about what you do,
what cryosphere processes are. Help me out here.
Yeah, fair enough. So I like to do research in kind of the Arctic setting. So I'm [...] tools at our disposal. And some of the best tools
that we have are numerical models for glacier and ice sheet flow. So these are computer models
on a quite big scale. And different things go into the models, for instance, the temperature
budgets, precipitation patterns, and ocean change. And then you get the behavior of the glacier or
ice sheet out as an end result. And that can inform you a little bit about how different
environmental factors and climatic change manifests itself in glacier and ice sheet sea level rise.
What's interesting about this field, which again, I'm nowhere near you, and I'm just a novice,
somebody who is curious, I would just say,
is that you can sort of go back in time, right?
Like ice is essentially some level of a time machine when it comes to research.
Is that what got you into it?
What sort of piqued your interest to get into this field?
Yeah, it's definitely, it's actually a climate record also
because you have snowfall, of course, being precipitated on the surface of the ice and
then you have layers upon layers being compacted so if you try to drill through these ice sheets
it's actually going back in time and just like tree rings you can count out as individual years
of precipitation so you can go back thousands of years and understand what happens in terms of
precipitation change and climate change also
going back in time. So that's the direct measure on the ice sheets themselves within the ice.
But you can also look at previous extents of ice sheets and see how they grew from the northern
latitudes and spread on northern America and the European lowland. And you can see how
they just drastically reshaped the Earth's surface
during previous glaciations. So we know from the past that these ice sheets can really cause
massive change on a global scale, and we're trying to wrap our heads around exactly how that happens,
physically speaking.
What does the actual data collection look like? You go out there with a measuring stick, or what kind of tools are used?
Some people do that. Some people simply go out into the field
and hike with a good backpack and look at the different deposits left behind by the glaciers.
But other people also look at remote sensing data sets. So you can discern some of these features from satellite
data, for instance, visible imagery. And other people get to look at geophysical measurements.
So that would, for instance, be radar measurements taken by airplanes or satellites that fly over the
ice sheets. And you can learn a lot about the exterior and the interior of the ice sheets in
that way.
It's probably good to have a lot of data points to pick from
because not one piece of data will give you enough information.
You can begin to assume and obviously track, you know,
if you got an aerial view that's giving you, say, coloration or radar,
I should mention, that gives you one data point.
But having, you know, physical specimens or whether it's, you know,
data gathered through metrics or sensors, etc.
It's going to give you a full spectrum view of what's really happening there.
Let me ask you guys both a funny slash somewhat off-topic question.
You might laugh, so don't laugh too hard. What would you do
or how would you feel if a library burned down? Or a very important library.
Like the Alexandrian Library, for example?
Pick your library, whichever one's your favorite.
Would you be upset?
Yeah, of course.
So it occurred to me just in this last moment that, you know, that if these glaciers have
this kind of data in them and they are melting, basically not being there anymore, unmeasurable,
unresearchable, right?
That should be alarming,
right? Because these are basically records of our Earth's history, to some degree,
showing us our past and potentially our future based upon data. And so that should be alarming.
It definitely is. Specifically in that regard, mountain glaciers in the Alps or other places,
they are the ones that would be most affected in that regard,
because they are isolated records in remote locations. It takes a lot to melt away an
entire ice sheet, but the smaller glaciers themselves are definitely susceptible to that
kind of risk. I'm curious what your specific work is with regard to this topic.
No doubt, as Adam said, different data points, different people doing different kinds of research.
I'm not sure if you're going out into the glaciers yourself, or you like to sit back in your computer like I would and let the data come to you.
But what specifically are you personally researching, and what are the results of that kind of research?
So I work on improving the physical mechanisms
and physical parameterizations in these ice sheet models.
So I'm sitting in front of the computer
trying to understand and model the system
to a better degree than we can do today.
So specifically what I'm working on at the moment
is the fact that glaciers
are actually not really governed by how the ice itself flows. So you probably both know that
glaciers and ice sheets move from high altitudes to lower altitudes because ice is a viscous fluid.
But it's actually not, to a primary extent,
the ice flow itself that really controls how the ice sheet moves
and why it moves in the patterns that we see.
But instead, it's actually sediments like sand and clays and gravel
and stones at the base that are very weak
because we have a lot of melt water down there.
So these sediments are actually breaking.
And what we see on the surface is actually an expression at the base of the ice sheet
of where these sediments are failing and where they are weak and where they are actually
contributing to lubricating the glacier flow. So these physics are actually not very well understood and more or less not included in
the models that we today use to quantify the sea level rise in future climate scenarios.
So the estimates that in the year 2100 say that we will have maybe half a meter of sea
level rise, maybe one meter of sea level rise.
They are based on more primitive models. And it's really hard to exactly quantify these
maybe lesser, maybe more important mechanisms such as the one that I'm working on. So we really have
to be smart about our computing. And we have to be very efficient also
because the computational costs are just very, very large.
So I've worked a lot with GPU computing
and it's really painful, I have to admit.
But there is also a lot to be gained
when you have something that's working and working efficiently.
Well, the good news is that it's 2020 and not 2012.
I don't know if you saw that movie, 2012.
I know I laugh when I say that.
It's not very funny if it actually happened.
But that movie really was kind of interesting because you had this scientist and this data and this prediction.
And they thought it was crazy.
I'm not going to spoil the plot of the movie, but bad stuff happens basically.
We don't want that to happen.
So we want someone like you in the cockpit of the software and the metrics and improving these, you said they're physical mechanisms,
is that what you said? The physical hardware?
Yeah, actually the physical processes. So
one other physical process that we need to understand better is for instance how the ocean
melts the glacier. So it's not a simple matter of the physics that
happened when you put an ice cube into a warm cup. It's not really the same because you also have
turbulent mixing of the warm and cold water masses in the ocean, and you have a lot of
weird dynamics going on. And you can't obviously model every water molecule. You need to make generalizations, and that's where a good
understanding comes in, so we can make the right simplifications but still grasp the important
mechanisms.
It kind of reminds me a little bit of quantum physics too, where you, again, I'm out of
my pay grade here when I talk about this, so this is just from a curious standpoint, but basically
if a car is moving i can predict that the car will move from this point to that point,
if I can kind of tell that it's moving. But at a quantum physics level, like you're predicting
a massive amount of possibility, so to speak, because of the way particles move at the
very, very small level. This seems very similar, where your research is sort of keying into the
particular particles to make large scale assumptions.
And in some degrees, you're saying that we're making these assumptions on sort of generalized data rather than the specific particles and how they react to, you know, small level physics.
Yeah, that's exactly right.
You have to make the right simplification so you really can do something with your insights and actually apply it to bigger purposes.
Do you also have to focus in on specific glaciers in order to do real world results?
Because it seems like, in my layman's understanding,
I would think that different glaciers and different geographies under different circumstances are going to also react differently.
And so that's probably a challenge as well.
Absolutely. And we already see that happening today.
So, for instance, in the Antarctic continent
there is one area which is immensely affected by current climate change while
other glaciers actually are growing. So there is a lot of regional
variability. And just because the setting is different, that actually means a lot to the
outcome.
Does that mean you need to produce different models for different geographies, or how do you attack that?
Yeah, you can do test
cases for smaller subsets of an ice sheet, or you can try to model the entire thing,
maybe at a reduced resolution or something like that. But there are usually trade-offs with either
approach. It's kind of the same as if you're trying to model the ocean, you can try to
model everything with really big cell sizes and coarse resolution, or you can try to model
a smaller regional ocean basin with much higher resolution.
But with a regional model, you have problems at the boundaries because the ocean also needs
to flow across a smaller surface.
So you need to be smart about what you do.
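As a rough illustration of the trade-off described here, the sketch below counts grid cells for a square two-dimensional domain. The domain sizes and cell sizes are invented for illustration and do not come from any real ice sheet or ocean model; the point is only that a small region at fine resolution can cost about as much as a whole domain at coarse resolution, while the whole domain at fine resolution costs far more.

```python
import math


def cell_count(domain_km: float, cell_km: float) -> int:
    """Number of cells in a square 2-D grid covering domain_km x domain_km."""
    cells_per_side = math.ceil(domain_km / cell_km)
    return cells_per_side ** 2


# Hypothetical numbers, chosen only to make the comparison visible.
whole_domain_coarse = cell_count(domain_km=4000, cell_km=20)  # entire region, coarse cells
regional_fine = cell_count(domain_km=400, cell_km=2)          # small basin, fine cells
whole_domain_fine = cell_count(domain_km=4000, cell_km=2)     # entire region, fine cells

print(whole_domain_coarse, regional_fine, whole_domain_fine)
```

Under these assumed numbers the first two setups have the same cell count, and the third has one hundred times more. The sketch ignores the boundary-condition problem mentioned for regional models, which is a separate cost.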
What I find maybe, I don't know if ironic is the right word, maybe unfortunate, is
here we have this need for advanced technical research in order to improve
our ability to make these models to get accurate predictions and results. And we have a field of academics who are bumping up, in your case, against these technical hurdles of what is really kind of an old school, non-technical field of sharing and publishing.
And all of the things that you think these people should be bleeding edge
because they're on the bleeding edge of their research.
Well, it turns out the scientists aren't software people
or they don't have, I mean, not in every case.
And there's just like tons of academia,
I may say bureaucracy in certain cases.
Like there's lots of reasons,
same thing with government publishing of open data
that belongs to the citizens
and it's in formats that are like inscrutable.
That seems like a terribly unfortunate circumstance where it's like you're just wanting to do your research, right?
But now you're writing your own tooling
in order to collect the information you need in certain cases.
Is that just like the state of the world?
What's your thoughts on that?
Well, yeah.
So I specifically had the problem that it was hard to,
it took a lot of time to get all the publications gathered
that I would need to have access to during my research.
So often when you are stepping into new research areas,
you need to spend a lot of time to sit down
and really check the literature.
Yeah.
So anybody who, I would say pretty much anybody who
has gone online and tried to look for papers and so on, they've probably felt frustrated at some
point because the webpages of many of these publishers are just horrible. And so they are
extremely bloated and slow and you have to often struggle to find PDF download links because PDFs are still
a good way of just having precise copies of papers on your own machine.
And every publisher is different in exactly how they set up their web page and so on.
So I found that to be quite a struggle.
So I chose to spend a little bit of
time to kind of sharpen my tool set and build something that could help me get around all of
these issues and streamline the workflow. And I guess many of your listeners are probably the
same way. We probably all have a set of dot files, which we just very much value and continuously
tweak in order to just get rid of little hurdles
that might occur every day in our workflow.
So I'm curious if you are unique amongst your peers
in your research, or if you are commonplace
in terms of somebody who's doing climate science,
glaciology, and yet spends time writing scripts, software, programming, basically, in order
to help them along in their way.
Do a lot of climate scientists have your skill set, or are you unique?
How did you get these programming skills?
I'd say a lot of climate scientists are actually pretty good programmers.
So when I was at NOAA in the US, I was at this climate modeling facility,
and people have been writing Fortran programming there since the 60s. And they know everything
about making really efficient code. So a lot of people in that sub-niche are actually pretty good programmers and do everything they can
on the command line.
But I did my undergrad in a geoscience department, and there it's the stereotypical image you
have of bearded professors in old-fashioned shirts, and they have no idea about making,
you know, really efficient use of a computer.
So there are a lot of different kind of groups out there that just have a very different workflow
and a very different skill set.
Do you know if there's any initiatives out there
to pair these people with software developers?
I know we see huge strides made
if you pair a developer skill set
with a designer skill set
or a product skill set with a developer skill set.
It seems like a scientist who doesn't have
the software chops that some of the climate scientists
or more technical modeling scientists have,
pairing those people with a skill set
like a software developer
could see huge gains for both participants. Is that a thing?
Yeah, it actually is in bigger projects. And I completely agree. It's very good to combine
different skill sets. So for some of the bigger modeling projects out there,
they actually hire dedicated scientific programmers to do things like automated testing
and proper documentation in the code and, yeah, just
things like that which make life easier and which are commonplace in modern software development. So
actually something like version control is pretty recent in the modeling community, I'd say. So of
course there are so many benefits to having proper development pipelines and so on. That's just a
complete necessity in modern development. So a lot of good things are coming out of combining people
with different skills.
How often do you think about internal tooling?
I'm talking about the back office apps,
the tool the customer service team uses to access your databases,
the S3 uploader you built last year for the marketing team,
that quick Firebase admin panel that lets you monitor key KPIs,
and maybe even the tool that your data science team hacked together
so they could provide
custom ad spend insights. Literally every line of business relies upon internal tooling but
if I'm being honest I don't know many engineers out there who enjoy building internal tools
let alone getting them excited about maintaining or even supporting them. And this is where Retool
comes in. Companies like DoorDash, Brex, Plaid, and even Amazon,
they use Retool to build internal tooling super fast.
The idea is that almost all internal tools look the same.
They're made of tables, dropdowns, buttons, text inputs.
And Retool gives you a point, click, drag and drop interface
that makes it super simple to build these types of interfaces in hours, not days.
Retool connects to any database or API, for example, to pull data from Postgres.
Just write a SQL query and drag and drop a table onto the canvas.
And if you want to search across those fields, add a search input bar
and update your query, save it, share it.
It's too easy.
Retool is built by engineers explicitly for engineers.
And for those concerned about data security,
Retool can even be set up on-premise
in about 15 minutes using Docker,
Kubernetes, or Heroku.
Learn more and try it free at retool.com slash changelog.
Again, retool.com slash changelog.
So you were bumping into this problem with your research
of going out and getting the publications,
PDFs in most cases, probably almost all cases,
and you wrote in your post about the announcement
for the Scholarref tools that the common tasks you're doing include downloading PDFs
and publications, getting references into your bibliography.
And you said, however, I'm not a fan of navigating the slow, bloated,
tracker-filled, and distracting webpages of academic journals
and publication aggregators.
And so you came up with this solution.
Why don't you tell us about it?
Yeah, so I've been working on something similar for a little while and
I just decided to properly wrap this up so it could be more useful for me and maybe others as
well. And that's what kind of gave birth to the Scholarref tool set. So it consists of a few tools
which can be chained together. So they're written in an old school way.
So they're based on the Unix philosophy.
So everything should be text-based
and each tool should do one thing and do it well.
So the idea is that there are different tasks
that are common in this kind of work.
So one task is, as you mentioned, Jared,
you need to get a reference
for a publication. So say you read a nice paper and you want to cite it in a paper that you're
writing. So most journals use something called LaTeX and you would need to get something called
a BibTeX reference. So the traditional way is to go to the journal webpage, download a text reference in this BibTeX format,
which is kind of like JSON, but not exactly.
And you would then put it into your own bibliography,
which is just a massive text file,
paste it in there,
and then cite it from your LaTeX document.
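For readers who have not seen the format, the sketch below builds a BibTeX entry from a plain dictionary, which shows why it gets compared to JSON: a record type, a citation key, and a set of named fields. The entry itself is a made-up placeholder, not a real publication, and the helper name `format_bibtex` is an assumption for this illustration rather than anything from the tools discussed.

```python
def format_bibtex(key: str, entry_type: str, fields: dict) -> str:
    """Render one BibTeX entry: @type{key, name = {value}, ...}."""
    lines = [f"@{entry_type}{{{key},"]
    for name, value in fields.items():
        # Braces around the value keep capitalization and commas intact.
        lines.append(f"  {name} = {{{value}}},")
    lines.append("}")
    return "\n".join(lines)


# Placeholder entry for illustration only; not a real reference.
example = format_bibtex(
    "Doe2020",
    "article",
    {
        "author": "Doe, Jane",
        "title": "A placeholder title",
        "journal": "Journal of Examples",
        "year": "2020",
    },
)
print(example)
```

In a LaTeX document, an entry like this would then be cited by its key, for example with `\cite{Doe2020}`.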
And the problems with that kind of workflow are actually many.
So as you mentioned, I'm not a fan of journal web pages, as I wrote in the blog post. But also,
there are more practical issues in terms of the formatting of these references. So even though
they share this common BibTeX format, they're actually very different in content. So for instance, the author first
names might be written out or they might be abbreviated. And often journals that you want
to submit your own paper to will only accept one type of author styling. And the same goes for the
journal name. That also needs to have a consistent style.
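To make the author-styling problem concrete, here is a minimal sketch that reduces given names to initials so that every entry uses the same style. It assumes names are already in the "Family, Given" form and that multiple authors are joined with " and ", which is the BibTeX convention. The function names are hypothetical, and real-world names (hyphenated given names, particles, suffixes) would need more care than this.

```python
def abbreviate_author(name: str) -> str:
    """Turn 'Family, Given Middle' into 'Family, G. M.'."""
    family, _, given = name.partition(",")
    initials = " ".join(part[0].upper() + "." for part in given.split())
    return f"{family.strip()}, {initials}" if initials else family.strip()


def abbreviate_author_field(field: str) -> str:
    """Apply the same styling to every author in a BibTeX author field."""
    return " and ".join(abbreviate_author(author) for author in field.split(" and "))


print(abbreviate_author_field("Doe, Jane Mary and Roe, Richard"))
```

Names that are already abbreviated pass through unchanged, so the function can be run over a whole bibliography to get one consistent style.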
So there are a lot of things that you need to go through once you get a hold of this reference from the journal webpage.
So I found out that there's this API publicly available, which I can simply query from a search query.
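The API is not named in the conversation. As one example of a publicly available service of this kind, the sketch below builds a search URL for the Crossref REST API, which accepts a free-text bibliographic query. Treat the choice of Crossref as an assumption for illustration; the sketch only constructs the URL and does not perform the request.

```python
from urllib.parse import urlencode

# Assumed endpoint: the public Crossref REST API, used here as an example.
CROSSREF_WORKS = "https://api.crossref.org/works"


def build_search_url(query: str, rows: int = 1) -> str:
    """Build a URL that asks for the best `rows` matches for a free-text query."""
    params = {"query.bibliographic": query, "rows": rows}
    return f"{CROSSREF_WORKS}?{urlencode(params)}"


print(build_search_url("glacier sliding sediment"))
```

Fetching that URL returns JSON describing the best-matching works, including their DOIs, which is what makes a plain search string enough to look up a reference.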
So that actually works really well. And alternatively, I also make it possible to just feed in a PDF document.
So one of these tools in the Scholarref tool set
will try to extract the DOI,
which is the unique identifier for that publication.
So you're not getting any kind of wrong results.
So the last thing you want is to get the reference
for the wrong publication.
And even worse is if you don't manage
to actually correct that mistake before submitting.
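Pulling a DOI out of a document can be sketched as a pattern match over text that has already been extracted from the PDF. The regular expression below is a commonly used approximation of the DOI shape (a "10." prefix, a registrant code, a slash, and a suffix), not a full validator, and the sample strings use a made-up DOI. How the actual tool does this is not described in the conversation, so this is only one plausible approach.

```python
import re
from typing import Optional

# Approximate DOI shape: "10.", 4-9 digits, "/", then a suffix of allowed characters.
DOI_PATTERN = re.compile(r"\b10\.\d{4,9}/[-._;()/:A-Za-z0-9]+")


def find_doi(text: str) -> Optional[str]:
    """Return the first DOI-like string in `text`, or None if there is none."""
    match = DOI_PATTERN.search(text)
    if match is None:
        return None
    # Drop punctuation that belongs to the surrounding sentence, not the DOI.
    return match.group(0).rstrip(".,;")


# Hypothetical DOI used purely as sample input.
print(find_doi("Available at https://doi.org/10.1234/example.5678 since 2020."))
```

Taking only the first match is a simplification: a paper's reference list is full of other DOIs, so a real tool would want to look near the top of the first page.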
So I've tried to make these tools kind of modular.
So in some instances,
you might only need the DOI of a publication.
In other instances,
you might need the full reference
in a consistent format.
And finally, in some cases,
you might actually need the PDF itself.
Another benefit from keeping these tools
really minimal and simple
is that they are quite portable.
And so they have very minimal dependencies
on the host system, which makes them
easier to distribute. And secondly, it also makes it possible to work with them from your favorite
editing environment. So for instance, I'm a guy that likes to stick to old school terminal text
editors, vi specifically, and there I can simply bind a set of keys and get the reference in a very
convenient manner. And you can do that in pretty much any editor that you can think of, because
these are just shell scripts. So that just makes them much more portable than many other solutions.
You can tell that you're a VI person because in your editor integration section under Emacs,
it says, don't know, figure it out yourself.
That sums it up.
I like that, it's a good response.
Yeah, but most people know how to do that kind of stuff anyway,
so it's kind of just teasing a little bit.
So there's these three main tools,
and they're all under the Unix philosophy.
You wrote, they're just shell scripts, and you talk about the importance of POSIX. I thought maybe it'd be a good opportunity, for
those who aren't aware of that, if maybe you explain why that matters, what POSIX means,
and then, of course, why shell scripts when you could reach for more powerful tools nowadays. Just
curious your thoughts on your own tooling.
Sure. So POSIX is a standard which really exists to create a
common platform for these computing interfaces. So back in the day, you had quite a few different
Unix variants, and many of these would have their own kind of implementations of basic tools,
and they started to differ in options. And for instance, the GNU tool set, which is
common now in Linux distributions, is really expanded with a lot of command options and so
on that are not necessarily the norm on other systems. So if you want to have something which
is really portable beyond just a specific environment, you need to adhere to the broadest common standards.
And those would be the POSIX standards for scripting
and for the different tools in the Unix environment.
So that's why I chose to go with those tools.
And also something like just POSIX shell
is much more rapid than a more complex shell interpreter
like Bash or Zsh or something like that. So
by keeping it minimal and adhering to these strict standards, you should also get very good performance
besides portability. So that's a definite advantage.
So I didn't get a chance to look at
the source code, and I'm curious, you know, in lines of code, how big are these individual tools?
At a certain point, POSIX shell can become unwieldy.
It's a sharp tool, you can cut yourself with it.
I know myself, having cut my teeth in Perl and Ruby,
I'll start with just a shell script
and then anything beyond 10 lines
and I'll go reach for a scripting language.
Sure.
And so I'm curious how complex these got.
Oh, these are quite simple.
Most of the source code is actually helper things like the help text itself
and version info and so on.
So the scripts are really minimal.
But you're absolutely right.
If you want to do something complex, you should go for Python.
But then again, something like the lag just from
starting a Python program is quite significant.
You mean startup lag by the time you type
the command in?
Startup lag specifically. And of course, if you want to do anything
which has a lot of iterations, which these tools don't, but if you wanted to do that,
then Python is not ideal, of course. But yeah, you really need to pick the language
appropriate for the job.
And shell scripting was the appropriate thing for this,
I think.
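The startup lag mentioned here is easy to measure for yourself. The sketch below times how long it takes to launch a fresh Python interpreter that does nothing and exits, taking the best of a few runs. The numbers depend entirely on the machine, so none are quoted; the function name is an assumption for this illustration.

```python
import subprocess
import sys
import time


def interpreter_startup_seconds(runs: int = 5) -> float:
    """Best-of-N wall-clock time to start the interpreter and exit immediately."""
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        # Launch a new interpreter that runs an empty program.
        subprocess.run([sys.executable, "-c", "pass"], check=True)
        timings.append(time.perf_counter() - start)
    return min(timings)


print(f"{interpreter_startup_seconds() * 1000:.1f} ms")
```

For a tool bound to an editor key and invoked many times a day, this fixed cost is paid on every call, which is the case being made for a small shell script.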
The other decision that you have to make,
which you have made,
is when to formalize a project
and make it a public thing,
make it an open source available thing.
Many of us have random scripts
laying around our machines.
I know I've written plenty of things
that'll never see the light of day.
Sometimes you write them just enough to take the pain away,
but not enough to take other people's pain away.
Then you never get the limelight.
You never get to come on the changelog, but you have your own little scripts.
I'm curious, what was the process of writing these?
Did you have it formalized for a long time?
You just didn't have the help text?
What made you push over there and say,
okay, this is very useful for me.
This can help out hundreds, thousands,
who knows how many other researchers out there feeling these pains.
I'm going to go ahead and put the extra effort in
because even the thing that's almost ready for the public
is not ready for the public.
A lot of the window dressing, so to speak,
is the effort, as you probably realized in this project
if you haven't had other open source projects.
So when did you decide to formalize it and take it public?
Well, to be completely honest,
I had just put up a new web page
and I needed something to put on my blog.
But you're absolutely right.
I love that honesty.
It takes a lot of time
to kind of wrap things up
and present things
in a coherent manner
and to make just things
nice and presentable
and adding niceties
such as help text and so on.
That takes a lot of time.
And you're absolutely right.
I really want that.
It would be great to see
that people openly share
kind of little niceties that they've carved out for themselves
because there are so many clever things people invent out there
and it's just a good idea to share that.
The one interesting thing I think here with this conversation
and specifically what you've written here is that it's not on GitHub.
Yeah.
So I almost thought you'd release this so that you can declare and advocate for self-hosting
your own Git, which I'm really curious why you're doing that.
Your next blog post.
And I would say even a follow-up to that would be, you know, for those listening to this
thinking, I want to get into this or I want to check it out or reference it to my research
friends or whatever it might be, you know, would they be disappointed to see that it's self-hosted Git and there's no collaboration
or seemingly no collaboration because it's not on GitHub. And GitHub is generally
social and you're against social networks to some degree.
Before you answer that, Anders, I'll just say as a casual observer who's interested,
probably not going to have to use your software, but was like, oh, this is cool.
I admitted freely a few minutes ago, I haven't read the source code, and
if it was on GitHub, I would have by now, because it's a
click away, whereas this is a git clone away.
Just another step. I would have been able to
click onto the files and see in your scripts and
maybe learn a thing.
That adds some salt
to the conversation here.
This is self-hosted git. What's the
decision there?
Sure. Well, to address your comment about clickability,
I actually have a web frontend for the Git host that I have.
So it's a little C program that writes the repositories out as HTML,
and you can also actually go and click and look at the source code and so on.
Touche.
It's actually pretty cool, too. It is pretty cool.
I didn't see that.
I'm seeing it now.
Nice styling on this as well.
I can almost see this as being a kind of a,
like if somebody took this and said,
let me do a CSS restyle of GitHub,
where you can sort of, what is that called?
Like CSS styles or what is that?
What's that style sheet replacement?
The old Zen Garden, CSS Zen Garden.
Well, not like that, but where you can actually put it into Chrome or Brave or whatever browser
you use where you can actually restyle a website.
So you can make GitHub look like this.
Oh, yeah, yeah, yeah.
Yeah.
This is very hacker.
GitHub is not very hacker these days.
Firefox had a whole scripts thing.
What was that called?
Probably still out there.
Our listeners are hating us right now.
Sure. Anyways, user styles? I don't know. User styles, there you go. Good job.
Okay, so, touche, Anders, you
got me on the clickability. I'm just now seeing that. In terms of collaboration, there is
not a lot of the GitHub niceties, such as, yeah, an issue tracker. There is not, you know, open pull
requests and things like that.
None of that is there.
And in the readme for the Scholarref tools,
I invite people to contribute changes by sending patches over email.
So that's the old school way of doing it.
Wait, hang on.
You and Linus.
Yeah.
So you want to modernize the way that research is done,
yet you want to rewind time when it comes to source code control and just the way that software has kind of evolved for collaboration.
Well, modernizing does not necessarily have to mean
that you put a fancy GUI on top of everything
and just put lots of JavaScript and CSS
just to completely drench whatever information
you're trying to convey.
Modernize it.
Is that how you describe GitHub?
No.
I'm just pulling your chain.
I guess what I mean by that is that it's sort of socially acceptable
that GitHub is not so much the way,
but it is a way to collaborate, and it's a collaboration tool.
And so you're sort of going against the social norm here.
Sure.
Well, I definitely am open for collaboration on this project
and any of my other software projects, which used to be on GitHub,
but I pulled it down because I was not really happy
with where GitHub was going as a company.
And thinking about the web as a whole
and where the web is going,
it doesn't really make sense
when you think about the architecture
that everything should become more and more centralized.
I think the web should really be a distributed thing
and there is nothing with my source code and so on
that stops you from just copying everything
and starting on your own on GitHub or wherever.
Because everything is very liberally licensed,
more liberally than the GPL for that matter.
So yeah, I think there are lots of reasons
to look beyond these mega source code databases
because it doesn't have to be that way.
And I don't personally see the massive benefits of that.
Has there been any thought, this is totally a tangent, so forgive me, but has anybody
or have you put any thought into this idea of, you know, the traits and attributes that
feature set, I suppose, that GitHub offers in a decentralized way where you don't have
to have a, you know, this kind of repository where it's, you know, controlled by a large corporation. In this case, you know, previously ran by, you know,
tried and true hackers that sold their company to Microsoft, which isn't a bad thing. You know,
it's not, there's nothing against that. It's a choice everybody makes. So there's trade-offs,
but the point is, is that I can recall a day when GitHub was ran by, I'm sure it is. Jeez,
I'm saying negative things here. GitHub's great.
People behind it are great.
It's hard work.
I'm not trying to shame anybody here,
but it is now owned by a corporation
and not by three hackers
that got together for beers anymore.
It's grown, it's changed.
And so that can't be glossed over.
It's not the way I suppose it is these days.
How do you get to a point
where you can kind of have your cake
and eat it too, so to speak,
when you want to have your own self-hosted Git
but provide collaboration opportunities
the way that GitHub has socially normed collaboration
when it comes to source code and open source?
Yeah, well, of course, it'll always be a trade-off.
So because you kind of leave many of these GitHub niceties behind,
you might also put yourself at risk of damaging any collaboration that might otherwise present
itself. But then again, if people are interested enough, you know, that you get pull requests
which just go beyond basic typos, correcting basic typos and things like that. If people want to contribute
and really make something
significant and provide
some significant changes
to a software project, I think they'll
get in touch no matter what the
communication platform is, to be honest.
I have a couple of thoughts on this. First of all, we should
note that GitLab began because
of this. It was self-hosted. It was supposed
to be federated with regards to collaboration.
There's been Gitorious.
There's been other software projects that are basically like,
what if we had GitHub's niceties without the GitHub?
Of course, GitLab has turned into another large corporation
that services the enterprise.
And I think so far their open, social, public side
has not taken off like their enterprise side has,
or like I should say GitHub's has.
Many hackers out there are running their own instances of Git,
for sure, and they just make that trade-off.
I do agree to a certain degree, Anders,
where what GitHub and really the bringing together
of all the developers in one place
has provided for projects is visibility
and casual contribution niceties.
But software collaboration was happening before GitHub.
You just didn't see it because it was emailing patches around.
It was behind the scenes and it wasn't quite as public.
I think to a certain degree,
people who are serious and benefiting from the software
and already understand Git tooling,
they can get around the hurdles
of a non-GitHub setup.
That being said, you're probably missing out on some people
who would contribute but aren't and may start off
as a casual contributor and turn into a more serious one.
I was curious, Anders, if your move off of GitHub
was around the Microsoft purchase
or if it was before or after that,
if that was like a major contributing factor
in your move away.
Yeah, after the Microsoft purchase,
I started looking around and actually was on GitLab
for a little while.
But, you know, just realizing that these are corporations,
they are acting in the interest of their shareholders,
which corporations should totally do.
That's kind of the thing.
But you don't have to constrict yourself to that framework.
So just thinking about the alternatives out there
with self-hosting and so on, why not?
That was just kind of my reaction to that.
And I looked into what it would actually take
to set up something like that.
And it was extremely easy to get up and running.
So I just went with it.
And so far so good.
So is the C program front end that you're using,
is that open source and available?
Did you build that or did you find it?
I didn't.
Yeah, so that's open source.
It's called stagit, S-T-A-G-I-T.
And it's a very minimal program
very nicely written also.
So one thing that's fascinating: of course,
we've tracked the acquisition of GitHub by Microsoft
since the day it was announced,
and we've had many people on the show since then
with different reactions.
My personal reaction thus far
was kind of a wait and see.
Now we're a couple of years into it,
and I think the product that is GitHub.com has improved dramatically since then.
I feel like the relationship between the corporation and the community has improved
in many ways since then. So I've seen mostly positives from that acquisition. That being said,
there are casualties along the way. Adam, I interviewed Ned Batchelder for our Maintainer Spotlight series.
Ned told me a story of a guy who was like,
I told you about him, he was like this traveling contributor.
Remember that? He would pick a project for three months or so
and he would contribute to that project heavily, not casually.
He'd get all into it, he'd make major contributions,
and then he'd move on to the next project.
It's just the way he did open source, which was very
unique. And his name
is Loic Dachary, I think
is how you say his last name, French fella.
And Ned had
benefited from his contributions and
was just kind of singing his praises, and I said, we should get
Loic on the show. And the other day,
I went looking for Loic, and he's not
on GitHub. And I couldn't find him.
And I found his website and he left GitHub.
Similar to Anders, he closed his account and his was specifically the day Microsoft acquired GitHub.
He was gone.
And so I'm curious if he's still doing open source.
He's not doing it on GitHub like he was.
And so there's definitely been downsides along the way
I think it's similar in nature
to the way you might do what Anders does here,
which is research glaciers:
for this deep data,
you can sort of hypothesize where things
are going based upon
past, present, and potentially the future.
People are doing similar
aspects of that towards
open source and then also
GitHub.
Because, you know, what suffers from this is, I would say, the improvement to software
and then as an effect of that, the human race, because our lives change and, you know, get
better or worse because of new software in our world to do different things, or in this case, do research, you sort of, you sort of get to this point where, you know,
the loss really is at the open source level. You know, GitHub is there trying to do one thing,
and this is totally not even a GitHub show, gosh. But anyways, you know, we're sort of in this mix
here where you have this sort of love-hate for this corporation that owns it. And I'm kind of
with you, Jared. I didn't have the same opinion at first, but when you said, let's wait and see,
I said, you know, I agree.
Let's wait and see.
And I think most of the things that have come from it
have been fairly positive.
But what you see is,
and I'd love to talk deeply with Anders
and others like Loic to see like,
what specifically has their open source life been like
since leaving GitHub?
And is it worth the loss that the software
slash community slash open source would benefit from,
you know, to leave, to not participate?
Because everyone's there.
I mean, there's a lot of people in ways that everyone, it's only in masses, not everyone.
Like, yeah, the critical mass is there.
So, you know, you, Anders, and others have decided to not participate.
Sure.
Well, to a personal, on a personal level, it really is about control. So for instance, say that
you have an Android phone and you upload all of your photos to Google Photos. Once you're in that
kind of framework, it's really hard to migrate away from it once you've invested in it. And of
course, you can always clone or push your Git repositories elsewhere when you have a
local mirror, but modern software development on GitHub is not just the code.
It's everything around it that we discussed.
It's the issues, it's the pull requests, and the wikis, and all of that.
To my knowledge, it's a far bigger issue to move that around to a different platform in
the case that the GitHub corporation and Microsoft
decides to take the platform elsewhere.
So it's about keeping that control,
and it doesn't have to be the way where you just give it up
to this corporation,
but actually keeping it to yourself
is kind of an advantage in my opinion.
Which is the beauty of Git and distributed version control,
is that as long as you don't extend into the full feature set
or you're willing to give up certain aspects of the feature set
like GitHub issues, you are still in control.
They are hosting a version, a snapshot,
many snapshots of your code over time.
But as long as you don't couple yourself
to that corporation, you can always go get, what's
it called, stagit. You can always go set up your own deal, because you
have everything, and moving away is feasible. Maybe it's going to be more painful the more you
buy in, but it's still feasible at any moment. And so when something bad happens, you are free to leave.
Whereas with other things, they have everything, right?
They have your data.
You don't have your data.
They have it.
It's on their computers, not yours.
At least with Git, it's on both computers.
There's copies.
That's actually the exact beauty of Git is that there is multiple copies and someone can recover it should something happen to one of the versions of it elsewhere
or the nodes of it elsewhere or copies of it elsewhere.
I'm curious, Anders, are you,
this is maybe going one more layer deeper,
are you against having your code on GitHub?
So did you wipe all of your code away?
Did you just sort of like just vacate and stop being there?
Or did you delete?
I deleted for the purpose of not confusing people
that were interested in specific projects.
So I specifically shut down everything to the bare minimum pretty much.
So for me, it's not just about the code itself,
which of course you can have distributed
among multiple platforms at the same time,
but it's also about providing access
patterns to the platform.
So you know the saying that if you're not paying for it, you're not the customer, but
you're the product being sold.
And GitHub is not providing this platform to the users out of their heart's goodness.
There has to be some kind of money involved, of course, because it's a corporation.
And so the wheels have to turn
and they're making money somehow,
just like Facebook and the others.
Well, it's worth differentiating GitHub from Google
with regards to their business model.
So GitHub business model is more straightforward.
Like it's a freemium model versus an advertising model.
GitHub is making money off of their more power users
and organizations, paying them monthly
a certain amount of dollars in order to have more features
versus Google where everything is free
and they're making it via advertising.
So it is different in that regard
insofar as you are the customer if you are a customer.
It's a freemium model, so they give it away to people who they want to eventually
become the power users. So it's a little bit less behind the scenes
in that regard, because their business model is more money for features.
Sure, absolutely. I agree with that.
Have you heard of our newest show called Brain Science?
Yes, Brain Science.
It's a different kind of show, I know.
And it's probably one of the ones that reaches the furthest out from our typical listener audience.
But this podcast is what we call For the Curious.
And what's cool about this show is we're exploring the inner workings of the human brain to understand things like behavior change, habit formation, mental health, and pretty much what it means to be human.
If you've ever thought about why you do what you do or why others do what they do, then this show is for you.
Head to changelog.com slash brain science to listen, subscribe, and learn more about this awesome show. Here's a preview of a recent episode called One Small Act of Kindness,
talking about empathy and mirror neurons.
So it sounds like pliability and flexibility is a pretty crucial role too in relationships,
because if you're not flexible, bendable, pliable, whatever, however you want to phrase that,
if you're rigid, right, that's only going to be difficult for you to flex, to enable change or to what you've said before, recalculate.
Accept new data, analyze that data, make a new plan and iterate towards a new action.
Yeah.
And so one of the other things involved with this flexibility would be what
researchers have discovered as mirror neurons. And so mirror neurons are these neurons within
the brain that help us sort of get access to another person's emotional experience.
And so there's an action component in it that it was first discovered actually with
monkeys and the sort of mimicry that occurred by watching somebody else do an action. Well, in the
same way, I can sort of watch somebody else walk through something in terms of an emotional
experience, and if I'm holding space for them in my mind, like my body physiologically,
these mirror neurons come to play. All right. To keep listening, head to changelog.com
slash brain science slash nine. That will take you to the episode titled One Small Act of Kindness.
Mireille and I dig into this thing called empathy as a construct. We ask questions
like what key brain structures are involved? How can
we better understand empathy to be able to better navigate ourselves and our relationships with
others, both at home and in the workplace? It's a deep subject, a very fun subject. Again,
changelog.com slash brain science slash nine, or search for brain science on your favorite
podcast app and subscribe. We'd love to have you as a listener.
What are the ideal users for this tool?
I mean, if we're looking at who could become a drive-by user, not a contributor,
that's a more difficult path, as we mentioned.
But, you know, a user seems a little easier.
Git clone, that's pretty possible.
But, you know, who's using this tool and, you know, what's needed in this space?
So I'd say the typical user is probably not afraid of the command line,
specifically because these are shell scripts.
It kind of takes a little bit of fiddling to make it work
with whatever you're editing your manuscripts in.
So people would probably be familiar with a little bit of coding.
And many people are that today within academics and so on, especially in the technical sciences.
So I'd say if you're not afraid of the command line, give it a go and see what happens.
What do you say to the idea of, say, climate science dabblers, those who might be like, I'm a curious person.
We have a show called Brain Science and I'm brain science curious at least.
And I've actually listened to quantum physics books and I've listened to and I say listen to because I've listened to books a lot more than I read books.
But it's still reading in my opinion. I've listened to large-scale lectures about actually how time works,
how we travel through time and the actual physics of time.
So I would say I'm maybe in that wheelhouse,
although I'm not really digging into climate science.
But for those who might be similar to me or somebody who's curious like I am,
they might come across not so much this tooling but the space,
the need for more
brains in such an important space to say, you know, if the sea levels rise by what you said
earlier, which was a half a meter in the next 100 years, well, that's a problem.
Yeah, definitely.
And as long as people have an interest like yourself in kind of sciences and what's going
on there, there are definitely a lot of contributions that can be made by people like you. So for instance, a lot of the modeling tools
that are out there under open development, I'd say the far majority of them are developed openly.
And a simple contribution to some of these models could be to take a look at the source code and just check it out and get a feeling for
what's going on. And they often have quite good documentation also that helps developers or
people that look into the source code to try to understand how it all works. And maybe you find
something that's missing. Maybe you find something that you as a person with your background could see could contribute to maybe the development or maybe the code itself.
Maybe a way of optimizing some kind of algorithm.
Maybe you know a lot about a certain set of test tools.
For instance, it's pretty easy to get things up and running on Travis and similar CI platforms.
So it doesn't necessarily take a lot of effort to get different models up and running on these
testing frameworks. And that allows the developers behind these models to really make much more
clever developments as they go because they can see if the intended changes
do the correct thing that they wanted to do.
And so there are a lot of things
that maybe people with purely developer backgrounds
can contribute just from their skill sets
to these kinds of models and communities.
So we're all open arms in that regard.
And I think people would be very well welcomed to that community.
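As a rough illustration of the kind of automated check Anders describes, here is a minimal sketch in Python. It is not taken from any real climate model: the function `linear_sea_level_rise` and its constant-rate assumption are hypothetical, chosen only so that the half-meter-per-century figure mentioned earlier can serve as a known answer. A CI service would simply run checks like these on every proposed change.

```python
def linear_sea_level_rise(rate_mm_per_year: float, years: float) -> float:
    """Hypothetical toy model: total rise in meters at a constant rate.

    Real models are far more involved; this exists only so the tests
    below have something to check.
    """
    if years < 0:
        raise ValueError("years must be non-negative")
    return rate_mm_per_year * years / 1000.0  # convert mm to m


def test_half_meter_per_century() -> None:
    # 5 mm/yr sustained for 100 years is 500 mm, i.e. 0.5 m.
    assert abs(linear_sea_level_rise(5.0, 100.0) - 0.5) < 1e-12


def test_zero_years_gives_zero_rise() -> None:
    # No elapsed time means no rise, whatever the rate.
    assert linear_sea_level_rise(3.0, 0.0) == 0.0


if __name__ == "__main__":
    test_half_meter_per_century()
    test_zero_years_gives_zero_rise()
    print("all checks passed")
```

If a later change broke the unit conversion, the first check would fail immediately, which is the feedback loop being described: developers see whether an intended change still does the correct thing.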
That's good to hear because sometimes when you get into certain fields,
I've heard this at least from Mireille, co-host on Brain Science with me,
about the brain science field, the neuroscience field, that it can be kind of...
Cliquey?
I wouldn't say... Her words weren't catty, but just sort of like arrogant, I suppose.
Maybe I'm just putting words into her mouth. I don't think she said this verbatim, but
basically the effect was that, you know, I know a lot. Unwelcoming. Yeah, not very welcoming,
because there's a lot of specific research and specific opinion formed from research and a lot
of gatekeeping so to speak when it comes to entrance and participation
and, you know, even credential checking, like which letters do you have after your name?
Okay. You're not welcome kind of thing, you know, similar. And so that's not the case here
or not so much, or it is, but it's not so much. I'm only saying that based on his face. We got
video here. So I'm seeing Anders's face as I speak. He's like, well,
there kind of is just with his face. So use your words.
There can definitely be sharp elbows
between maybe different models,
you know, the competing models
and people want to do, you know,
make the most precise model out there and so on.
And there will never be a precise model
just from the nature of the problem.
But there is a lot of competition and so on.
But if your intent is to just provide a positive contribution to a project as an outsider, you would be very well welcomed.
I'd say you're not in there to break something.
And if you're trying to, you'll probably be told off in a nice way.
Right. Are you trying to be right or are you trying to solve the world's problems?
Exactly.
If you're a scientist trying to be right, well then, your right may not actually be the right.
Yeah, exactly.
There's another saying,
all models are wrong,
but some are useful.
Okay.
I like that one.
There's another saying,
all models are wrong
except for mine.
Mine's correct.
Go ahead, Jared.
So,
maybe as a closing topic,
which is,
there's no clean segue to,
but I think will be
enjoyable regardless,
is that you have published some really awesome photography on your website. You're not going to find it
on Flickr. You're not going to find it on Instagram. Well, maybe it's on Instagram, I don't know. You're
not on social networks. But you're going to find it on Anders's website, which we, of course, have
linked in our show notes. And before we started recording, we were talking about the need for an analog, or for
something completely different. And while you're not out in the field gathering this glacier data,
you're sitting at your computer gathering data, you still get out and you take photos. So tell us
about that hobby and how you got to... I mean, in my layman's opinion, you're really good at it. Like,
your photos are really high quality. Tell us about it.
Thanks.
I appreciate that.
Yeah, so I think it's very important to have something which is distinctly different from
your day job.
And most of my hobbies are also computer-centric.
So I decided that I needed something in my life which is analog and away from the entire
thing.
So I started getting into photography, buying the latest megapixel camera monster,
and then I ended up sitting in front of the computer doing post-processing.
What's the megapixel monster you're working with? What is that?
Oh, it was a Sony a7R. It was just the original one.
Gotcha. Well, that's an amazing camera.
It is, but I sold it because I ended up just sitting at the computer doing, you know, spot healing in Photoshop anyway.
So now I'm using film cameras, and I'm developing pictures in the darkroom, in a spare
bathroom that we have.
Oh, that's cool. Analog, in the red light, like a dad from the 70s or
something like that.
Yeah, yeah. It's a lot of...
Is it exciting? So
the thing about... I mean, I only know that world from the movies, and in the movies, you know, they
got the shot, but they're not sure if they got the shot, and it's going to reveal something
that's integral to the plot, and they have to wait. Like, the waiting to see if you really got what
you needed, is there joy in that or is it just annoying?
Oh, there definitely is a joy to that,
as long as you're not a press photographer or something like that, I suppose.
But it kind of forces you to be very deliberate
and also forces you to be very methodological
in your image capturing process
because you have to get it right
and you're spending money every time you click the shutter
because you're running through film.
But it's a slow kind of process which involves your physical presence. You have to
make sure that the chemicals are mixed right and stuff like that. You can experiment with a lot of
different things in the process. So it's a lot of fun, and it's nice to have a complete break from...
One of your series that I liked a lot was the one on patterns. And I
think that's what's interesting in this world. I noticed this when I got a drone and was sort
of doing a little bit of aerial photography: every day, you know, like right outside your
home, just go outside 100 feet away from your home, or 100 meters, whichever system you use, pick a
length, and, you know, if you can get 100 feet up or 300 feet up above the ground,
you'll see something very different than you will see on the ground.
And I love that about the world, how there's just like patterns and unique
things, or when you zoom in at a very micro level, how things look differently
than say, you know, just a few feet away.
That's so cool.
I love this gallery that you got going on here.
It's got the different landscapes and stuff like that.
It's pretty cool. Thank you very much. Yeah, it's definitely healthy to look under the
surface of just everyday things or just, you know, very far away from the surface to get a new
viewpoint. And it's healthy for the mind, I think. That's also why I chose to go with black and white
because it's just a different way of seeing the world that's not common to a normal vision.
Yeah, absolutely.
What advice would you give then to those listening who are like, I don't have an analog?
How did you find this analog for yourself and what has been the benefit to your career?
Or just, I guess, just less about career focus and more just your life.
How's it changed your life?
Well, I just wanted something which got me out of the house, basically, and it kind of allows me to take something home from when I'm away from my usual environment. So for people looking to do some kind
of new hobby for themselves, which could be some kind of analog thing, maybe just try to explore what makes you happy and what kind of clears your mind from the usual churn.
So just try out a lot of different stuff and see what sticks.
Jared, I want to point the question to you or I guess somewhat of a statement, I suppose, and you can respond.
But what I've appreciated is what I assume is one of your analogs, if not many of them, is your love for riding on a tractor
and planting trees and taking care of bees.
That's such a cool thing.
So what's your analog, Jared?
Well, you just said it.
I mean, I have a lot of things I do.
I mean, I do a lot of things in the real world.
I play basketball a couple times a week.
I coach sports, youth sports.
But as far as like, I like to get my hands
like in the actual
dirt. And I didn't know this about myself until I accidentally bought an acreage a few years back,
which is a longer story, not going to tell here today. But I love planting trees specifically
and nurturing them and watching them grow and thinking about, I probably planted on our land,
maybe 400, almost 500 trees over the last four years.
And just thinking about what they're going to be like 10 years from now, 20 years from now,
and then even after we're dead and gone, that heritage or that legacy is really cool.
And I never would have thought I'd be into that kind of a thing until I got out and started doing it.
And I'm like, wow, this is really, really enjoyable.
So that's me.
The dirt.
Well, something cool about the dirt is actually it's alive.
I don't know if anybody can speak to this, but there's a lot.
Our soil, so to speak, is alive.
There's a lot of living organisms in there, microorganisms that are very, very important.
And what we do in today's society is basically cover it up with cement.
It's like, you know, here in Houston,
we call it the cement village, or it was a cement...
Concrete jungle.
Yeah, concrete jungle.
Because it's just, we just are fascinated
with like just covering up our amazing soil with cement.
It's terrible.
It goes back to that Counting Crows song.
They paved paradise and put up a parking lot.
Oh man, there you go.
All right.
That's terrible.
So, Anders, one last question on your analog,
because I don't understand this world at all.
When you switch to the film then,
and you're doing them in the darkroom,
how do you then get them back into digital format
to put them on a website?
Do they scan, or how does that work?
Yeah, I scan them.
So that's kind of the easiest way to do it,
to deal with reflections and so on.
So scanning is the way to go.
Gotcha.
Just wanted to ask.
It's always, you know, they got us locked in, right?
These machines, these machines.
Yep.
You know, it's how we're communicating now.
You finally escape and then you go digitize it
and put it back onto the machine.
Well, at least you're minimizing that, right?
I mean, you're minimizing the amount of exposure
to the digital world,
which I think is pretty important.
It's an interesting thing to even think about,
minimizing your exposure to,
I guess, limited data, right?
Because even here, we're not,
we have a lot of data informing us
about this conversation
and these relationships in this conversation.
I can see all of you, we're on Zoom
and we can see each other's faces,
but it's still a limited data set.
We're not in the same environment.
You know, we don't hear the same cars
or things passing by.
We don't hear similar things.
There's a lot of things we're sharing,
but not a full spectrum of data.
And I think that's kind of bad.
You're misinformed about the life you're living
and who you're living it with.
There's more data to be had.
So what you really want is smell-o-vision.
You want to be able to smell what I'm smelling right now.
You had to go there.
We're too far in.
We're too far in.
Yeah.
What do you think about that, Anders?
The limiting of, I suppose, digital exposure.
Is that a thing you've thought about?
Yeah, definitely.
For instance, I'm trying to minimize my cell phone usage
just at the moment.
And I think it's just healthy to step away.
And we didn't evolve to be constantly stimulated
by electronic devices.
So our brains and our, yeah, just mental presence
is really suited to slower encounters
and to more in-depth conversation than you might
get from flicking through a social feed or something like that. So it's probably worth, you
know, just stepping away a little bit and feeling yourself and engaging other things as
well.
What have you read to inform that opinion, that we haven't evolved to be stimulated by digital
devices? Is that just an opinion, or is
there any book... What I'm really trying to get at is, is there any book you can reference
that you've read and you're like, man, this blew my mind, and, you know, this is not how we should evolve,
and you're on the street corner, you know, on a soapbox or something like that?
No, I'm too busy with research to really read much extra than that. But just thinking about the timescale, so we are several millions of years old as a species
and the digital revolution is not even a blink of an eye.
So there's no way that our physical form
really has adapted to any of that.
So definitely makes sense to just take it slow sometimes.
Well, clearly we can keep piquing our curiosity and go deeper
and deeper, but let's leave it there.
Anders, thank you so much for
doing what you can to limit
your exposure to this digital world, but still
sharing this
unique tool set and your love
for photography and
your care for
the future of our world, which could
result in more water or less.
It would be terrible if there was more, though.
We need to keep the ice where it's at.
Thank you very much for having me, Adam.
All right.
Thank you for tuning in to The Changelog.
If you aren't subscribed to our weekly newsletter,
you're missing out on what's moving and shaking in software and why it's important.
Hate email newsletters?
Fun fact.
Killthenewsletter.com was created by someone just like you
who wanted ChangeLog Weekly so bad,
they wrote a program to subscribe on their behalf.
And of course, it's 100% free.
Fight your FOMO at changelog.com slash weekly.
When we need music, we summon the Beat Freak, Breakmaster Cylinder.
Our sponsors are awesome.
Support them, they support us.
We've got Fastly on bandwidth, Linode on hosting, and Rollbar on bugs.
Thanks again for listening. We'll talk to you next time.
Thank you. It's my slogan for when I run for office.
We need to keep the ice where it's at, okay?
And everyone will adopt that.
Exactly.
What's funny is your seemingly antisocial move
of not being on GitHub,
which you'd think would reduce your visibility,
has actually backfired,
and now you're getting visibility because of it.
Yeah, it's terrible.
Because the tools are cool,
and when I first went to your website,
I thought, this guy's interesting.
I watched your video of your research.
I'm like, all right, interesting guy.
And then I went back to Brian's request
and he's like, and he's on self-hosted Git.
And I'm like, oh, so there's kind of like two facets.
And I was like, okay, that's a good guest right there.
And so that aspect of it, which was antisocial,
became social.
Cool.
Yeah.
So sorry, it backfired.
It totally did.
And now I'm on a podcast.
And now you're on a podcast.
Exactly.