Python Bytes - #211 Will a black hole devour this episode?
Episode Date: December 7, 2020
Topics covered in this episode: Introducing FARM Stack - FastAPI, React, and MongoDB; py-applescript; airspeed velocity; visidata; Extras; Joke
See the full show notes for this episode on the website at pythonbytes.fm/211
Transcript
Hello and welcome to Python Bytes, where we deliver Python news and headlines directly to your earbuds.
This is episode 211, recorded December 2nd, 2020. I'm Michael Kennedy.
And I'm Brian Okken.
And we have a special guest. Matthew Feickert, welcome.
Yeah, thanks so much for having me on.
Yeah, it's great to have you here. You've been over on TalkPython before, right?
Yeah.
Talking about some cool high-energy physics and all that kind of stuff.
Yeah, I looked that up last night just to try and remind myself.
That was episode 144.
I was on with my colleagues, Michela Paganini and Michael Kagan,
to talk with you about machine learning applications at the LHC.
Yeah, and you do stuff over with CERN at the Large Hadron Collider and things like that.
Yeah, yeah.
So I'm a postdoctoral researcher at the University of Illinois at Urbana-Champaign.
And so there I split my time between working on the ATLAS experiment and working as a software researcher at the Institute for Research and Innovation in Software for High Energy
Physics, IRIS-HEP.
And so on ATLAS, ATLAS is this huge five-story-tall particle detector that lives a hundred meters underground
at CERN's Large Hadron Collider. That's just outside beautiful Geneva, Switzerland. And so
there I work with a few thousand of my closest colleagues and friends to try and look for
evidence of new physics and make precision measurements of physics we do know
about. And then my IRIS-HEP work is kind of focused on working in an interdisciplinary and inter-experimental team to try and
improve the necessary cyber infrastructure and software for us to be able to use in upcoming
runs of the Large Hadron Collider and in what we call like a high luminosity run, which is going to
be way more collisions than normal.
Have you guys ever turned it up to full power yet?
No. So, yeah, the design luminosity, or the design energy, of the LHC is at something
called 14 teraelectronvolts, 14 TeV, and we've been running intentionally at a lower operating
energy for the last couple of years, at just a little bit below that. But in the late 2020s...
We're gonna suck the entire Earth into it and that kind of stuff?
You know, no experimental evidence of black hole creation yet. But kind of the cool thing is that even if we
did make a black hole at the LHC, due to something called Hawking radiation it would evaporate well
before it could actually ever do anything interesting gravitationally. But yeah, it's
really exciting.
Really, but I'm joking. But it's such a cool place, such cool technology.
I mean, that's right out at the edge of physics these days.
And the technology side is neat too.
Yeah, no, it's super fun.
Well, welcome over to Python Bytes.
Yeah, it's great to be here.
Yeah, it's great to have you.
Thanks for coming.
And Brian, I think let's start with another one of my favorite topics.
Farms?
I love farming. You see the bumper sticker,
"No farms, no food." I like food a lot, so I love farms. No, no, but the FARM stack. We've heard of the
LAMP stack, other stacks. Like, LAMP is not as useful as FARM, right? FARM sounds more useful.
So tell us about FARM.
So, Aaron Bassett, I'm not sure, I think he's one of the spokespeople for Mongo or something,
advocate or something like that.
Anyway, he wrote this article, and they've also done,
I think there've been some talks given, but this is a nice article.
It's called Introducing FARM Stack, which is FastAPI, React, and MongoDB.
So I really actually appreciated the article and the code with it, because there's a little to-do CRUD app on GitHub that they've put together.
And the article describes basically all of the pieces of the application
using a little to-do app.
But with FastAPI, you've got this interactive documentation mode where you can
interact with the application almost immediately. You don't have to really do much
to put it all together. And then for all your endpoints, you can actually interact with them, send data, do queries, and there's a little animated GIF to show how that's done.
But the article then goes through and says basically how the endpoints
and routes get hooked up, and then uses Uvicorn to set up an async event loop
and get that going, and shows how easy it is to connect to a database.
And then defining models, and how easy it is to set up a schema.
And then it kind of talks through the code.
You do have to write code for the endpoints, and it shows really how easy those are with all of these pieces.
The React application is kind of a minimal React application.
I'm not sure why they kind of included that, but it's kind of a neat addition.
There's a React application that's running that just sort of shows some of the interaction
with the CRUD app, and it gets updated while you're changing things through the interactive API.
And I liked the demonstration of
working with an API, changing things, and seeing it show up,
having, like, a React app at the other end.
It's kind of a fun way to kind of experiment with an API.
This is a really neat thing.
And one of the other major stacks
that's been used around Mongo is the MEAN stack.
And the FARM stack is way nicer than the MEAN stack,
not just because it uses Python and not JavaScript,
but there's some interesting things here.
One of the examples is actually kind of blowing my mind
in that it's an if statement using the walrus operator
awaiting an asynchronous call in an API method.
Like the walrus operator and async, the await keyword, I've never seen those together.
And it's kind of like, it's inspiring.
It's nice. It's good.
Yeah.
It's such succinct code as well.
It's super nice.
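The walrus-plus-await pattern mentioned above can be sketched with the standard library alone. This is a minimal illustration, not the article's code: `FAKE_TASKS`, `find_task`, and `show_task` are hypothetical names, and the in-memory dictionary stands in for a MongoDB collection that a real FARM app would query through an async driver such as Motor.

```python
import asyncio

# Hypothetical in-memory stand-in for a MongoDB collection (assumption:
# a real FARM app would await an async driver call here instead).
FAKE_TASKS = {"1": {"id": "1", "name": "write show notes", "completed": False}}


async def find_task(task_id):
    """Pretend to query the database for a single task."""
    await asyncio.sleep(0)  # yield to the event loop, as a real query would
    return FAKE_TASKS.get(task_id)


async def show_task(task_id):
    # The walrus operator binds the awaited result and tests it in one expression.
    if (task := await find_task(task_id)) is not None:
        return task
    return {"error": f"Task {task_id} not found"}


found = asyncio.run(show_task("1"))
missing = asyncio.run(show_task("42"))
print(found)
print(missing)
```

The parentheses around the assignment expression are required; without them the `is not None` comparison would be what gets bound to `task`.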
I mean, it uses FastAPI, which is fantastic.
It's using Motor, which is MongoDB's
officially supported Python async library, because you need an async-capable library in order to do
things against MongoDB asynchronously. You know, this actually comes from the developer blog at MongoDB.
There also are some ORM-like things, some ODMs, object document mapper stuff, that also support
async and await with MongoDB.
So if you're more in the ORM style, you might check that out. But other than that, this looks
pretty neat to me.
Yeah. And I do know that a lot of people use the ORMs, but I
appreciated the example without an ORM, because you throw an ORM example in there
and then people that don't use that particular one get lost.
So, yeah. Matthew, do
you guys do anything with MongoDB, any of these kind of things, FastAPI?
Yeah, I have some friends
that do. I personally am not too versed in Mongo, but I've heard about it on the show
many, many times, and elsewhere. So I think also, just kind of paging through the
article as Brian was talking about it, this is pretty impressive.
It's really concise. Like, here's your four lines to completely implement the API
type of things, right? Asynchronous, fast, like all the cool stuff.
Yeah. Yeah, there was an example,
a case study of MongoDB being used at the Large Hadron Collider, but that was many years ago and
I don't know if it still is. I've completely forgotten where that is.
Yeah. Cool, cool. So next thing I want to talk about is another programming
language. Last time, Brian, I went on and on, maybe the time before, two times ago, about .NET and
C#, because Anthony Shaw had done that work on Pyjion to get Python to run on .NET.
And we were like, well, why are we talking about C# on this podcast? Well, I want to talk about something even more advanced: AppleScript.
Wow, cutting edge.
Yes. It's like the CMD shell script of Apple. Have you
ever programmed in AppleScript? It's painful.
No, I've not.
It's like you say, like, tell this
application that, to, like, make a command.
Oh, it's bad news bears.
Let me tell you.
So, what I've come across is this thing called py-applescript.
Now, this is not brand new, but it's brand new to me.
And there's a lot of talk about Macs and people maybe getting new Macs.
So, I thought I would say, hey, look, here's a cool way to automate your Mac or Macs within your company or whatever with Python instead of
this dreaded AppleScript. So basically it's a Python wrapper around NSAppleScript,
allowing Python scripts or applications to communicate with AppleScript and
AppleScriptable applications.
So apps that basically implement AppleScript support and let you do that.
So scripts get compiled either from source or they can be loaded from disk.
Some of these ideas are from AppleScript: the standard run handler and user-defined handlers can be invoked with or without arguments.
The arguments and responses to and from AppleScript are automatically
converted, either from AppleScript to Python types, like a Python string versus an AppleScript one, or vice
versa, right? So you don't have to do the type coercion, which is cool. And they're persistent.
So you can call your handlers multiple times and the script retains its state like AppleScript would.
And it also has no dependency on the legacy appscript library,
the so-called flawed Scripting Bridge framework, or the limited osascript
executable. So that's pretty cool. If you want to automate things on your Mac, you obviously
could use Bash. But if you're talking to some kind of application that implements one of these
scripts, like for example, you want to tell this other application to grab something out of the clipboard
and then tell it to do something or something like that, right?
Like you couldn't reasonably do that with Bash, right?
Once it starts up, you kind of want to go back and forth with it.
So it sounds like AppleScript might be the thing to do.
Pretty cool, huh?
Yeah.
Yeah.
I mean, not a lot to it.
Like, if you've got to script your Apple macOS stuff, do it with Python.
You don't have to do it with that AppleScript stuff.
No, it's neat.
Yeah.
Yeah.
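The handler and persistence points above can be sketched roughly as follows. This is a hedged illustration that only does real work on macOS with py-applescript installed (`pip install py-applescript`); elsewhere `run_counter` returns `None`. The script text, the `bump` handler, and the `runningSum` property are made-up names for the example, and the exact Python types returned depend on the library's conversion rules.

```python
import sys

# Illustrative AppleScript source: a top-level property plus a user-defined handler.
COUNTER_SCRIPT = '''
property runningSum : 0

on bump(step)
    set runningSum to runningSum + step
    return runningSum
end bump
'''


def run_counter():
    """Compile the script once and call its handler twice.

    Returns None when not on macOS or when py-applescript is not installed.
    """
    if sys.platform != "darwin":
        return None
    try:
        import applescript  # module provided by the py-applescript package
    except ImportError:
        return None
    script = applescript.AppleScript(COUNTER_SCRIPT)
    first = script.call("bump", 2)   # Python int in, converted AppleScript result out
    second = script.call("bump", 3)  # same compiled script, so the property persists
    return first, second


print(run_counter())
```

Because the compiled script object is kept around between calls, the second call should see the state left by the first, which is the "persistent" behavior described above.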
So, Matthew, you probably brought something to do with physics, data science, I'm guessing.
What's your first one here?
Yeah, a bit.
So we currently live in this, like, really nice age of having awesome CI services and all these really nice metrics for all your
GitHub projects and everything. So I'm thinking of, like, coverage. So if you're,
you know, using pytest and making sure that you're reporting your coverage,
you have all these really great services to also track your coverage and report that in
a shiny badge. But let's say you're developing some tool or some library and you have
some sort of performance metric that you care about. Let's say, like, the
speed of evaluation for certain expensive functions. And you actually want to try and
track that through the entire history of your code base. And that's not something that's
traditionally very easy to do. So recently, I was really happy to find...
So if you're making changes, so if
you're going to be adding some feature, or you're refactoring it so it's easier to write,
but you're not sure if that makes it faster or slower, this would sort of give you that information
from week to week or something like that?
Exactly. Yeah. So you might go ahead and
say, like, okay, well, I have some tests that make sure that this function
evaluates in under some period of time,
if it's an expensive function for your test.
But let's say you actually want to track,
across different parameterizations,
how that function is actually performing
in your whole code base.
So I've recently found this super cool tool
written in Python called airspeed velocity.
And so from the docs, asv, airspeed velocity, is a tool for benchmarking Python packages over their lifetime.
So it deals with runtime, memory consumption, and even custom-computed values.
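As a rough sketch of what that looks like in practice, asv discovers benchmarks by naming convention in a plain Python module. The prefixes `time_`, `peakmem_`, and `track_`, and the `params`, `param_names`, and `setup` hooks are asv conventions; the function `lerp_all` and the suite itself are hypothetical stand-ins, not taken from any project mentioned here.

```python
# Hypothetical contents of benchmarks/benchmarks.py in an asv-enabled repository.


def lerp_all(xs, lo, hi):
    """Illustrative stand-in for the library function being benchmarked."""
    return [lo + (hi - lo) * x for x in xs]


class InterpolationSuite:
    # asv runs every benchmark once per parameter value and plots each separately.
    params = [100, 10_000]
    param_names = ["n_samples"]

    def setup(self, n_samples):
        # Runs before each benchmark and is excluded from the measurement.
        self.xs = [i / n_samples for i in range(n_samples)]

    def time_lerp_all(self, n_samples):
        # Methods prefixed with time_ are timed.
        lerp_all(self.xs, 0.0, 1.0)

    def peakmem_lerp_all(self, n_samples):
        # Methods prefixed with peakmem_ record the peak memory of the process.
        lerp_all(self.xs, 0.0, 1.0)

    def track_output_length(self, n_samples):
        # Methods prefixed with track_ record whatever number they return.
        return len(lerp_all(self.xs, 0.0, 1.0))
```

With a file like this in place, `asv run` collects results across commits and `asv publish` followed by `asv preview` builds and serves the static dashboard.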
And the results are then displayed in a super nice web front end that's interactive and basically just requires
static web page hosting. So it's pretty impressive. And if you click on the
docs, you can see that it's developed by a community of people, but led by Michael Droettboom,
I'm probably getting your name wrong, very sorry, and Pauli Virtanen. But if you look at...
He's the guy who was behind Pyodide at Mozilla.
Oh, really? Oh, okay. Yeah, that's a super cool project.
So yeah, for sure.
Yeah. And so, I mean, if you look at the other people that are on the contributor
list, you can spot a lot of names that are common in the SciPy and Jupyter ecosystem. So you already know that this is a nice community-built tool.
And then also, as kind of some example cases,
they give current projects that are using it,
like NumPy and SciPy and Astropy.
So pretty well-established projects.
And just as kind of like an example,
if you click on the SciPy project and go to the interpolate benchmarks there,
you can just kind of look at a very nice visualization of the actual evaluation time
on the vertical axis across a whole bunch of parameterizations, such as CPython version
and number of samples that are being run. And you can see this for the entire lifetime of the code
base, and you can zoom in on any section just with the mouse.
And something I think that is super, super cool is if you're looking at the visualization of the plot and you see that, oh, there's like one commit where all of a sudden things go funky and the evaluation time just jumps up.
You can just click on that node and it immediately opens up to that commit in GitHub, which is, I think, super awesome that you don't have to go and like search through your commit history to figure out what like where that corresponds
to.
It's just boom, right?
I'm looking at it. It shows the SHA from GitHub.
Yeah, the unique identifier of the commit.
That's crazy.
Yeah.
So, yeah, on a project that I'm working on, we've been interested
in trying to have this sort of metric tracking for some of our work. So this is something that I'm actively
looking at, how we might be able to deploy this for one of my projects with my co-authors.
But it's openly developed on GitHub. It's up on PyPI as well, so just pip install asv.
And then I think something that's very cute and very Pythonic is that
when you go to the reporting dashboard for the different libraries that you're actually
benchmarking, it will, up at the top, say "airspeed velocity of an unladen X". So the airspeed
velocity of an unladen NumPy or an unladen SciPy. So, you know, keeping very true to
Python's roots there.
There's some Monty
Python, the show, zen in there for sure.
Exactly.
Yeah, this is impressive. I mean, Brian,
how do you see this fitting into, like, testing and stuff?
I actually love this. I could use this
right away. Performance
is always something we care about, and
benchmarking systems. And, you know, with testing, it's something you forget about
sometimes. You're running stuff and it still works, but over time things slow down,
and it's good to know that.
Yeah, and if this could just be automatic and just part of
your CI, you just go back and see the updates.
That'd be very cool.
Definitely.
Yeah.
At the moment, and I'm happy to be corrected about this, I don't think this is currently being offered as, like, a CI service.
But I think that this is something that you could like set up and run for yourself pretty easily.
Yeah, you could probably plug it in.
Yeah.
Yeah, exactly.
But you could probably do some kind of web hook when there's a check-in, automatically
kick it off and then save a result, right?
You could just hook into GitHub Actions and then have it just call you back and start
your, you know, let's take a record of this or whatever.
Yeah.
Yeah, very cool.
This is a great idea.
Yeah. Something else that I haven't really investigated yet, but that I'm looking into,
is if this can also be used to do, like, GPU benchmarking. So let's say you have a library
where you can use the APIs to transparently
move from CPU to GPU, like something built on JAX or TensorFlow or PyTorch. Then this
might be kind of a nice way, if it's based on those, to be able to benchmark your GPU
performance as well.
Yeah.
Well, and that's one of the things you might not test, right?
If it could run either way, you might just run it on your machine, whichever one of those
it is, and forget to try the other style, right?
Exactly.
Yeah.
And I don't think there's too many CI services that are gonna, you know,
generously give you some, like, really nice GPUs to be doing benchmarking on.
Yeah, that's for sure.
For sure. All right, now, before the next item, let me tell you about our sponsor. This episode is brought
to you by Techmeme, the Techmeme Ride Home podcast. For two years they've been recording episodes every single day.
So they're Silicon Valley's favorite tech news podcast.
And you can get them daily, 15 to 20 minutes, exactly by 5 p.m. Eastern, all the tech news you want.
But it's not just headlines, much like Python Bytes, actually.
It's a very similar show, but for the broader tech industry.
You could have a robot read the headlines or just flip through them.
But it has the context and the analysis all around it.
So it's like tech news as a service, if you will.
So the folks over at Techmeme, they're online all day reading to catch you up.
And just search your podcast app for the Ride Home and subscribe to the Techmeme Ride Home podcast.
Or just visit pythonbytes.fm/ride to subscribe.
I have a theory, a hypothesis,
about this. I think that it'd probably actually be a ton of work to put together a show daily
on a timeline like that, but it's great that they're doing it. Do you have any other hypotheses, Brian?
Yes. My hypothesis is that there's not enough examples out in the world of how people are using Hypothesis in the field in real-world
applications.
So I'm
excited that Parsec put it together.
So Parsec... Well, let's take a real quick
step back just for people who don't know.
What is Hypothesis? Oh, okay, right.
Hypothesis is a testing framework.
Well, it's not really... It attaches to
other testing frameworks, so you can use it with
unittest or pytest.
You probably should use it with pytest.
But instead of writing a single test case with fixed inputs, you do
property-based testing.
So you describe kind of...
It's not like "I expect one plus two equals three." It's "I expect that if I add two integers and they're both positive, the result is going to be greater than both of them."
You know, you have these properties that describe what the answer is like.
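That positive-integers property can be written down directly with Hypothesis. This is a minimal sketch assuming the `hypothesis` package is installed; the test name is made up, and in a real project pytest would collect and run the function rather than it being called by hand.

```python
from hypothesis import given, strategies as st


# Both strategies generate integers >= 1, i.e. strictly positive inputs.
@given(st.integers(min_value=1), st.integers(min_value=1))
def test_sum_exceeds_both_addends(a, b):
    total = a + b
    # The property: the sum of two positive integers is greater than each of them.
    assert total > a
    assert total > b


# Calling the decorated function runs the property against many generated inputs.
test_sum_exceeds_both_addends()
```

If the property were false for some input, Hypothesis would report a shrunken, minimal failing example rather than whatever large random pair first tripped the assertion.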
And the examples that Hypothesis and other tutorials on how to use Hypothesis have given are more of these
a-plus-b sort of things. They're simplistic things. And I do see a lot of value in
Hypothesis, and I know a lot of people are using it, but there haven't been a lot of good descriptions
of how it's really being used, like a real-world example of how it's being used. Because
I'm probably not going to...
I don't have those little tiny algorithm things.
I've got big chunks of stuff, and Hypothesis does have to run the test many
times.
So how do you do this effectively on a large project?
So I love seeing this article.
So Parsec is
a client-side encrypted file sharing service.
I'd never heard of them before this blog, but it sounds cool.
It's cool.
They describe themselves as a zero-trust file sharing service, like Dropbox, where it's end-to-end encryption for Dropbox.
Yeah.
You could share the files, but it only matters if you actually have the key, right?
Right.
Actually, I have no idea.
Sure.
I suspect so, yeah. It sounds like a cool service, actually.
It sounds pretty neat. So they describe kind of what they're doing there
and some of the problems. It's a large, four-year-old asynchronous Python project.
And then they describe this RAID redundancy algorithm that they need. It's fairly complex, with a bunch of servers and stuff,
a bunch of data stores going on.
And what they need to test is they need to check things like
if the blocks can be split into chunks
and if the blocks can be rebuilt from the chunks that were split up before.
And then if you can rebuild them if you've got missing chunks.
And so this
all sounds fairly, you know, yeah, I can understand how you could try to test that, but there's a lot of variables in there. How big is the chunk size? How many chunks?
How much stuff should be missing? And all that sort of stuff. And then they're
thinking, yeah, Hypothesis would be good for that. The normal tutorials talk about a stateless way to test with Hypothesis,
but they're saying that for them, the stateful method that is supported is very useful
because they're an asynchronous system and they describe how to do that.
It's actually a fairly complex description, and
it's kind of a lot to get through,
but it's neat that the power's there.
So it walks through
exactly how they set up a test like this.
And this is something I think the
testing community considering Hypothesis has been missing.
So this is great.
They end with some recommendations,
which is great.
So the recommendation is about
which parts of your system you should throw Hypothesis at.
That's a really good question,
because you don't want to throw it at everything,
right?
Because there is some expense to set it up and also to run everything.
So they describe it as: if the piece you're testing is kind of an encoder/decoder thing,
like theirs is, where you're splitting things into chunks and then rebuilding things,
Hypothesis is a no-brainer for that, because you can compare: is my input the same as the encoded-then-decoded output?
The other case is if you have a simple oracle,
like it's simple to test the answer,
but it's complex to come up with the answer.
I'm not sure what that is, but, you know,
some of the cases are,
I've got a complex system and
there's properties about the output
that are easy to describe. The other one is, yeah, I guess it's similar: if it's hard to
compute but easy to check.
Well, one example that just jumps out at me right away is anytime you
have a file format. I'm going to save this thing, be able to save and load these files, right? Because
all you've got to do is load up a whole bunch of random data and say: save, load, is it the same? If it's not, that's a problem.
Yeah. Yeah. And actually,
I have talked with some people that have thrown this at some of the standard
library modules, just on the side, to test, because there's a lot of standard library stuff that's like kind of encoding, decoding sort of thing,
or two-way conversions.
Yeah, cool.
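The "save, load, is it the same?" idea can be sketched without any library at all. The version below hand-rolls the random inputs with the standard `random` module purely to show the shape of a round-trip property; Hypothesis would generate and shrink the inputs for you. The `split_into_chunks` and `rebuild` helpers are hypothetical and far simpler than the redundancy scheme in the Parsec article.

```python
import random


def split_into_chunks(block, chunk_size):
    """Encode step: cut a bytes block into pieces of at most chunk_size bytes."""
    return [block[i:i + chunk_size] for i in range(0, len(block), chunk_size)]


def rebuild(chunks):
    """Decode step: join the pieces back into one block."""
    return b"".join(chunks)


def check_round_trip(trials=200, seed=0):
    """Check that decode(encode(x)) == x for many random blocks and chunk sizes."""
    rng = random.Random(seed)  # seeded so any failure is reproducible
    for _ in range(trials):
        size = rng.randint(0, 512)
        block = bytes(rng.getrandbits(8) for _ in range(size))
        chunk_size = rng.randint(1, 64)
        assert rebuild(split_into_chunks(block, chunk_size)) == block
    return trials


print(check_round_trip())
```

The same skeleton applies to any two-way conversion: swap the two helpers for a save function and a load function and keep the equality check.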
Yeah, this is super nice.
I'm going to have to really dig into this article in more detail.
I remember the first time I learned about Hypothesis
was when one of the core devs gave a talk at SciPy 2019,
and it just blew my mind then.
And so this is so cool to see this very,
very interesting application here. Yeah. Yeah. It seems like there's a lot of uses in data science.
Data science seems tough to test, like that scientific computation side, because with slight
variations you might not get perfect equality, but it's close enough, right? It's like, well,
it's off, but it's off by, you know, 10 to the
negative 10th or something, right? That doesn't actually matter, but the equality fails.
Yeah, yeah.
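The failing-equality problem is easy to reproduce with two floats. The sketch below uses only the standard library's `math.isclose`; NumPy offers array-oriented counterparts such as `numpy.isclose` and `numpy.testing.assert_allclose`, and pytest has `pytest.approx`. The tolerance value is an arbitrary choice for illustration.

```python
import math

computed = 0.1 + 0.2
expected = 0.3

# Exact equality fails because of binary floating-point rounding.
print(computed == expected)

# Comparing within a tolerance treats the two values as equal.
print(math.isclose(computed, expected, rel_tol=1e-9))
```

Picking the tolerance is the real judgment call: it should reflect the precision the computation can actually deliver, not just whatever makes the test pass.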
You end up using NumPy's approximate comparison schemes quite a bit in
your pytest.
I can imagine. Very cool.
All right, next one. Brian, I told you about this last time: I'm still waiting on my Mac
mini, right?
I ordered the Apple M1 Mac mini, maxed out, and I'm a little bit jealous.
My daughter is getting a new Mac mini.
Or, Mac Air.
She doesn't know about it, but it's supposed to show up tomorrow, and mine's still weeks away.
And I don't think that that's very fair.
But if you are an organization
that depends on cloud computing
and what organizations don't these days, right?
They almost all do.
It was just announced at re:Invent
that AWS is going to be offering Mac instances
as a type of VM.
So until now, you've been able to get Windows, Linux. That's it.
So for all those people out there who are offering some kind of iOS app, even if they're not like a
Mac shop, they still have to have Macs around, because you can't compile and sign your IPA,
or whatever the iPhone app format is. You can't create those without a Mac. So there's
all these Macs that are around for, like, continuous, you know, CI/CD or checking those things and whatnot. So now you can go to AWS and say,
I'll take a Mac mini, please. That's pretty cool. That's cool. Yeah. So you can do your tests up
there, and they don't have M1 yet. Those are the Intel ones, but the M1 chips are coming later.
So you'll be able to do it. What's interesting about this offering from AWS is, as with basically any cloud service,
you would imagine it's a VM, right?
But these, when you say I want one of these,
you actually get a dedicated Mac mini.
You get pure hardware.
Well, that's why you can't get yours
because Amazon bought them all.
They did.
They had a huge truck full of them.
Well, they bought the Intel one.
So those were on sale, I bet anyway.
But no, they have some interesting, what do they call it, Nitro. I think they call it their Nitro service or something
like that, which allows them to virtualize actual real hardware. So this is pretty neat. You can sign
up. The billing is interesting. You have to pay for at least one day's worth if you get it, which I
think is like $24. If you're going to run it continuously all
the time, this is one pricey sucker. Like, the Mac mini you can get now is $700. This is $770 a month.
Oh, okay.
So if what you need is like a couple of Mac minis, and you need them on all the
time, you're probably better off just buying a few and sticking them in a closet,
especially the M1s.
But if you just need one on demand
every now and then,
or you need to burst into them
or something like that,
that could be interesting.
Yeah.
Yeah.
If you're back old school
and you only release
like once every three months.
Well, there was some conversations like,
well, if your data
is already stored in S3
and you have like huge quantity of data
and what you need to run is actually, like, some video processing on the Mac, you
could do it right by the data instead of transferring it, that kind of stuff.
Things like that might be interesting.
I don't know.
I would go ahead and throw out there also that this is all interesting.
I have links to this kind of stuff and whatnot, like the blog post announcing it and so on.
But there's also this thing called MacStadium.
And if you look at MacStadium, it's pretty interesting.
You go over there and say,
give me a dedicated bare metal Mac mini in their data center,
$60 a month.
So you can actually get a decent one
for a decent price over there.
So if you just want one running all the time,
it might be good.
But the thing is, if you're already, like, deeply integrated into AWS,
maybe this is a good thing.
Yeah.
Yeah.
Matt, is there anything you...
Yeah, go ahead.
I was just going to say,
this seems pretty interesting.
I mean, I know one of the reasons
that I love using GitHub Actions
and Azure Pipelines
is the ability to be able to get access
to Mac VMs for builds.
But if you... I could also see this being really interesting and useful
if you have some very huge application
or some very large stack that you
want to be able to do CI or tests on,
that this could be really, really nice,
especially if you don't just want
to be pounding and destroying one Mac over and over
and over again.
This is nice, especially if you have a distributed team.
Yeah. Which every team is basically a distributed team.
Yeah. Yeah. Welcome to 2020.
One thing that's interesting about this is you can literally press a button, or even just through the AWS, probably the Boto, API, you can just make a new Mac instantly. Like within seconds, you can have a clean,
pre-configured Mac.
You can create AMIs,
the Amazon machine image,
which are like,
I install a bunch of stuff
and get it set up
and then like save it
so I can respawn new machines from it.
Those are pretty interesting options
that you don't get from just having a Mac mini in the closet.
you know, push a button,
make a brand new one,
try this, throw it away,
make it a different way,
throw it away.
Like there are some use cases here that could be interesting. That said, I won't be using
it. I'm just going to buy a Mac mini if I can ever get it. All right, Matthew, what's this last one
you got for us? Yeah, I don't have any clever transition, but all right. So maybe, I don't
know about you, but I end up having to deal with a lot of JSON serializations of different statistical models
and sometimes also getting CSVs of different data sets that I want to be doing analysis on.
And your first instinct might just be to say, okay, I'm just going to open this up in Pandas
and start to get to work on it.
But if you're kind of used to and comfortable with working in the Linux command line,
kind of ecosystem of data tools,
you might be itching a little bit
and wanna kind of just peek inside
at the command line level and kind of get to work there.
And so in that case, you might be really interested
in this tool called VisiData.
So VisiData is written on-
This is blowing my mind actually.
Yeah, it's like when I saw this, my jaw was kind of on the floor.
So we'll make sure this is linked in the show notes because it has some really cool videos.
But so from the docs, so it's visit data is described as data science without the drudgery.
So it's an interactive multi-tool for tabular data.
It combines clarity of spreadsheets with efficiencies of being at the
terminal and also, you know, the power of Python 3 on a really lightweight utility that can handle
millions of rows with ease. I can attest to that personally. I've opened up like four gigabyte
CSV files before and it just, you know, drops right in and starts asynchronously loading like
a champ. In addition to that, it supports a really astounding number of file formats.
Currently on the website, it says it supports 42 different file formats.
So it supports things that you would expect like CSV and JSON.
But then it also supports things like Jira.
I guess like whatever Jira uses for their sort of like tabular stuff.
It can also read MySQL.
And I guess it can also even deal with PNG,
the image file format,
which I was,
you know,
impressed by.
So this is all openly developed.
The output is a terminal,
right?
Yeah.
Like text.
Yeah.
Yeah.
So this is all openly developed on GitHub by a guy named Saul Pwanson, I think.
And if you go to the VisiData website, it also has plenty of links to live
demos of him doing kind of interactive examples of visualizations. There's one lightning talk
that he gave at, I think, PyCascades 2018 or something like that, where he's able to basically find complaints about rodents and
then filter on rat complaints and then plot that inside of VisiData, still in the terminal, to
basically make a visualization of like rodent distribution in the New York City boroughs.
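For anyone reading along, here is a rough sketch in plain Python of the filter, count, and plot workflow described in that demo. It is not how VisiData does it, and it is not the real dataset: the sample rows, the column names (complaint_type, descriptor, borough), and the helper names are all made up for illustration.

```python
import csv
import io
from collections import Counter

# Made-up sample rows standing in for a complaints CSV.
# The column names are assumptions, not the real dataset's schema.
SAMPLE_CSV = """complaint_type,descriptor,borough
Rodent,Rat Sighting,BROOKLYN
Rodent,Mouse Sighting,QUEENS
Noise,Loud Music,BROOKLYN
Rodent,Rat Sighting,MANHATTAN
Rodent,Rat Sighting,BROOKLYN
"""


def count_by_borough(csv_text, descriptor_keyword="rat"):
    """Count rodent rows per borough whose descriptor mentions the keyword."""
    counts = Counter()
    for row in csv.DictReader(io.StringIO(csv_text)):
        is_rodent = row["complaint_type"] == "Rodent"
        if is_rodent and descriptor_keyword in row["descriptor"].lower():
            counts[row["borough"]] += 1
    return counts


def text_histogram(counts):
    """Render one bar of '#' characters per key, largest count first."""
    width = max((len(key) for key in counts), default=0)
    lines = [f"{key:<{width}} {'#' * n}" for key, n in counts.most_common()]
    return "\n".join(lines)


rat_counts = count_by_borough(SAMPLE_CSV)
print(text_histogram(rat_counts))
```

The point of the sketch is only the shape of the workflow: narrow the rows, tally by a column, and draw the result as text so it never has to leave the terminal.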
So I thought that was, you know, quite amusing and really cool. It's also, you know, this is a
Python application. So you might not want to,
you know, continuously install this in every single virtual environment you make.
So, I mean, it is up on PyPI, so you can just do pip install visidata. But since it's an
application, you probably might also want it just kind of as a generic tool in your machine. So
it's distributed through a lot of nice common package managers.
So if you're on Linux, they've got it on Apt, as well as things like Nix and Guix.
But I didn't see it on Yum.
So if you're on Fedora or CentOS, you might be a little bit out of luck.
You might have to do it manually.
It's, of course, on Homebrew and even conda-forge. And it's not listed there, but you can also use a very, very cool tool
that's been featured on the show before,
which is pipx by Chad Smith.
Yeah, pipx is awesome.
It's so good.
I love it.
I tested this last night.
I just fired up a Python 3.8 Docker container
and went ahead and installed pipx
and then used pipx to install VisiData
and was able to drop right into VisiData as expected. So it's very, very cool, and just the power that you can have with it, I think, is
worth checking out for anybody who is doing data analysis with tabular data.
This is super cool. I love when people build these tools that you don't really expect to be so powerful.
And you talked about how you just dropped in and grabbed some random data and started answering questions.
And that's super neat.
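As a small illustration of the install options mentioned above, here is a hedged Python sketch that picks between pipx and pip and then opens a file with vd, the command that VisiData installs. The helper names are hypothetical, and it assumes pipx, if present, is on your PATH.

```python
import shutil
import subprocess
import sys


def visidata_install_command(prefer_pipx=True):
    """Return the command that would install VisiData, as a list of arguments."""
    if prefer_pipx and shutil.which("pipx"):
        # pipx installs the application into its own isolated environment,
        # so it does not need to be reinstalled in every virtual environment.
        return ["pipx", "install", "visidata"]
    # Otherwise fall back to pip for the interpreter running this script.
    return [sys.executable, "-m", "pip", "install", "visidata"]


def open_in_visidata(path):
    """Open a file in VisiData's terminal UI (hypothetical helper)."""
    vd = shutil.which("vd")
    if vd is None:
        hint = " ".join(visidata_install_command())
        raise FileNotFoundError(f"vd not found on PATH; try: {hint}")
    return subprocess.run([vd, path], check=False)


if __name__ == "__main__":
    # Only print the suggested command; nothing is installed here.
    print(" ".join(visidata_install_command()))
```

Nothing is installed when this runs; it only prints the command it would suggest, so you can decide which route you want.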
Yeah.
Yeah, the number of inputs.
And because it's open source and because of all the other examples of data types, I think even if you have a different data type, it shouldn't be too hard to modify this to handle something different.
I do notice, and I'm excited about it,
that it does handle pcap files for packet
capture. These are for communication packets. Talking to all your devices and all your hardware
at your company, right? Well, this is like even Wi-Fi packets and cellular packets. That's how we
debug those. So nice. It's very cool. Yeah, very cool. And pipx is great. I install a bunch of apps that way,
like Glances, which is a fantastic
way to visualize the state of your machine, you know, like top but way, way better. HTTPie, which is
like curl but much, much better. But the most important thing I install that way is pyjokes,
so now I can type pyjoke on my command line and they're always right there. So speaking of which, let's move on to our extras.
That's all of our main topics. Brian, you got anything this week? Oh, I did. I haven't
dropped them in. Where'd my extras go? Yeah, well, you got it. I just wanted to bring up that
PyCon 2021 is going to be virtual, and there's a website up. It's us.pycon.org slash 2021.
Um,
and,
uh,
there's not a lot there yet,
but you can check out what's going to happen.
I mean,
it's not surprising. They
have to start planning it, and they may as well plan it as a virtual
event.
Um,
I was kind of hoping that we would have it live,
but I understand.
Yeah.
I mean, PyCon is my geek
holiday. I love it. It's both work, but it's also just such a nice getaway to connect with everybody,
everyone else we know from the community, listeners. I'm gonna miss not having it. Yeah, I'm
glad... Do you attend? Sorry, Brian. Yeah, it's good that they're... I always check whenever they
announce the date to make sure it doesn't overlap Mother's Day.
Oh, yeah.
That's not good.
Yeah.
So I have unfortunately not attended PyCon yet in person,
or, I mean, well, it was canceled this year,
so maybe I'll attend this year remote.
But I'm a regular attendee of the SciPy conference,
and this past year, SciPy 2020 was moved online. And I thought that the organizers
did a fantastic job of actually running it online while still, you know, keeping kind of that SciPy
community feel. So that was helped a lot also by, you know, plenty of bad puns. So I think that
might be something
that still comes through for PyCon 2021, maybe. Yeah, absolutely. One of the live listeners,
Muhammad, asked if it's gonna cost money or if it's gonna be free this year to attend.
Did you notice anything, Brian? I haven't looked. I'm looking around and I don't know that it costs anything. From what I can
tell, I don't see any pricing. What I saw was sponsor information to get sponsors to sign up
to be part of whatever they're doing there, but I can't tell. Yeah, if somebody knows, throw it in the
chat or, you know, visit pythonbytes.fm slash 211 and put it in the comments down there.
All right, I got a couple here. First of all,
we're trying out live streaming here and I think it's going pretty well. It seems like
it's working out. There's a bunch of people watching. So if you want to get notified and
we happen to keep doing this, just visit pythonbytes.fm slash YouTube and it should have
like the scheduled upcoming live stream. You can like get notified there. So we'll, maybe we'll
keep doing this. It's been fun. Thanks for everyone out there who's watching right now. And in addition to PyCon,
which you just announced or mentioned the announcement of, that is the main way that
the PSF is funded. But they're also doing a dedicated offering sort of fundraiser thing
with six companies to help raise some money for the PSF, and Talk Python Training is part
of that. And 50% of the revenue of a certain set of our courses that are sold during the month of
December goes directly to the PSF. And people who buy those courses through the PSF fundraiser also
get like 20% of a discount. So there's a link in the show notes for people to take some of our
courses and donate to the PSF. If you'd rather just directly donate, that's fine. But if you're
looking to get some of our courses anyway, you can do it this way and support the PSF.
They're hoping to raise $60,000. Hopefully we can do that for them and we'll see.
And Brian, you announced big PyCon. Another thing that got announced is small PyCon,
PyCascades, Cascades being the mountain
range that connects Portland, Seattle, and Vancouver. And traditionally this conference
has cycled between those three cities. I don't even remember anymore where it was supposed to
be this year. I think it's supposed to go back to Vancouver, but it's not going to Vancouver
because nobody's going anywhere. So PyCascades is online and those do cost money. It's $10 for
students, $20 for individuals and
$50 for professionals to support that conference. But I'll link to that one since that's one of our
local conferences, if you will. Yeah, they often push what's going on
and kind of try new things. So it's a neat conference. Yeah. Yeah. I enjoy my time there
as well. All right, Matthew, what have you got for us? Anything else you want to give a shout-out to? Yeah, just a few items. So Advent of Code 2020 has started now. It's day two,
but there's still plenty of time to get involved with that if you want to. And for those of you
who might not know, Advent of Code is just an annual kind of coding challenge that takes place
every December. And it's just basically 25 days of fun and interesting programming challenges.
So it's always a great opportunity to try and brush up on your Python and maybe learn about
some interesting collections that you might not have known about in the standard library.
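As a quick taste of the kind of standard library collections that tend to come in handy for puzzles like these, here is a minimal sketch. The inputs are arbitrary examples, not taken from any actual Advent of Code puzzle.

```python
from collections import Counter, defaultdict, deque

# Counter: tally items, handy for frequency-style puzzles.
letter_counts = Counter("abracadabra")
most_common_letter, times = letter_counts.most_common(1)[0]

# deque: a double-ended queue that can rotate in place.
ring = deque([1, 2, 3, 4, 5])
ring.rotate(2)  # shift every item two positions to the right

# defaultdict: group values without checking for missing keys first.
by_length = defaultdict(list)
for word in ["py", "go", "rust", "java", "c"]:
    by_length[len(word)].append(word)

print(most_common_letter, times)
print(list(ring))
print(dict(by_length))
```

All three live in the collections module, so there is nothing to install.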
So that's going on right now, worth checking out, I think. And then I'm going to sneak in some very small physics-related follow-up to
Python Bytes episode 205, in which Awkward Arrays were talked about. So the lead developer of
Awkward Array is my friend and colleague Jim Pivarski, who is one of my Scikit-HEP
co-collaborators, as well as a member of IRIS-HEP. And as of today, which is
recording day, December 2nd, an Awkward v1.0 release candidate is up on PyPI. So by the time that
this goes live, if you just do pip install awkward, you should get the Awkward 1.0 release instead of
having to do... No more awkward1. Exactly. No more awkward1, no more awkward0.
All that jazz. It's so good to have the actual install statement
be awkward itself.
Exactly.
So that's a nice little tidbit.
And I think that there's some nice links
in episode 205
if people want to learn more about awkward.
But that's kind of a backbone
of kind of the Pythonic ecosystem
for physics right now.
And then finally,
I just want to give some
kudos to Python Bytes as well, specifically for making full transcripts of the shows available
to view on pythonbytes.fm. Not only is this, I think, like a cool idea in general, but I think
this also makes the show more inclusive to the deaf Python community, which is definitely out
there. And one of my good friends and co-authors is deaf.
And I know that he definitely appreciates this.
So good job on you guys for being more inclusive of the wider community.
Oh, that's so cool.
I didn't know anybody was utilizing it.
Yeah, that's awesome.
Thank you.
I think it's absolutely critical for that because the format is only audio.
But a lot of folks have reached out and said they also appreciate it if they're English as a second language.
And they're not as good with English as well.
So that also helps, I think.
They're like, what was I saying again?
What a weird word.
Awkward array?
Why would they talk about that?
It doesn't make sense.
Yeah, transcripts and closed captioning is just more inclusive for everyone.
So that's awesome. Yeah, thankss and closed captioning is just more inclusive for everyone. So that's awesome.
Yeah, thanks.
All right.
Well, let's wrap it up with a joke, huh, Brian?
Yeah.
All right.
So you guys, I'm going to need your help here.
I'm going to let, Matthew, I'm going to let you pick.
Do you want to be Windows or Apple?
I'll be Windows.
All right.
Brian, you be Apple.
So the idea is like the title here is how to fix a computer, any computer. So instructions for Windows. Go ahead, Matthew.
So step one, reboot. And then the flowchart goes to did that fix it? If no, proceed to step two. Step two, format your hard drive and then reinstall Windows. Lose all of your files and quietly weep.
Brian, Apple doesn't have that problem. There's some totally different solution there.
Okay. For Apple, it's step one, take it to an Apple store. Did that fix it? If no,
proceed to step two. Step two is buy a new Mac, overdraw your account, and quietly weep.
That's me right now. All right. I got the Linux fix. It's so easy. It's totally like,
you don't need those things. So you learn to code in C++. You recompile the kernel. You build
your own microprocessor out of spare silicon you have laying around. You recompile the kernel again.
You switch distros. You recompile the kernel again, but this time using a CPU powered by the
reflected light from Saturn. You grow a giant beard. You blame Sun Microsystems.
You turn your bedroom into a server closet
and spend 10 years falling asleep
to the sound of whirring fans.
You switch distros again.
You abandon all hygiene.
You write a regular expression
that would make any other programmers cry blood.
You learn to code in Java.
You recompile again,
but this time while wearing your lucky socks.
Did that fix it?
No.
Proceed to step two.
Revert back to using Windows
or Mac. Quietly weep.
There's really no good outcome here.
They all end in quietly weep.
As a Linux user for the better part of a decade,
I can neither confirm nor deny
how accurate that last part is.
Yeah, they all have their own special angle.
It just takes longer to get there with Linux
to get to your destination, I guess.
Yeah.
All right.
Well, that's fun as always.
And everyone watching on YouTube,
thanks for being here live and everyone listening.
Just thank you for listening.
Matthew, thanks for joining us.
Hey, thanks so much for having me.
This was really fun.
Yeah, yeah.
Great items you brought.
Enjoyed them.
And Brian, thanks as always, man.
Thank you.
It's been fun.
Yep, yep.
See ya.
Bye.
Thank you for listening to Python Bytes.
Follow the show on Twitter via at Python Bytes.
That's Python Bytes as in B-Y-T-E-S.
And get the full show notes at pythonbytes.fm.
If you have a news item you want featured,
just visit pythonbytes.fm and send it our way.
We're always on the lookout for sharing something cool.
On behalf of myself and Brian Okken, this is Michael Kennedy. Thank you for listening
and sharing this podcast with your friends and colleagues.