CppCast - Unit Testing
Episode Date: September 3, 2020
Rob and Jason are joined by Oleg Rabaev. They first discuss some papers from the latest ISO mailing and a new feature in Microsoft's vcperf tool. Then they talk to Oleg Rabaev about unit testing methodologies and why it's important to write testable code.
News: 2020-08 mailing; Introducing vcperf /timetrace for C++ build time analysis; Question regarding optional virtual destructor in C++20
Links: Bloomberg Blog
Sponsors: Clang Power Tools
Transcript
Thank you. Now, get ahead of bugs by using LLVM static analyzers and C++ Core Guidelines checks from the comfort of your IDE.
Start today at clangpowertools.com.
In this episode, we discuss ISO papers and a new build time analysis tool for Microsoft.
Then we talk to Oleg Rabaev.
Oleg talks to us about unit testing
and the importance of making your code testable. Welcome to episode 263 of CppCast, the first podcast for C++ developers by C++ developers.
I'm your host, Rob Irving, joined by my co-host, Jason Turner.
Jason, how are you doing today?
I'm all right, Rob. How are you doing?
Doing good.
I wanted to ask more about this announcement I saw on Twitter right after recording last week.
Is that something you wanted to talk more about?
Well, I mean, my announcement that I'm working on a book was preempted. So it's out there for pre-release at the moment. I'm still working on it. Let's talk about it more after I actually officially finish the book, but listeners who are curious can go and find it. It's out there.
Well, can you at least say just how far along it is so far?
Oh, I would say probably about 85 to 90% done with a book on C++ best practices. It's supposed to be just simple, actionable items; that's my goal. So each section is intentionally very short and simple.
Okay. But as you're saying there, you hadn't planned on officially announcing yet, but it kind of got out there a little bit last week.
Well, I hit the "make this book available" button, because I had talked about it on Twitter that I was going to do it. And then I said, let people register to be notified when it was available for pre-sale. And then I made it available for pre-sale.
And then people started talking about that on Twitter.
But I was still just going to try to keep it quiet for a little bit longer until the book was done.
I apologize then.
That's fine.
We'll talk more about it when it's finished then.
Okay.
Right.
All right.
Well, at the top of the episode, I'd like to read a piece of feedback. This was from a few weeks ago when we had Slobodan on talking about his book,
Modern C++ for Absolute Beginners.
And this commenter on Reddit says,
Stop calling it modern. It's been modern for almost a decade now.
I get that C++ is a slow-moving glacier,
but the qualifier adds no useful information
and just makes results more difficult to search for in the future.
Idioms change, and post-C++20
I really don't want to continue searching for modern C++
only to find pre-C++20 material.
I think it's probably been more than a decade.
The modern C++, that term came about way before C++11, right?
It did, but the way we all use it right now, I think,
was because of Scott Meyers'
book, Effective Modern C++,
which was 2012 that it was actually released, I think.
I have to double check that.
So yeah, I'm not really sure if you're ever going to be able to get people to stop using that term.
I don't know. What do you think, Jason?
We could call it modern C++20.
That's true.
No, I don't know because there's a problem because you can be using C++20,
but still be using C++98 techniques when there's better idioms and better options available.
So how do you qualify that you're using the latest and greatest best practices and idioms that are available?
I was just running through some of these things in my book about concepts and how much they can make the code better. But most people don't have access to concepts yet. So I don't know. What do we call that when it's time to say, yes, modern C++ now includes concepts? I don't know.
I don't know either.
Okay. Well, we'd love to hear your thoughts about the show. You can always reach out to us on Facebook, Twitter, or email us at feedback@cppcast.com. And don't forget to leave us a review on iTunes or subscribe on YouTube.
Joining us today is Oleg Rabaev. Oleg is a C++ developer who cares deeply about proper software design, clean code, and testing. He is experienced in building large-scale financial trading and risk management systems, in-memory databases, and reusable libraries.
Oleg, welcome to the show.
Thank you. How are you doing, guys?
Good. I think we have to congratulate you on possibly the shortest bio that Rob has ever had to read.
All right, then.
I appreciate the to-the-point terseness here about the things that you really care about as a developer.
I like things simple and short. This was not always the case,
but as you evolve through your career,
you learn to do things simple.
Simplicity is good.
The older you get, the more you ask the question of,
how little code can I write to solve this problem?
Absolutely. There you go.
Oh, my God.
Okay. Well, Oleg, we've got a couple news articles to discuss.
Feel free to comment on any of these, and we'll start talking more about what you're up to, okay?
Okay.
So this first one is the 2020-08 ISO mailing is now available.
It's a fairly short one, but I guess they're putting this out every month, so there's only so much that makes it per month.
Any highlights for you here, Jason?
Not particularly, but more questions than highlights from my perspective. These mailings are coming out regularly, right? And the committees are still meeting remotely.
Yeah.
But I don't think I've heard of a single paper that was accepted for C++23 yet. Have they been accepted? Is there some list of these things? How do we deal with the fact that they are not having the meetings where people vote on things and it officially moves forward?
I think, correct me if I'm wrong, but I thought we talked to someone who said that they're basically voting on these things virtually, kind of giving a tentative thumbs up or thumbs down, with the plan being that these things that we are virtually approving now, we want to actually vote on in person when we're able to.
Okay.
Does that sound right?
Maybe. It's possible you made it up, but it doesn't sound false. Oleg, do you remember?
Do you know what's going on here?
No, but I like the fact that they have a proposal for Stacktrace and the transactional memory.
That looks very interesting to me personally.
Okay, so based on your bio, it does sound like transactional memory would be interesting to you.
Transactional memory is one of these things.
It was the hot new topic in the early 90s,
and then people stopped caring about it,
and there's been proposals in and out of the C++ standard.
What are your thoughts on transactional memory?
I mean, if you have a transactional memory,
they really simplify the multithreaded algorithms, right?
Correct.
Multithreaded code is extremely difficult, right?
Especially with modern processors,
there are some very optimized algorithms
like lock-free and wait-free, right?
They're very difficult to write.
And I think transactional memory would facilitate in that.
So basically, the way I look at it
is that you make a change to a bunch of memory, and
then someone else will only see the changes when you commit the transaction.
Now, how syntactically and semantically it's going to look like in code, I haven't actually
looked at the paper, right?
But this is something that's very interesting.
All your changes are treated as atomic, right?
So just like atomic variables in C++11 and on.
So this is something very interesting.
I would definitely like to play with it once there is something stable or experimental.
Do you know, can we think of this like database transactions,
like begin transaction, do some stuff, and then commit?
Absolutely.
For example, if you're writing a concurrent binary search tree, concurrent red-black or AVL tree or concurrent hash table, it would be very helpful there, I think.
There are some hash tables, for example, implemented using lock-free algorithms, and those are extremely difficult to debug and design.
It's like a black magic on steroids, like quite literally.
So I think it would really be helpful in that sense.
And stack trace, you know, C#, I think Java, they always had it.
It would be nice to have something like that in C++.
Not from the core, but actually be able to, you know,
something that allows you to look at your stack much easier than it is now.
Yeah, it looks like that proposal is based on the Boost.Stacktrace library.
Right. I haven't looked. I haven't looked.
I just scanned the headings and I like those two.
I feel like this is one of these things where, you know,
and I'm probably exaggerating here, that it's going to face problems because, you know, like the standard doesn't acknowledge that there actually is a stack or something like that, right?
Like, because this is the kind of thing that tends to happen.
Yeah.
But it's already on its, wait, where did it go? I just clicked on it.
It is in LWG, so it made it past Library Evolution.
Is that what you were going to say?
Yeah, oh, and it's on its sixth revision.
So it's definitely moving forward.
So I guess I should be more optimistic than that.
Right, right.
Okay, next thing we have is an article from the Visual C++ blog, and this is "Introducing vcperf /timetrace for C++ build time analysis." We had Kevin Cadieux and Helena on a while ago talking about this feature, and this is actually a guest post from a game developer from MercurySteam Entertainment. I don't think I'm familiar with that game development company. But he used the C++ Build Insights SDK that we talked about with Kevin and actually wound up submitting a new feature to it. And it's to output this graph that you can actually view in Chrome, which is pretty cool.
Yeah, that's effectively exactly how Clang's time trace also works.
Okay. And I thought it was interesting how they found some issues in their own code using this. One of the things they mentioned is a particular default constructor, which was defined by the compiler, that was taking 3% of their total build time. So they just went and put in that constructor themselves
and saved 3% on their build.
It's amazing.
Wow.
Yeah.
But sadly, it's only available for Visual Studio, right?
Yes, this is for Visual Studio.
Yeah.
The good old days.
Well, Clang does have similar functionality as well. I forget what its parameter's called now. /analyze, perhaps? But yeah, something else for our listeners to look into if they're interested.
Okay, and then Jason, did you want to share this last one? This is a post on the cpp subreddit about...
Yeah, destructors. I just thought that's interesting, because the poster here asked, how come, between GCC and Clang, one of them's allowing this code to compile and the other one is not? And it was using a technique that I recently had on C++ Weekly for having multiple destructors in your class with C++20's concepts, because you can constrain your destructors.
So the destructor only exists if it matches the constraints.
And this person wanted it to be virtual or not virtual.
And I just thought it was fun.
This is like, you know, just because it's so easy to ask these questions today.
If you manage to formulate your question in a way that people can like read and provide
a Godbolt link to it and whatever.
And the top answer on Reddit is a quote from the standard,
a virtual function shall not have a trailing requires clause.
Like it's just right there,
but they got, like, the perfect answer just by asking the right question, well formed.
And it was timely with C++20.
So I liked the article.
It's not even an article.
It's like a three-minute read.
That's all.
Okay.
Okay, so Oleg, so you actually reached out to us,
asked about coming on the show after we recently talked about
one of those JetBrains state-of-the-community posts.
In there, they talked about which unit test libraries C++ developers are using.
Over a third of
respondents to that poll said they
don't write unit tests for C++.
What are your thoughts on that?
Complete and utter shock.
Did you cry after this?
I think if George Costanza was a C++ programmer, he would say the worlds
are colliding at this point.
I mean, I was completely shocked. Yeah, I mean, this is 2020. I can understand it
10 years ago, right? I think the unit
testing itself picked up at the beginning of the agile movement.
I wrote my first unit test probably in 2008 without even realizing that I'm writing a
unit test because I just wanted to write a small program that sort of tests a small piece
of code.
And I had, like, a main function where I would have a switch statement, commented out, but I had no idea it was a unit test,
but actually it was a unit test.
And then I was introduced to the Google Test framework, right?
And for me personally, it was a paradigm shift.
And since then, like since about 2010, I've been hooked.
So a unit test in general improves the quality of your code.
I think someone mentioned if you write a clean code, you don't have to write a unit test, right?
In one of the comments you mentioned on the show.
And I'm like, how can you write a clean code without a unit test?
So, I'll tell you the real story, right?
Go ahead.
So, I was working on a project.
It's actually an in-memory database, right?
So, I'll describe it in very generic terms.
Basically, it's multi-threaded.
It has a SQL compiler, basic data, composite indices, and alter table operations on a running database.
Replication, you know, the cluster should never go down.
It should be, like, 24-7 operational, stuff like that.
It processes tens of thousands of messages a second.
It has a lot of lock-free data structures.
So it's extremely complex system, right?
So we hired a developer into the team who was supposed to help out,
sort of support it and extend the functionality.
And in case, you know, just something happens to me personally,
I would say, right, someone can take over.
Like Jeff Bezos decides to adopt me, right?
Is that your retirement plan?
No.
I'll never retire.
I'll always say to my family, I will die coding.
So he joins the team, and I expect questions, right?
Like, I'm naturally very friendly.
He doesn't ask me a question on day one.
He doesn't ask me a question on day two, three.
A week goes by, two weeks go by.
I am starting to think the guy doesn't like me.
Then I turn to him.
It's like, why don't you ask me a single question?
It's a pretty complex system.
He says, Oleg, I don't need to.
Every time I want to find out how a particular component works, I just look at the test, and it's there.
I don't need to ask you a single question.
That particular system had several thousand unit, interaction, and integration
tests.
So that's sort of, this is my philosophy personally, and I try to, you know, preach it.
I, you know, I was not always like that.
I wrote a lot of spaghetti code at the beginning of my career, just like everyone else, let's be honest.
I assume I sometimes do the same thing, right?
And it's not up to me to judge, but I always try to improve my design skills.
And I feel that there has to be some kind of methodology that allows you to achieve that easily.
For example, we all agree it's much easier to program in C++ than in
Assembler, right? Usually, yes. I'm not sure about C++ 26, right? So, like, for example,
we have a static typing in C++, right? If you have an integer and you're trying to assign
string, std string to an integer variable, the compiler will not let you do it, right?
You can't just move memory like it will prevent you from doing that.
Things like that.
So it's built into the language where the compiler sort of is guiding you
and not allowing you to do certain things, which would be stupid, right?
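The static typing point can be checked by the compiler itself. The following is a minimal sketch, not something shown in the episode; `to_int` is a hypothetical helper added only for illustration.

```cpp
#include <string>
#include <type_traits>

// Compile-time checks: the compiler verifies each assertion during the build.
static_assert(!std::is_assignable<int&, std::string>::value,
              "a std::string cannot be assigned to an int");
static_assert(std::is_assignable<std::string&, const char*>::value,
              "a string literal can be assigned to a std::string");

// Hypothetical helper: getting from text to a number requires an explicit,
// deliberate conversion rather than a silent reinterpretation of memory.
int to_int(const std::string& text) {
    return std::stoi(text);  // throws std::invalid_argument on non-numeric input
}
```

If the first assertion were false, the build would fail, which is the "compiler as guide" idea carried over to tests in the discussion that follows.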
Now, with testing, I treat testing as my guide to a design.
Whenever I work on any piece of code, I always think, how am I going to test it?
And I don't always think about a particular library or service.
I always look at the system as a whole.
How am I going to test it?
So there are three simple rules that I try to follow. Number one, write small
functions, right? There is a lot of fancy jargon out there. But if you want to, let's say, promote
some sort of developer philosophy, you want to make it as simple as possible, right? Just like
the bio. So, guys, we should all write small functions.
The code should be modular.
And I think several guests,
I remember just recently,
several months ago,
it was a guest who was actually pretty funny,
and he mentioned about small functions.
This is just basic stuff, right?
The next thing would be no cycles.
So, when I joined Bloomberg, you
all know, I'm sure the C++ community knows, John Lakos, right?
Right. He's been on the show, yeah.
Yes, yes. And I was introduced to his, he has a lot of YouTube videos, right? And I
watched his stuff, you know, and some of his ideas are revolutionary, right? But one thing that I took away for myself is the hierarchical design and no cycles under any circumstances, with some exceptions.
I'll elaborate on that.
What does it mean?
Like if you have component A and component B depends on component A, there should not be cyclical dependencies.
Oh, okay.
Or if you have anything like component C depends on component B.
Component A should never depend on component C, right?
Because B depends on A.
You should not have cyclical dependencies in functions, in classes, in libraries, in services.
You should not be, in my opinion, there should not be cycles.
And I had code prior to that with cycles like, and once I started following that philosophy,
it actually improved the design and testability of the code.
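The "no cycles" rule can be expressed as a check over a dependency graph. This is an illustrative sketch only: the `DependencyGraph` alias and `has_cycle` function are hypothetical names, and the component names A, B, and C simply mirror the example in the conversation.

```cpp
#include <map>
#include <set>
#include <string>
#include <vector>

// Maps a component name to the components it depends on.
using DependencyGraph = std::map<std::string, std::vector<std::string>>;

namespace detail {
// Depth-first search; `in_progress` holds the components on the current path.
inline bool reaches_cycle(const DependencyGraph& graph, const std::string& node,
                          std::set<std::string>& in_progress,
                          std::set<std::string>& done) {
    if (done.count(node)) return false;                 // already proven cycle-free
    if (!in_progress.insert(node).second) return true;  // we are back on our own path
    auto it = graph.find(node);
    if (it != graph.end()) {
        for (const auto& dep : it->second) {
            if (reaches_cycle(graph, dep, in_progress, done)) return true;
        }
    }
    in_progress.erase(node);
    done.insert(node);
    return false;
}
}  // namespace detail

// Returns true if any chain of dependencies leads back to its starting component.
bool has_cycle(const DependencyGraph& graph) {
    std::set<std::string> in_progress, done;
    for (const auto& entry : graph) {
        if (detail::reaches_cycle(graph, entry.first, in_progress, done)) return true;
    }
    return false;
}
```

With B depending on A and C depending on B, `has_cycle` reports false; adding a dependency from A to C closes the loop and it reports true.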
So actually, John mentioned, they asked him, I think,
in one of the talks that could be on your show,
or maybe it's from some of the YouTube videos,
what do you think is the biggest benefit of modules?
So I think John Lakos said,
for me, it's almost impossible to create a cyclical dependency with modules, right?
Yeah, that might have been his interview on here.
It could be, yeah. So no cyclical dependencies.
And it's very easy to achieve. Just follow those rules, right? So small functions,
when you say small functions, then everything becomes small. You have small classes, smaller
libraries, smaller services.
You just break your components into smaller pieces and make them reusable.
No cycles.
And number three, tests.
These are the three rules that I personally follow.
Make your code testable and designed for testing, right?
So if you do that, even if you don't read any other books with 1000 definitions,
right, which a person can get very discouraged by, just looking at the stuff and feeling like: there
is this kind of test, there is an approval test, right, regression test, unit test, there is
behavior-driven development, there is this and this. Where do I start? Do I have to know all this in order to be productive?
So, I think
you should always start with a simple principle,
and it is like the 80-20 rule.
By following these principles,
you will cover 80%
of your design, in my humble opinion.
So,
I just give you an example.
Your tests become your documentation.
They express, you express your code through the test.
In fact, if it was up to me, let's say, you know, you guys use Compiler Explorer or Wandbox, right?
I like Wandbox, right?
But there is always a main function there, right?
To me, it shouldn't be a main function.
It should be a test fixture.
And instead of doing the cout and looking at the end,
I would rather use the assert macros or require macros, right?
This is my philosophy.
This is like if I started on something new and want to test something,
I don't write a main function.
I will have a test fixture, and that's where I will basically play with my code
and stuff like that, right?
Using the assert macros or expect or require,
depending on the framework you use,
instead of doing the couts and stuff like that.
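To make the "fixture plus assertions instead of main plus cout" idea concrete, here is a framework-free sketch. `SKETCH_REQUIRE`, `VectorFixture`, and `run_all_tests` are hypothetical stand-ins; a real project would more likely use the macros and fixtures of Google Test or Catch.

```cpp
#include <stdexcept>
#include <vector>

// Hypothetical stand-in for a framework's require/assert macro: it throws
// with the failing expression instead of printing values for a human to read.
#define SKETCH_REQUIRE(expr)                                      \
    do {                                                          \
        if (!(expr)) throw std::runtime_error("failed: " #expr);  \
    } while (false)

// Fixture: the shared starting state for every test case.
struct VectorFixture {
    std::vector<int> values{3, 1, 2};
};

void test_push_back_grows_by_one() {
    VectorFixture f;
    f.values.push_back(4);
    SKETCH_REQUIRE(f.values.size() == 4);
    SKETCH_REQUIRE(f.values.back() == 4);
}

void test_clear_empties_the_vector() {
    VectorFixture f;
    f.values.clear();
    SKETCH_REQUIRE(f.values.empty());
}

// Runs every case and returns the number of failures.
int run_all_tests() {
    int failures = 0;
    void (*cases[])() = {test_push_back_grows_by_one, test_clear_empties_the_vector};
    for (auto test_case : cases) {
        try {
            test_case();
        } catch (const std::runtime_error&) {
            ++failures;
        }
    }
    return failures;
}
```

The experiment records its own expectation, so it keeps checking the behavior long after the original author has stopped reading the output.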
Okay, so you mentioned like a ton of terms just now.
Yes.
So it kind of sounds like you do
what most people would call test-driven development.
Is that fair?
Yes, but not religiously.
I will explain why. I don't write my test before
I write my code.
So that is typically what test-driven
development would be, right?
Well, again,
it's one of those arguments that
a couple of camps, it's like, what's better,
VI or Emacs, right?
And some people do it this way, some don't.
I don't, personally, because it hinders my design.
I mean, it's just more difficult
for me like that
because then you start losing...
I don't want to criticize that.
So I don't want to like...
But basically, I don't do it.
But I will write a test
for every public function of my class.
Okay, we're C++ programmers.
Let's talk about public functions, right?
That's the only thing you can really test.
I will write a test, at least one test, for every public function.
Okay.
There will be more tests because you want to test function in, you know,
fulfill the contract of the function, right,
and then test the exceptions and the corner cases, stuff like that.
I don't test undefined behavior.
That I would not test.
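A sketch of what "at least one test per public function, then exceptions and corner cases" might look like. `parse_port` and its contract are hypothetical, invented here for illustration; plain `assert` stands in for a framework's macros.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical function under test. Contract: returns the port number for a
// decimal string in [1, 65535]; throws std::invalid_argument otherwise.
int parse_port(const std::string& text) {
    if (text.empty() || text.size() > 5) throw std::invalid_argument("bad length");
    int value = 0;
    for (char c : text) {
        if (c < '0' || c > '9') throw std::invalid_argument("not a digit");
        value = value * 10 + (c - '0');
    }
    if (value < 1 || value > 65535) throw std::invalid_argument("out of range");
    return value;
}

// Helper: true when parse_port rejects the input with the documented exception.
bool rejects(const std::string& text) {
    try {
        parse_port(text);
    } catch (const std::invalid_argument&) {
        return true;
    }
    return false;
}

void test_parse_port() {
    assert(parse_port("8080") == 8080);    // the main contract
    assert(parse_port("1") == 1);          // corner case: lower bound
    assert(parse_port("65535") == 65535);  // corner case: upper bound
    assert(rejects(""));                   // documented failure modes
    assert(rejects("0"));
    assert(rejects("65536"));
    assert(rejects("80a"));
}
```

Every input here has behavior defined by the contract; nothing in the test relies on undefined behavior.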
Okay.
So, it does sound like, you know,
when you see that people do things like #define private public
to be able to make their class testable.
No, no, no.
It's a no-no for me.
I don't use that
personally. Well, if
you do stuff like that, then maybe
your class is too monolithic and you
should break it up into smaller classes.
If you have too many private
functions that in themselves
provide some
testable functionality,
maybe you should make them
part of another component and use that component
as a composition inside the class, right,
as a client. Right. But I
would never do this. I don't
like to pollute my code with things like that, no.
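One way to read the advice above: move the private logic into its own small component and use it by composition. The `Checksum` and `MessageWriter` classes below are hypothetical examples, and the expected values assume an ASCII execution character set.

```cpp
#include <cassert>
#include <string>

// Extracted component: formerly a private helper, now testable on its own
// through its public interface.
class Checksum {
public:
    // Sums the byte values modulo 256.
    unsigned compute(const std::string& data) const {
        unsigned sum = 0;
        for (unsigned char byte : data) sum = (sum + byte) % 256;
        return sum;
    }
};

// Hypothetical client class: uses the component through composition.
class MessageWriter {
public:
    // Frames a payload as "<payload>|<checksum>".
    std::string frame(const std::string& payload) const {
        return payload + "|" + std::to_string(checksum_.compute(payload));
    }

private:
    Checksum checksum_;
};

void test_checksum_and_writer() {
    assert(Checksum().compute("") == 0);
    assert(Checksum().compute("A") == 65);         // 'A' is 65 in ASCII
    assert(MessageWriter().frame("A") == "A|65");  // tested via public API only
}
```

No access specifier has to be bypassed: both classes are exercised only through what they expose publicly.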
Yeah, so for the sake of our listeners,
that is possible technically
with most compilers, that you can do
that and then all of the private functions
in a class become public.
And yeah, but it's not a good idea usually.
I don't advise it, yeah.
I tried it once like a while back, but no.
You tried it.
So you said you don't want to get too caught up in terminology,
but I do kind of wanted to dig in a little bit.
It does sound like you do advocate specifically for what most people would call unit testing,
individual functional testing.
I advocate way more.
So if you want, I will elaborate on a different type of test that I personally write.
Okay.
So I think everything is a unit or integration test.
Everything else is just the derivative of it, right?
If you really think about it, and I'll explain what the unit test is in my humble opinion and the integration test.
And there is another thing.
Maybe it's a concept that I heard in one of the lectures, I believe, somewhere on YouTube.
And I like that concept, so I'll talk about it: the interaction test.
So what is the unit test, right?
Yeah.
I've never heard that term before.
So now is the chance.
What is the unit test, right?
So the unit test is basically, it's just a small program with minimal setup code, right?
A small program with minimal setup code, which exercises some narrow part of your public function of a class.
That's it.
Now, what is the interaction test?
Interaction test is a unit test with more complex setup.
Now, what does that mean?
As you build your component,
let's say, for example, if you're testing vector or a string class, right,
you will only write unit tests
because vector in itself is more like a vocabulary type.
It doesn't interact with any other components.
But as you build your class hierarchy, right, the testing becomes a little bit more complex, correct?
Because you have to do more things just to set up the code in order to test your higher level components.
So those kind of tests I call interaction tests.
And then there are integration tests.
Integration test is either unit test or interaction test with external dependency.
That's all it is.
Meaning, if you are, if let's say you write database access,
and you have function select by parameter key,
you have an external dependency on the database, right?
Let's say it's a SQL Server database.
So in order to test your code, you have to have an instance of the database running.
So you have the dependency, and that makes it a little bit harder to test, right?
Integration testing is harder, because what if it's a more complex dependency,
like you use some kind of message queue mechanism which requires setup,
or you talk to your service over TCP/IP connections, right?
That's the integration test.
Now, the community tends to merge interaction tests and integration tests into one concept.
I split it, just to make interaction tests like integration tests, but without external dependencies.
And my rule is I always try to achieve unit and interaction tests and remove external dependencies.
And there are ways of doing that, but you have to design your system to start with to do that.
And I can elaborate on that. For example, in the story that I told you
at the beginning about this large-scale
in-memory database, one of the things I was
able to test without any kind of dependency
is communication protocol, replication.
If one goes up, another one goes down,
you know, things starting, you know, exchanging information
and everything continues working.
How do I do that without having any kind of external dependencies
on file or TCP or anything like that?
Great question, yeah.
The way you do it is the following way.
There is a technique called, you know,
it's a widely used technique in
Java, I think: dependency injection, right? So what you have, let's say you have a component A
that talks to component B using component C, and component C is a TCP/IP interface, right? Let's say
we're writing a ping server and a ping client.
So the ping client code just says, you know, I'm sending something using B to C, right?
C is the server, A is the client, B is the middleware.
So how do you test that?
Okay, you have to start your server, right?
That will listen and bind in the socket, and then the client will open the socket connection, et cetera.
Do you do it like that?
No.
What you do is, since you are communicating between the client and the server using the
middleware, implement that middleware as an abstract interface, which has a function send,
for example, right? And that abstract interface can be plugged in,
injected into the client code,
and provide in-memory implementation.
Instead of sending data over physical network,
send the data over memory,
just dump it into the vector of bytes, right?
And do the same thing for the server side.
So during your setup code, you basically replace that interface,
the actual network layer with your own implementation, right?
Which is in-memory implementation.
And you can test your system end-to-end without external dependencies.
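The approach described here, an abstract `send` interface with an in-memory implementation swapped in during setup, might look roughly like this. All class names (`Transport`, `InMemoryTransport`, `PingClient`) and the "PING" message are assumptions made for the sketch, not code from the system discussed.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// The middleware as an abstract interface; production code would implement
// it on top of a real socket.
class Transport {
public:
    virtual ~Transport() = default;
    virtual void send(const std::vector<std::uint8_t>& bytes) = 0;
};

// In-memory implementation for tests: appends everything to a vector of bytes.
class InMemoryTransport : public Transport {
public:
    void send(const std::vector<std::uint8_t>& bytes) override {
        sent.insert(sent.end(), bytes.begin(), bytes.end());
    }
    std::vector<std::uint8_t> sent;
};

// Hypothetical ping client: the transport is injected, not created internally.
class PingClient {
public:
    explicit PingClient(Transport& transport) : transport_(transport) {}
    void ping() {
        const std::string message = "PING";
        transport_.send(std::vector<std::uint8_t>(message.begin(), message.end()));
    }

private:
    Transport& transport_;
};

void test_ping_writes_expected_bytes() {
    InMemoryTransport wire;
    PingClient client(wire);
    client.ping();
    assert(std::string(wire.sent.begin(), wire.sent.end()) == "PING");
}
```

Because the client only sees the `Transport` interface, the same test setup could hand the recorded bytes to a server-side component, exercising both ends without opening a socket.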
I'll give you another example.
For example, let's talk about, let's say there is a MySQL or SQL Server database, right?
You cannot test it unless you start up the service, right?
Right.
But you could do this with SQLite, correct?
I think SQLite is something that you can bring up in the process, right?
You don't need the actual service instance running, right? So if the SQL Server developers or MySQL developers
thought ahead a little bit and thought,
like, how are we going to make our system much easier to test
they would provide the ability to start some kind of light database engine
on the client without any kind of external dependencies
that works completely in memory, right,
with a simple interface, and you can test your code this way.
In fact, recently I looked into one of the graph databases.
I think it's Neo4j, and they actually did that.
They said, okay, guys, if you want to unit test,
you don't have to run the database engine; there is an in-memory implementation.
So when I write my code, you don't always have this chance because sometimes you depend on other systems, right?
Or on other open source projects, which simply don't.
But if you're working on an infrastructure component, you should always design it for clients so they can test it easily without external dependencies.
You will start thinking differently, right? You'll say, oh, you know what?
I have, let's say, optimization engine,
which I want to separate
so that I could run my stuff
in a lighter mode.
Oh, it actually leads
to much better design.
So this is,
you do want to turn your integration tests
into unit tests.
That's the goal.
It's not always possible, but that's
what we want to do.
I want to interrupt the discussion for just a moment to bring you a
word from our sponsors. Clang Power Tools is the open source Visual Studio extension on a mission
to bring LLVM tools like Clang++, Clang-Tidy, and Clang-Format to C++ developers on Windows.
Increase your productivity and modernize your C++ code with automatic code transformations
powered by Clang-Tidy. Find subtle latent bugs by using LLVM Static Analyzer and C++ Core Guidelines
checks. Don't fiddle with command line switches or learn hundreds of compiler flags. Clang Power
Tools comes with a powerful user interface to help you manage the complexity directly from the
comfort of your IDE. Never miss a potential issue by integrating Clang Power Tools into your
workflows with a configurable PowerShell script for CI/CD automation.
Start experimenting on your code today. Check it out at clangpowertools.com.
So you talked about these three types of testing that you focus on.
Do you assign weight or importance differently on them?
Like how many unit tests do you have compared to the number of integration or interaction testing?
That's actually a great question.
Unit tests are usually applicable to
lower level components, right?
And you do want to test
them in and out, like
you want to do it to extreme.
You want to have complete coverage.
Now, as you go higher level, it becomes
almost impossible, because of the amount
of interactions, right? You have so many components in the pipeline, right?
It becomes extremely difficult, right?
So your number of tests will go down, but you do want to have tests that will sanitize your code, so to speak, right?
I mean, in this case, like approval testing, right?
You can generate a bunch of, let's say, input files, which become a golden copy,
right?
And then whenever you change your code, right, to make sure there are no bugs,
you can run through the same regression, right?
And hopefully nothing is broken.
Everything will get passed, right?
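A rough sketch of the golden-copy idea, kept in memory for brevity. `render_report`, `kGoldenReport`, and `matches_golden` are hypothetical names; in practice the approved output would usually live in a file next to the test.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical high-level function whose output we want to pin down.
std::string render_report(const std::map<std::string, int>& totals) {
    std::string out;
    for (const auto& entry : totals) {
        out += entry.first + "=" + std::to_string(entry.second) + "\n";
    }
    return out;
}

// Golden copy: output captured from a known-good run and reviewed by a human.
const char* const kGoldenReport = "apples=3\npears=5\n";

// Returns true when the current output still matches the approved output.
bool matches_golden(const std::string& actual, const std::string& golden) {
    return actual == golden;
}

void test_report_matches_golden_copy() {
    const std::map<std::string, int> input{{"pears", 5}, {"apples", 3}};
    assert(matches_golden(render_report(input), kGoldenReport));
}
```

A test like this does not explain why the output is right; it only flags that something changed, which is why it complements rather than replaces the lower-level unit tests.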
But it becomes very difficult.
It becomes very difficult to test as you go higher level.
So more tests on the unit side,
there will naturally be fewer tests on the interaction and integration side,
but they have to be there.
So I think 80-20 rule works here.
So there will be 80% of tests will be on the unit side
and 20% on the integration side.
Okay.
So I'm thinking of a couple of different things here.
I've known some people that say that they don't really do unit tests.
They only do behavior tests because they want to be able to swap out the implementation
as long as they still get the same behavior from the system.
Now, I was not thinking about that when I was working on my scripting engine, ChaiScript, years ago, which, just for the record, I haven't maintained in a long time now. But coincidentally,
we have almost no tests in there that test the actual C++, like, I parsed this and I
expect this AST back. Instead, we just wrote things that only test the scripting engine.
Like if I put the 5 in, do I get an integer 5 back out?
If I do 5 plus 4, do I get 9 back out of the system?
Which gave us very good code coverage ultimately
and let us know if any of the behavior broke in the system.
But I'm curious about your opinion.
Would you say that I did a bad job testing the system?
I think you did more interaction tests than unit
tests, all right? You do want to test every component. And I think behavior-driven testing is
applicable to unit tests or interaction tests. It's just sort of like syntactic sugar, right,
depending on the framework you use, with, like, given a certain input, when, you know, you do something.
I think Catch has support for that, right?
GTest does not.
But I think what you did, you did more interaction tests.
And I see it all over the place.
You did a good job because you do have coverage,
but I don't think you have enough unit tests
because you do want to test your parser.
You do want to test your lexer, right?
You do want to test every low-level component.
Right.
For example, if you write a string class,
are you going to test your string class from your client code by using it,
or are you going to write unit tests for the string class?
You should probably write the unit tests for the string class, right?
That's a good point, yeah.
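As a rough illustration of testing a string class directly rather than through client code, here is a minimal sketch. `SmallString` is a hypothetical fixed-capacity class made up for this example; it is not from ChaiScript or any library mentioned in the episode.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical fixed-capacity string, standing in for "your string class".
class SmallString {
public:
    static constexpr std::size_t capacity = 15;

    SmallString() { data_[0] = '\0'; }

    // Appends as much of `text` as fits; returns the number of characters copied.
    std::size_t append(const char* text) {
        const std::size_t room = capacity - size_;
        const std::size_t wanted = std::strlen(text);
        const std::size_t count = wanted < room ? wanted : room;
        std::memcpy(data_ + size_, text, count);
        size_ += count;
        data_[size_] = '\0';
        return count;
    }

    std::size_t size() const { return size_; }
    const char* c_str() const { return data_; }

private:
    char data_[capacity + 1];
    std::size_t size_ = 0;
};

// Unit tests exercise the class directly, including its corner cases.
void test_default_is_empty() {
    SmallString s;
    assert(s.size() == 0);
    assert(std::strcmp(s.c_str(), "") == 0);
}

void test_append_truncates_at_capacity() {
    SmallString s;
    assert(s.append("0123456789") == 10);
    assert(s.append("abcdefghij") == 5);  // only 5 characters of room remain
    assert(s.size() == 15);
    assert(std::strcmp(s.c_str(), "0123456789abcde") == 0);
}
```

The truncation case is the kind of corner case that client code rarely hits by accident, which is the argument for testing the low-level component on its own.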
Yeah, so I think you have more interaction tests.
I don't think you have any external dependencies there, right, because it's a scripting language.
Yeah, I know.
So, but possibly you don't have enough unit tests.
Yeah.
And I think one of the most important tests is the unit test, because once you test your low-level components, if you have good coverage,
you just rely on them working.
You never test your low-level components in terms of high-level components.
You just rely on them working.
That's it.
And obviously, you need the end-to-end tests,
like you provide an input 5 and it returns something else, right?
Yeah.
Those are also irreplaceable. Sometimes
people will go wild
with unit tests and they will shun
the integration tests. There is
another extreme like that. They will mock
everything, right? I think
mocking is useful to some extent
to break the dependency, right?
If you don't control
the infrastructure, let's say, and you're talking to a particular
service by giving some, I don't know, JSON input, and you get some JSON output, right? So you don't want
to start up that service and the whole infrastructure. So you mock it out. Usually what
you do, you have an abstract interface class with some methods, you know, like get,
request some data, and it returns you some kind of response.
So you mock it. In the real implementation,
it will go into that service and will execute code there, right?
But in your mocked implementation, you're just going to basically,
whenever you call get some data, return a hard-coded response
or whatever other means you use.
So that you break the dependency and you can unit test your components
that interact with other components
over external dependency, right?
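The abstract-interface approach described here might look roughly like the following. All names (`DataService`, `CannedDataService`, `PriceReader`) and the JSON shapes are assumptions made up for the sketch; the real implementation that talks to the service is deliberately left out.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical abstract interface in front of an external JSON service.
class DataService {
public:
    virtual ~DataService() = default;
    virtual std::string request(const std::string& json_request) = 0;
};

// Mocked implementation: returns a hard-coded response and counts calls,
// instead of starting the service and the whole infrastructure.
class CannedDataService : public DataService {
public:
    explicit CannedDataService(std::string response)
        : response_(std::move(response)) {}

    std::string request(const std::string&) override {
        ++calls;
        return response_;
    }

    int calls = 0;

private:
    std::string response_;
};

// Hypothetical component under test; it only knows the interface.
class PriceReader {
public:
    explicit PriceReader(DataService& service) : service_(service) {}

    // Returns true if the service's reply contains a "price" field.
    bool has_price(const std::string& symbol) {
        const std::string reply =
            service_.request("{\"symbol\":\"" + symbol + "\"}");
        return reply.find("\"price\"") != std::string::npos;
    }

private:
    DataService& service_;
};

void test_has_price_with_canned_reply() {
    CannedDataService service("{\"price\": 10}");
    PriceReader reader(service);
    assert(reader.has_price("ABC"));
    assert(service.calls == 1);
}
```

Because `PriceReader` depends only on `DataService`, production code can pass in an implementation that really calls the service without the component changing.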
But some people take it to the extreme.
They will mock everything, right?
And I don't think it's useful, because you could do that, but you will have to write
a tremendous amount of what I think could be unnecessary tests, because at the end
of the day you do want to do integration testing, right? And integration tests
usually will catch a lot of errors that one missed during the mocking,
because then you want to test the whole system.
Going back to my database example, right?
I had only one, but I did not really have mocking.
I mean, I did have mocking there, but it was very limited.
The way I achieved that, you can also say it's mocking.
I replaced all the communication with the abstract interface so that I can test everything
in memory, right? But I would
not mock every
component interaction between
any other component. No.
I would just use the other components as
they are because they execute...
If you have a string and a vector
and you build, let's say,
some other class which includes the string and the vector,
you don't mock that class.
And there is another class which includes that class.
Like, no, you just use those classes like you did with your scripting, right?
At the end of the day, you don't mock.
You have your parser, you have your lexer, right?
You have something else going there
and you have a client interface.
You provide an input, you get an output.
I'm a fan of that.
You should really test each component with unit tests, fine.
But at least you have proper interaction tests, right?
So, in fact, you know, once I gave the presentation,
and, you know, I decided to make it a little controversial,
and I was trying to prove the point that when you have external dependency
and if you control the infrastructure,
it's much better, instead of mocking, to just provide in-memory implementations and a lot
of times just test the system as a whole. So my second slide would say, like, mocking is evil, right?
Which, obviously, was not the case. I was just trying to, you know,
just wake people up, right? Now they're paying attention, right?
So I think you should know when to mock and when not to mock.
But sometimes you have to do it.
Mocking is especially good when you test error conditions.
For example, if you are testing your database, even if you have your in-memory implementation, right,
how do you generate connection errors, right?
How do you generate that something bad happened?
It's difficult, right?
Because it's difficult.
So then you mock.
That's a perfect case for mocking.
Yeah.
Another thing I wanted to mention.
I would like to say I really like that point,
that mocking is particularly useful for the error case.
I find that very interesting.
Yes, yes.
For error case, it's the perfect case for mocking, right?
Right.
Only way to consistently guarantee you're going to get an error.
Yeah.
Exactly, exactly.
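To make the error-case point concrete, here is a minimal sketch of a mock whose only job is to fail every time. `Database`, `ConnectionError`, `FailingDatabase`, and `load_user_name` are hypothetical names chosen for illustration, not part of any real database client.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical error type for a lost database connection.
struct ConnectionError : std::runtime_error {
    using std::runtime_error::runtime_error;
};

// Hypothetical database client interface.
class Database {
public:
    virtual ~Database() = default;
    virtual std::string query(const std::string& sql) = 0;
};

// Mock that fails on every call, so the error path is hit deterministically.
class FailingDatabase : public Database {
public:
    std::string query(const std::string&) override {
        throw ConnectionError("connection lost");
    }
};

// Code under test: falls back to a default when the database is unreachable.
std::string load_user_name(Database& db, const std::string& fallback) {
    try {
        return db.query("SELECT name FROM users WHERE id = 1");
    } catch (const ConnectionError&) {
        return fallback;
    }
}

void test_load_user_name_falls_back_on_connection_error() {
    FailingDatabase db;
    assert(load_user_name(db, "guest") == "guest");
}
```

A valid SQL statement sent to a healthy in-memory database would never produce this error, which is why a mock is the practical way to exercise the `catch` branch.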
But I would be against mocking every component in the system.
That would just be insanity.
And I've seen people do that.
Now, given what we just discussed, right, like I think you could see that by following these practices, right, just write modular code, don't create cycles, and test your code, it will not let you make bad design decisions.
It's like you become your own compiler, so to speak, right?
If you're just disciplined enough that you follow these very easy principles... Like,
writing small functions is a very
easy principle to follow.
And there are
books, you know, there is the single
responsibility principle, you
have to have the smallest number of parameters,
this, that. And, like, when you try
to explain something to someone,
you want to make it simple. The simpler
you make it, the more easily it will
be accepted. So, I think
writing small functions is a very easy concept.
Creating no cycles is also an easy concept.
So are tests. Now,
I have my other three
sort of
statements, which I want to run by you,
which also guide my
software design.
If a component is hard to test, it's not properly designed.
Okay.
Number one.
If a component is easy to test, it is a good indication that it's properly designed.
It doesn't mean you can't still have a mess inside the implementation,
but it's a good indication that it's properly designed.
Now, if a component is properly designed, it's easy to test.
So whenever I write a class, I'm asking myself these three questions.
In general, I ask myself, is it easy to test?
If it's easy to test, it's a good indication that it's properly designed.
And that's what design for testing means for me.
Now, I appreciate those guidelines sound good, but how do you – it sounds like a lot of what you're talking about, you're like, well, unit test, a lot, but not too much, mock some, but not all the things.
How do we start to develop this mindset, this intuition for what is good testable code?
Like, how do we, I mean, we have to practice, it sounds like.
Right, right, right.
The only way... it's a paradigm change.
For me, it was like, I find it difficult sometimes to bring it over the finish line, right?
But once you get
it, you cannot do things otherwise. So, unit tests: a lot. There is no end to the unit
tests that you can write. Now, interaction and integration tests: a reasonable amount, but
you are limited, because you could literally have, like, a factorial
explosion there, right? Because there are so many ways you can use the components.
Mocking: only when you need to test for error conditions,
or if there is a complex dependency, not an external dependency,
but even, like, within your project.
Sometimes you would do mocking just to separate another part of the system so you can test it easily.
But I wouldn't mock everything.
Absolutely not.
And integration tests, try to turn them into unit tests or
interaction tests by removing external dependencies.
And the way to do it is to use dependency injection.
Replace your communication protocols, or protocols that talk to file systems or message queues or
databases, with in-memory implementations.
It is sort of like mocking, except that the actual code is executed.
It's just whatever would be executed in the server side is actually executed in your own
process, basically, right?
So you want to turn all your tests into in-process tests.
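A minimal sketch of that dependency-injection idea follows, assuming a hypothetical `Transport` interface and a toy `PING`/`PONG` protocol invented for the example. The point it tries to show is that the in-memory transport still runs the real server-side handler, only inside the test process.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <utility>

// Hypothetical transport abstraction; a production version would wrap TCP.
class Transport {
public:
    virtual ~Transport() = default;
    virtual std::string send(const std::string& message) = 0;
};

// Server-side logic. The same function would run behind the real transport.
std::string handle_request(const std::string& message) {
    return message == "PING" ? "PONG" : "ERROR unknown command";
}

// In-memory transport: calls the server handler directly in this process,
// so the actual server code runs but no network or OS setup is needed.
class InMemoryTransport : public Transport {
public:
    explicit InMemoryTransport(
        std::function<std::string(const std::string&)> handler)
        : handler_(std::move(handler)) {}

    std::string send(const std::string& message) override {
        return handler_(message);
    }

private:
    std::function<std::string(const std::string&)> handler_;
};

// Client under test; the transport is injected through the constructor.
class Client {
public:
    explicit Client(Transport& transport) : transport_(transport) {}
    bool ping() { return transport_.send("PING") == "PONG"; }

private:
    Transport& transport_;
};

void test_client_ping_end_to_end_in_process() {
    InMemoryTransport transport(handle_request);
    Client client(transport);
    assert(client.ping());
}
```

Unlike the hard-coded mock, nothing in this test fakes the server's answer; swapping `InMemoryTransport` for a socket-based `Transport` would turn the same test into a true integration test.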
Now, the way people
achieve this these days,
I mean,
they don't really achieve that,
but they use Docker
for the integration test
so they have the environment
because the setup
is very difficult, right?
So they have the environment,
they make sure
all the dependencies
are properly set up,
they start up the container,
right,
and they run
the integration tests
like that,
which is great,
but it's not pure in the sense that you cannot –
I tend to write my infrastructure code so that I can test it in process.
Yeah, it almost sounds like you would think of that as cheating.
I didn't want to say that, but it's a good cheating, right?
I would rather do that than not do integration tests
because to set up the proper environment is difficult, right?
Especially if you – I mean, it could be difficult, let's say it this way, right?
It is – I don't want to call it cheating.
I think it's a good approach.
It's a good compromise.
Well, I've had a couple of students say, oh, well, you know, I've got this class and I want to test it.
I need to mock it. So I made all the member functions virtual so that I can derive from it and do the mocking kind of interface with it.
And I respond to them and I say, I am not an expert in testing methodologies, but that doesn't feel right to me.
If that's the only reason you have virtual functions.
Absolutely right, Jason.
This is what I call mocks are evil.
I mean, no, quote, unquote.
Yes.
I would not do that.
No.
I would do that when I need to, when my class, let's say, returns an error condition,
which is difficult to replicate by providing input.
For example, if you are passing a valid SQL statement to the database engine,
you're probably not going to get a connection error, right?
How do you simulate a connection error?
So in that sense, if you have, like, some client for the database and you want to simulate a connection error and see how your actual code will respond to it, mocking is great.
Mocking is also good when you have an external dependency, you don't control it, and you don't want to bring the whole environment up, right?
So you break your external dependency by mocking.
Okay, acceptable.
And also, even if you don't have external dependencies, but you have complex interactions, like really complex, which require a very complex setup between, like, two subsystems, you can mock the other subsystem. Okay, that's acceptable. But not every class, no.
Okay. The other thing that kind of, you know, concerns me,
for lack of a better word, when I think about mocking and dependency injection, is I go, well, it feels like now I'm testing something different.
Like, I'm not testing the actual system.
I'm testing the mocked system.
Yes.
But your arguments for being able to consistently generate errors,
and it all makes sense to me, but I'm just curious,
has that ever been a problem?
Have you ever gotten down the road and then go,
oh, no, my mock actually gave me a false sense of security somehow?
Well, for errors, no.
Okay.
And for dependency injection, I just want to, like, maybe slightly correct that. I'm not sure if I understood it correctly.
When you use dependency injection the way I described it, by replacing the actual protocol,
say TCP/IP, with an in-memory implementation, you are actually testing your whole system.
That's the whole point. Because you are executing the code
on the other side, which would be executed on the server side. You are doing
end-to-end. That's the whole point of it. You've just removed the operating
system and the internet out of it. You just removed the TCP communication.
And now, should you test with the actual TCP/IP component, right?
Like instead of doing dependency injection, should you actually test with the interface which implements?
Of course.
That's going to be the real integration test.
They are irreplaceable.
But they are a little slower to run, right?
You have to have an environment, right?
So you try to make your testing easier, design for testing.
It will drive your design, and it's going to create an extremely stable system.
I mean, look, I worked in a lot of different projects, right?
I speak mostly from pragmatic experience rather than from, like, academic experience.
So when you do things over and over again, right,
and it works in production and it's easy to maintain, and I just told you the real story,
right? I will not mention the company where it happened, but I told the real story.
You think, like, you must be doing something right. And then why not share it with the world?
Sure.
So my point is that in my previous experience,
yes, I did not write unit tests. I think, Jason, I want to ask you, when was the first time you
wrote a unit test? I'm actually interested, like, when was the actual first time you wrote the first
unit test?
Well, so I will say ChaiScript, that project was the first one that I worked on
that actually had tests. So I will say, you know, in my defense, but also...
No, I'm not putting you on the spot.
But also to condemn myself, in a sense, I didn't even write the first test.
My cousin who was working on the project with me is the one who wrote the first test.
He's like, we need to have tests.
I'm like, that sounds like a good idea.
So go ahead and do that testing thing, right?
Like, I wasn't used to this at all.
Yes, true.
I heard, I think
it was a panel between
Stroustrup, Rob Pike,
Alexandrescu, and someone mentioned
that unit tests were created for
untyped, interpreted types of
languages, because C++
gives you static typing.
That's true, maybe,
but I think unit tests are useful
across
the whole industry. And in fact,
I think they do drive your design.
Whatever you do,
think, is it easy to test?
If it's not easy to test, red flag.
Your component is not properly designed.
That's simple, right?
Like, you don't have to read, like,
2,000 pages to understand that.
You write a class.
Is it hard to test?
Yes.
Wrong design.
Easy to test?
Yes.
Probably did a good job.
Do you mind if I quote you in the testing section
of my C++ book?
Of course.
I will send you those principles,
you know, the three things.
Absolutely.
Okay. Go ahead, Jason.
I was just going to say, it's funny, because through the whole course of this,
I've got in the back of my mind some code that I wrote yesterday that I'm like,
this code is being used everywhere, so I know that it's tested,
and it is a strongly typed interface, but I should probably go ahead and write a unit test for it.
I wasn't thinking about it for that particular function.
So, yeah, I'm going to do that when we get off the call.
The rule is like that. Every public function or every free function, that's easy, right? Every
public function at least should have a test probably with the same name, at least one,
maybe more, depending on the corner cases and the complexities, right?
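As a small illustration of that naming rule, here is one free function with a test named after it plus extra tests for the corner cases. `clamp_percent` is a hypothetical function invented for the example.

```cpp
#include <cassert>

// Hypothetical free function: limits a value to the range 0..100.
int clamp_percent(int value) {
    if (value < 0) return 0;
    if (value > 100) return 100;
    return value;
}

// At least one test carries the same name as the function it covers.
void test_clamp_percent() {
    assert(clamp_percent(50) == 50);
}

// Additional tests cover the corner cases.
void test_clamp_percent_below_range() {
    assert(clamp_percent(-1) == 0);
}

void test_clamp_percent_above_range() {
    assert(clamp_percent(101) == 100);
    assert(clamp_percent(100) == 100);  // upper boundary is kept as-is
}
```

With this convention, a public function that has no test of the same name stands out immediately when scanning the test file.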
I feel like I'm seeing, you know, kind of like a complete picture from what I'm hearing from you, because like you said at the beginning, that people get caught up in these different terminologies.
And it sounds to me like you're in a way saying test at every level as much as you can.
Exactly.
But be pragmatic, but never be a perfectionist because you'll never get anything done.
Right?
Yeah.
Oh, by the way, as far as
I will give you a good example.
In my code,
probably 60% of my code
is tests.
Yes, yes, yes.
Most of the code that I write is tests.
Is that like 100% code coverage then?
Well,
I hope so, but you know, sometimes
when you use those tools that give you the coverage reports,
just because you have coverage doesn't mean you test every code path within your function.
But yes, it is close to that.
You will never be perfect.
It's impossible.
If you're a perfectionist, you will never deliver, but
you have to be pragmatic, and you have to be disciplined, right?
That's the point.
So if you want to look at how much testing code you should have,
probably 50% of your code should be tests.
That's my guide.
Someone will say it should be even more.
I will not argue, probably agree, but I'm at about 50% to 60%.
So looping back around to where we started with that poll, about 34%
of developers not using unit
tests. If some of those are among
the listeners
and they've been convinced they need
to do some unit testing, do you have any
things to point them to, to learn
a little bit more, to get started? Any
resources you might suggest?
Any resources?
There are plenty of books,
but I think
the best resources would be maybe
finding open source projects
and just good quality open source
projects and take a look at how they're
tested. Also,
YouTube videos, a bunch of YouTube
videos. There's just plenty of stuff.
For me,
it was real-life experience, right?
There is one particular book that I would recommend.
It's called Clean Code by Uncle Bob, I think.
It's been around for a long time.
Right.
Interestingly enough, when I read it,
I did not read the section about the unit tests.
So I read it a long time ago.
And then I'm like, I read it, it didn't talk about this. When I
looked back, oh wow. So he's advocating writing your tests before writing code, which is
fine. If you can do that, do that. But the main thing is: test your code, make your tests drive
your design. If you have difficulty thinking how you can design it... By the way, software
design is extremely difficult, right? To achieve
good quality code: small functions,
simple. No cycles,
and tests. If you do that,
you're 80% there.
The rest are details.
I just want to, if you don't mind,
mention again that you
gave some techniques for testing error
cases because in my experience now as a more
mature developer compared to the older
projects that I started on,
it seems to me that the error handling
routines are never tested.
Like when I look at code coverage
reports on some of the projects I'm involved in,
I even submitted a test at one point
on one project that had an error in it
intentionally because I needed to test the error handling
routine, and that pull request was denied.
We don't want code with errors in it in our unit test framework.
And I'm like, but we need to test that.
Yeah, you need to test that, absolutely.
Whether you return the error or you throw an exception,
the piece of code that, like, every code path
should be executed during your unit tests.
That's advisable.
And if it's difficult
to generate the errors, right, that's when
mocking comes in handy, right?
Right.
Sometimes you can generate errors simply from the input.
You provide invalid input, it's, let's say,
it's not undefined behavior, but there is a valid
error code or exception thrown.
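When the error can be triggered from the input alone, no mock is needed. A minimal sketch, assuming a hypothetical `parse_digit` function that reports invalid input with an exception:

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical function: accepts exactly one decimal digit,
// and throws std::invalid_argument for anything else.
int parse_digit(const std::string& text) {
    if (text.size() != 1 || text[0] < '0' || text[0] > '9') {
        throw std::invalid_argument("expected a single digit: " + text);
    }
    return text[0] - '0';
}

// Small helper so the error path can be checked with a plain assert.
bool throws_invalid_argument(const std::string& text) {
    try {
        parse_digit(text);
    } catch (const std::invalid_argument&) {
        return true;
    }
    return false;
}

// The error path is tested on purpose, next to the success path.
void test_parse_digit() {
    assert(parse_digit("5") == 5);
    assert(throws_invalid_argument("x"));
    assert(throws_invalid_argument("42"));
    assert(throws_invalid_argument(""));
}
```

Test frameworks usually ship their own exception assertions; the hand-written helper here just keeps the sketch free of dependencies.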
By the way, I'm a big fan of exceptions.
I hate error codes.
I have no problem with exceptions.
Yes.
Remember, like, I remember in 2016, Eric Niebler, right,
who spoke about ranges at the time,
and they're like, but you have to use exceptions.
And he said, like, but the amount of time, did you actually profile?
The amount of time you have your if statement,
which could potentially throw your compiler off or
the CPU executor
off, right? Probably
if you do exceptions, it's not going to
cause any kind of performance problems. You should
probably measure your code before you decide
whether you use exceptions or error codes.
Yeah.
Well, Oleg, it's been great having you
on the show today.
Where can listeners find you online?
Well, I'm not really social as far as Facebook and Twitter.
I have a LinkedIn profile, and my email, you can post it into the notes.
I have no problem with people reaching out directly.
And I feel very passionate about what I do. If you find, let's say, flaws
in my philosophy, please reach out. Prove it to me. Show me how it could be done better,
right? This is what I always, this is my pushback. Show me. But I do have multiple successful
projects behind me that are stable, that are working, that are in production, you know,
and that are maintainable.
And Jason, if you want to mention some of the stuff,
I can write you an email with those principles.
Just like, if a component is hard to test,
it's not properly designed.
I think that's specifically what I want to quote,
actually, that right there.
So I will do that.
All right, awesome.
Thank you.
Thank you, guys.
Thanks so much for listening in as we chat about C++.
We'd love to hear what you think of the podcast.
Please let us know if we're discussing the stuff you're interested in,
or if you have a suggestion for a topic, we'd love to hear about that too.
You can email all your thoughts to feedback at cppcast.com.
We'd also appreciate if you can like CppCast on Facebook and follow CppCast on Twitter. And of course, you can find all that info and the show notes on the podcast website at cppcast.com.
Theme music for this episode is provided by podcastthemes.com.