librarypunk - 058 - Seize the Means of Cataloging feat. Becky Yoose

Episode Date: June 30, 2022

This week we're diving into the OCLC lawsuit against Clarivate for its MetaDoor product, the implications of Roe being overturned, and ALAmageddon.  Becky Yoose (@yo_bj) / Twitter  Managing Data for... Patron Privacy: Comprehensive Strategies for Libraries | ALA Store  About Us - LDH Consulting Services    Media mentioned Privacy Guides  https://gavialib.com/2022/06/stoning-goliath/ Library Loon article.  Karen Coyle - https://kcoyle.blogspot.com/2022/06/the-oclc-v-clarivate-dilemma.html  OCLC's non-profit status « The Thingology Blog Outsourcing the catalog department: A meditation inspired by the business and library literature - ScienceDirect SkyRiver vs. OCLC antitrust lawsuit – librarian.net

Transcript
Starting point is 00:00:00 We're doing like musical chairs with like who's showing up. I think neat city both turned off our VPNs. Yeah. Okay. Although I need to yell at all the people doing bad opsec on Twitter about VPN. Yeah. I only have one of those. I have ExpressVPN just because they don't give me crap for torrenting.
Starting point is 00:00:23 Yeah, that's like the reason to have a VPN is if you're going to like torrent. My last VPN kicked me off. Yeah. They're based in the UK. Get Mullvad. It's like five bucks a month. Yeah, I have Mozilla and it's like five bucks a month. Five devices.
Starting point is 00:00:39 Mullvad is really good and they're not in a Five Eyes country. And you don't have to have an email address or any sort of identifying information. You can even like mail them money if you don't want it tied to a credit card. It's very fun. M-U-L-L-V-A-D. It's the one that, like, Privacy Guides, they used to be PrivacyTools.io, but there was a split that happened there. But I get a lot of my opsec info from Privacy Guides. That's how I learned how VPNs are kind of nonsense. Or if you're wanting to
Starting point is 00:01:12 watch UK Netflix or something. I just sound like crap all the time. That's how I watch this like Thai drama that I've been watching is I changed my VPN to Singapore. And it lets me watch it. I'm going to like cry. I got really good feedback from my salon I did. Yay. That's the best feeling. I was so nervous because Violet was in there, and I'm like, I'm going to embarrass myself in front of Violet, shit.
Starting point is 00:01:43 I'm going to show I don't know anything about metadata. It was fine, I think. This was the LogSec thing, right? No, this. So on Saturday, I did an Interintellect salon, and they do old style, like, in the style of, like, French salons where someone picks a topic and then they kind of like lead a discussion on it or you could do workshops, those sorts of stuff. And I did one on description and power
Starting point is 00:02:10 looking at metadata, not just as like library metadata, but like sort of introducing this topic to like people outside of librarianship. You know, metadata is describing information objects and everything is an information object, including us. So like what's on your driver's license is metadata about you. Let's talk about that. Like the power that might give people over you. Let's talk about how this can shape culture. Most of the people weren't library people and like really insightful shit there. Because like, you know, obviously I am very pro-us.
Starting point is 00:02:40 Like, you know, we do have authority and expertise in this. You know, we did go to school for it and all that. User-centered shouldn't necessarily mean that we don't use our expertise. However, wow, like, that doesn't mean we shouldn't be listening to like how people who aren't metadata librarians are talking about and interpreting metadata. They just like normally don't get presented it in a way that's like approachable or that makes sense to them unless they work in like business or software engineering where they might work with metadata, or they're data engineers. I put quotes around that because if you remove metadata from
Starting point is 00:03:21 our titles, we suddenly make over $100,000 a year. Right? This is the joke there. But yeah, it was like really cool. It was like two hours and it could have gone longer. I just saw part of your soul leave your body, Becky. Yeah. Whatever's left. Just left. At this point, I'm a husk.
Starting point is 00:03:44 So yeah, it was like really interesting to have like all these people from like various different backgrounds. There were a few of our library Twitter colleagues there. But all like talking about this and like bouncing off each other and getting like kind of philosophical and metaphysical with it. but also being very practical of like, but this helps us sometimes, like, how do we balance that? It was fucking cool. And I'm really proud of myself.
Starting point is 00:04:09 So it was fun. I learned a lot from it, actually. So 10 out of 10, would do it again. Oh, and before we begin. I was like, I hear a cat. You sounded auto-tuned from a distance. Do you have anything to say? Like, uh,
Starting point is 00:04:54 Yeah, it was sort of like, Meow. What's the cat's name? Her name is Sophia. Sophia, hi, Sophia. Sophie for short, because she likes the sofa. Excellent. So she will be adding.
Starting point is 00:05:12 Welcome to the podcast, Sophie. She will be adding her commentary as we go along. So I don't have a mute button for her. She doesn't have a raised paw button. So she waits for no man. She waits for no human. Maybe Arthur will show up and they can be friends. This podcast for your, for this episode, for this week's episode, it's mainly going to be cat sounds and just existential angst about souls leaving bodies after reminding ourselves that if we just take out certain words from our job titles, we're going to get paid a lot
Starting point is 00:05:53 more. I thought the Beach Boys already made that album. This is now a Pet Sounds podcast. I'm Justin. I'm a scholcomm librarian. My pronouns are he and him. I'm C. I work IT at a public library. My pronouns are they/them. I'm Jay. I'm a music library director and my pronouns are he/him. And we have a guest. Would you like to introduce yourself? Sorry, I forgot to mention that part. I am Becky. I am a library data privacy consultant and all-around troublemaker and troublesome cataloger. And my pronouns are she/her. Welcome.
Starting point is 00:06:45 So we've got a lot, and ALA happened, and a lot of stuff happened to ALA, that really there was just not enough time to cover it all. Like, boy, do we have segment material. I did get to make a drop out of it. So here we go. Nice. Beautiful. Perfect. So the one thing I just wanted to bring up, there was a lot of stuff I wanted to bring up,
Starting point is 00:07:11 but the main one was a panel put on by, I believe the ALA group is called United Against Book Bans, Unite Against Book Bans, which is an ALA project in which Nancy Pearl, the sort of rock star librarian of the group, said relatively upfront in the opening statement that even Holocaust denial books must be included in the collection, to which the other panelist, who is not a librarian, is an author named Jason Reynolds, just sort of had to go, uh, okay. As far as I know, I have not seen the full recording yet. So, um, you know, I'm working with limited information, but basically from what the original person who did the, the, who did decide to name the panelists said, I don't think he was on board with this, but couldn't back out. So one thing we talked about,
Starting point is 00:08:06 a while back was library Twitter pile-ons. And I said my concern there was one day they'll go after the wrong person. And that's what happened, which is a lot of people got really mad at Jason Reynolds instead of Nancy Pearl, who was the person who actually said the thing. To be fair, I did kind of follow a bunch of some threads. People were really mad at Nancy Pearl. She just never responded after her initial "this is what I meant," even though it
Starting point is 00:08:36 sounded like it was exactly what she said. Not actually clarifying statement, but there were a lot of comments on that being like, Nancy, no, or this is not right. And she just never responded. So instead, redirected to the person who was responding and was listening. And for context, Jason Reynolds is a black man and Nancy Pearl is a white woman. Also, a lot of people didn't know that Nancy Pearl was a librarian.
Starting point is 00:09:00 I saw on Twitter, people were like, who the hell is this? She's not even a librarian. I'm like, no, she's kind of very famously a librarian, and then that makes it worse. Yeah. Yeah. So for context, actually an update, Nancy did post what she called an apology on Twitter,
Starting point is 00:09:16 which was a non-apology. She was the, "I'm appalled if people thought I said include Holocaust denial literature." Well, she not only said that based on the reports from people who were live tweeting in the room, she also has a history of saying this. And there was a tweet pointing this out in an article,
Starting point is 00:09:35 like 2017, 2018, I believe after Charlottesville. She mentioned something akin to what she said at this panel. So she's got history. Yeah, I had to explain who Nancy Pearl is. Is like, do you know the shushing librarian action figure? Yeah, that's her. I also can't escape Nancy because Nancy was a former Seattle Public Library employee, so she's all over.
Starting point is 00:09:57 I was about to say she was a Seattle Public Library person for a while. You know, I do have the deluxe Nancy Pearl Library set that actually has the library backdrop and the plastic cart. I think I'm just going to keep the plastic cart right now, the plastic book cart. I think that's the most valuable thing in that kit. You might need to do some like Lego cataloging or something or like little miniatures or something. Why get rid of a book cart? Yeah.
Starting point is 00:10:24 Yeah. But I mean, it did force people to reckon with, well, a lot of things. It was incredibly hard to follow where everyone was taking the conversation, but a lot of how do you do callouts? Is there a better way of doing it that provides contextual information in the original callout? But I think the original poster was of the opinion that he agreed a lot. So I felt like I had to put it in that way, which, you know, I can't know that I wouldn't have fucked up in the same way. Yeah, because it's like, you know, there is a power difference there to some extent.
Starting point is 00:11:04 He might also agree, I don't know. I haven't watched it yet. So I doubt he did. So, yeah, that was ALAmageddon. We've got a lot to talk about, but Becky, you did want to bring up something quickly about privacy and the Roe overturning because of SB 8, and that's relevant to me because I'm in Texas, which focuses on vague definitions of aiding and abetting,
Starting point is 00:11:33 providing access. And of course, in Texas, this means any private citizen can bring lawsuit, which means they could be sued for providing information, which is also incredibly vague, because we probably have books about abortion providers somewhere. Yeah, and even just going to the library and doing a search, an internet search on a public computer, that, again, SB 8 and a lot of copycat laws that we're finding popping up throughout the country might interpret that as aiding and abetting seeking information. And so not many people know off the bat that privacy is not
Starting point is 00:12:15 a constitutional right. And the basis of why Roe being overturned is so scary as a privacy person, why library workers should be scared, is that what we find in the majority opinions and the concurring opinions, particularly with Thomas, that particular right to privacy that has been inferred by Roe and a whole slew of case law throughout the years through the Supreme Court has been invalidated. And so with Thomas's opinion in particular, he wants to go back to other court cases that have been decided based on that same interpretation of intimate privacy. So we have contraceptives, we have same-sex marriage. There's going to be a lot more down that line. And for example, Katz versus United States, which gives us the reasonable expectation of privacy in the public setting.
Starting point is 00:13:13 And a good portion of ALA intellectual freedom materials and privacy materials references that law as a way to build your legal case to provide privacy at the library. And so at this point, I don't know what else to do beyond screaming into the void, beyond just saying if you haven't already locked down your computers in terms of making sure nothing's logged, refresh the image every time there's a user session that ends, shore up your privacy and security practices. It's going to be a bumpy ride. Yep.
Starting point is 00:13:45 Some states do have stronger privacy laws that their state constitutions have been found to hold. So this is why Florida doesn't have an immediate abortion ban, but Florida can also amend its constitution pretty easily. So I did have to send out some emails to faculty I work with who do oral histories and say, you need to start changing your training to cover if people are talking about abortion or, and I included trans-affirming health care because in Texas, that's almost certainly going to happen immediately. There is a really good article that I just talked about in that
Starting point is 00:14:26 that one presentation I gave back in May, it's called testimonies, the cost of sharing their voice or something. And it's several examples of like archives and special collections doing the oral histories basically and like the legal and privacy ramifications of it and what they did in those cases. This is how I learned about the Belfast project. But in it is also a collection of oral histories around the time of when Roe v. Wade was first put into act and what this library did for like redacting information like Becky, I would love to see what you, if you ever read it, like what you take of what they did. But I thought it was very good because it leaves behind no data that could be collected for like legal searches or anything. Like they even like removed the
Starting point is 00:15:22 original cataloging records and whatnot for everything. They redacted everything about three times. They would pass it among the different staff members to make sure they hadn't missed things. I'll find the title of it. I first heard of it based off of a presentation I saw at DLF back in 2018 in Vegas. And then it got turned into a journal article. So let me find that real quick. Yeah, that gets into a greater conversation about library workers as information fiduciaries, making sure that we uphold our end of using our patron data to the patrons benefit and not handle it or use it in a way that ultimately harms them. Now, some library state laws put you at a better position to protect patron privacies, but others, not so much. So going back to Florida, the library confidentiality and privacy law gives explicit permission
Starting point is 00:16:25 for parents and guardians to access their minor child's library records up to a certain age. And so for bills that are targeting LGBTQIA+ materials, libraries will need to figure out ways to protect their minor patrons in the same, well, at the same time, having to uphold or deciding to uphold, they can decide not to uphold, how they're going to respond to that requirement. Yeah, that reminds me, I was actually just reading a Washington Post article just yesterday. Well, actually it was like four this morning because I couldn't sleep and this didn't help, of conservatives going after like gay straight alliance groups at schools and there was an instance here in Marysville, like in Washington and Marysville where a
Starting point is 00:17:18 safe space after-school club at an elementary school got shut down and canceled because of these. And it's all sort of the same thread. Like I have a right to know exactly what my kid is up to at all times sort of authoritarianism, which I mean, if it's happening in schools, it's going to happen in other libraries, because that means it's happening in a school library, which means it's going to filter out to public libraries. Things I lose sleep over and then lose more sleep over because I stay up reading things I shouldn't read at four in the morning. I know that ALA is working on getting some guidelines and just actions out and securing Wi-Fi securing computers and whatnot. And I started going down the
Starting point is 00:18:04 rabbit hole of, well, what happens? You know, you can secure your computers, making sure they're not tracking users. But if your security camera, which is being maintained and the footage being archived at the police department, is pointing towards those computers, you have a problem. Another problem is those mobile surveillance cameras that we see pop up in various neighborhoods, you know, suddenly from out of nowhere the next day they're there. So worst case scenario, they could be used to track people who are around the area of the physical library. So there's a lot of ways users can get tracked, even if the library hardens their security and privacy protections.
Starting point is 00:18:46 Yeah, the Crime Stoppers Towers is a lot of them where I live. And they just park them in like crazy places. Like in my apartment complex, they just put one half on the sidewalk, half in the middle of the road. And it was just there for like, I don't know, a month. But yeah, just right in the middle of everyone's houses. So they are very common. So we are going to talk about our main topic today, which is
Starting point is 00:19:13 the OCLC is suing Clarivate. Should get Law & Order or something. Everyone likes that better. My favorite law, like my favorite, was when someone did Tainted Love. It's like, now I know I've got you. And it's just the Law & Order, like, SVU. OCLC has filed suit against
Starting point is 00:19:43 Clarivate for their product called MetaDoor, which we'll get into how it works, but basically OCLC is saying those records are proprietary information and that their contracts with libraries mean that those libraries cannot hand over those records to MetaDoor because what MetaDoor does is takes those records from your ILS and sort of indexes them and allows other people to look through it and find out where, what your local holdings are, rather than a centralized database, which is how WorldCat works. That's the long and the short of it. Is this one of those like whoever wins, we lose situation? Yes. Because it still confuses me. OCLC sells themselves as a member cooperative. They are not a member cooperative. They are
Starting point is 00:20:34 a for-profit business that claims to be a nonprofit business, but in reality, they had to change the law so they can keep their nonprofit status. But no, the Library Loon wrote two very good articles dressing down OCLC's practices and essentially charging catalogers and metadata workers for their labor, which is essentially put stuff into a centralized database, bibliographic data, holdings, whatnot, and then sell it right back to the libraries for additional products. And the Library Loon is hoping that OCLC loses, but at the same, on the other side of the coin, Clarivate has so much of a market share in the ILS market. So essentially, we have two monopolies in their respective areas, OCLC being bibliographic data, Clarivate being almost everything else. So having Clarivate be
Starting point is 00:21:32 win this battle, it would be interesting to see how the courts rule over data ownership. If they say that bibliographic data cannot be licensed or owned, then I think that might be a favorable outcome, but I have a feeling that this case is going to go towards licensing and how those records can be licensed and how that licensing can further restrict any possibility of sharing the work that we do as library workers, and how these monopolies can monetize that work even further. So I've seen people cheer on Clarivate. I've seen people cheer on OCLC because even though I give a lot of shit to OCLC,
Starting point is 00:22:22 there are a lot of libraries that would not be able to do the work that they are able to do for information access and retrieval at their libraries, to do cataloging of any sort, if they didn't have access to the records in OCLC, didn't have access to WorldCat. So OCLC, again, being the monopoly that it is, serves a vital role. Even though there's been reduced membership and use of WorldCat, it still serves a vital role to find records that are not too crappy in terms of having vendors to ship you stubs or having LC records that are stubs. Eventually, you'll find records in WorldCat that have been enhanced, and there are a lot
Starting point is 00:23:06 of libraries out there that do not have the ability to spend that time to enhance the bibliographic record to be able to provide better access of those materials for the library patrons. And so that's the very long way of answering most likely libraries are going to lose. If nothing else, our subscription prices and fees are going to rise because of all the legal fees. Yeah, I was at an institution that was an early adopter for WMS. And that was sort of my most day-to-day interactions with cataloging because I did some Connexion stuff in grad school, but it really didn't, I didn't do enough of it to really understand how the system was working. Whereas for us using WMS, WorldCat was just everything. It was our discovery layer. It was where we pulled our records.
Starting point is 00:23:57 If we had to make modifications, it was pretty straightforward. We didn't have to use Connexion to push records. And since we were beta testers, we had a pretty good deal because we were a small library. So, yeah, I mean, tons of people would have a hard time doing this work if OCLC didn't exist. And the Library Loon pointed this out, but there are a lot of parallels to scholarly communications in terms of why things. One, how both publishing and cataloging record retention are services, but also how those things had to consolidate at a certain point in time because of the explosion in volume.
Starting point is 00:24:36 So scientific publishing had to consolidate in the 50s and 60s. And OCLC also had to consolidate because there was with other, I can't remember the history perfectly, but basically OCLC ate up two of its early competitors, Washington, something, is just long before my time. Yeah, RLIN and W, yeah, Washington, I forgot the name. So, I mean, it makes perfect sense, but now it's possibly time to start thinking about other ways of doing this, which is kind of what Clarivate's going for. And I think that's where both Library Loon and Karen Coyle, who I pulled some stuff from their blog post, which is basically talking about the difference between a centralized repository of records versus pulling straight from the ILSs with MetaDoor and doing stuff a more peer-to-peer indexing, even though that peer-to-peer indexing is mediated by a for-profit monopoly or duopoly. So, you know, it's not great, but it could be technologically a step forward.
Starting point is 00:25:49 Technologically, it could be a step forward, but I think Karen also points out that the way that MetaDoor is structured, there could be a lot of, it could be a technological step forward, but in the terms of quality and the way that catalogers and metadata workers can do their work, it might be a step a little step or a step backward because if you're not having a centralized database, you're going to be dealing with a lot of duplicates. You're going to be dealing with a lot of records that are so localized that you might end up doing a lot more work editing that record than you would have if you just went to Connexion, downloaded the master record, and then edited that record to have your local information in there and send it off on its
Starting point is 00:26:37 merry way into your ILS. So what MetaDoor is doing, I can see the way why they put it up that way, because again, this is speculation. They didn't want to appear that they're directly competing with OCLC's WorldCat. They say, oh, let's do a decentralized version that should not be a one-to-one direct competitor but OCLC's feeling threatened enough
Starting point is 00:27:04 that they are trying to shut this down at the beginning stages of the development and the way that I'm reading the documents from OCLC's suit, I don't know if they actually have
Starting point is 00:27:21 really solid evidence that WorldCat records are in MetaDoor already. They're doing, I think the discovery phase is going to be a lot of, it's going to be a fishing expedition. The fact that OCLC was able to get a temporary restraining order against Clarivate from Clarivate talking to their own customers about MetaDoor
Starting point is 00:27:42 tells me that the judge who is presiding this case is going to be very friendly to anything that OCLC says. So this might be me being like tinfoil hatting. But the whole, like, it being called MetaDoor, and I know it's meaning metadata, but then it's like, oh, we're going to do decentralized. And I'm like, is this about to move into some like blockchain nonsense? Because that's the new hotness. I'm like, you know, because they're not doing like bit torrenting and like cool, like nerd shit. They're not doing, this isn't some nerd shit.
Starting point is 00:28:22 Is this slowly moving towards like what if we did metadata on the blockchain? Oh, God. Is that what's about to happen here? Okay. So this ties into a conversation I've had with many catalogers in the past about wanting a history of who edited the records in WorldCat. So you can sue them for how bad they are. Because right now you can only see the three symbol, three letter OCL
Starting point is 00:28:57 C symbol, but I had conversations with catalogers, and this has been a long, this is a tangent, but this is a long, long grief of many catalogers wanting to know who specifically changed that record in WorldCat that made it that much worse. But that's a tangent. But blockchain, oh, God, I'm not sure if Clarivate is doing anything in that space. And I don't know if this would be this, if this would be the product where they would branch out into it. I think more likely, since Clarivate is a data analytics and intelligence company at heart. I think my tinfoil hat comes into play when they start finding ways to monitor and surveil cataloging work in the system itself. Because cataloging has been traditionally historically surveilled in terms of cost efficiency analysis, seeing how much it takes to catalog a particular item,
Starting point is 00:29:55 what's the time between receiving it and getting it onto the shelf? I can see Clarivate developing analytics tools that can measure the productivity of people who are doing metadata work and cataloging work for managers to then run reports, run a cost analysis, and then quote-unquote optimize the cataloging workflow. Any mention of productivity makes me want to puke, just putting that out there. Yeah. I remember having this, like, and I've had this thought this whole time.
Starting point is 00:30:29 I've been in libraries, but it's this like libraries are becoming a business. And I hate that about it. Like everything is, it's, we're taking from business to like, yeah, optimize and whatnot. And it's just moving more and more towards capitalist bullshit. And libraries aren't a business. They're a service and it drives me nuts. But yeah, I can see, Becky, I can completely see that happening. Just, given the history of outsourcing cataloging and outsourcing technical services work, I do not put it past Clarivate to come up with a productivity worker surveillance tool that you can
Starting point is 00:31:13 then run the cost-benefit analysis on how your technical services department is running. And oh, by the way, Clarivate has this wonderful new product that you can buy enhanced metadata records. I've mentioned this before, but when I was in the tech services interest group for the Florida Library Association, I said to someone who was at a big library, they're going after catalogers. There might not be catalogers at your institution forever. And this person kind of blew me off. And I was like, there are tons of small libraries that already don't have anyone doing any of this work and it's all vendor provided and it's all, you know, we had WMS, so we had a little bit of control over our records, but not a whole lot. Most of it was just big imports from knowledge
Starting point is 00:31:58 base collections and those were a good service, but, you know, if the vendor was shit at putting the metadata in the knowledge base, then the records were shit. That was a conversation I had to have with faculty and even like tech services staff and whatnot when I was at UNH when Primo wasn't behaving the way we were told it should behave, was that like there's only so much I can do with the metadata that we're given and can't even change because it's through the discovery layer and not our records, right? Like, there's only so much you can do with crappy metadata that you don't have any control over. And it's funny when I read the historical literature about outsourcing cataloging. There's this one article in particular
Starting point is 00:32:46 by Dunkle, 1996, talking about outsourcing the catalog department. And Dunkle actually puts down two reasons. They cost too much and they're troublesome. That may or may not be the origin story of the troublesome cataloger moniker. But I like it when Dunkle said, made the observation while reading Wright State University libraries director. For folks who do not know their cataloging history, Wright State was one of the more infamous libraries that outsourced their entire cataloging operation to vendors in the 1990s. They got rid of all the catalogers. There wasn't keeping a cataloger on board. They just got rid of
Starting point is 00:33:32 all them. And the observation was cataloging was not a core service, was not a core activity. But the output of cataloging is a core activity. So that separates the people doing the work from the actual product of the work, which if you don't have the people who are doing work, how are you going to get that output? And it's just baffling to see the administrator from Wright State just talk about this and not getting the disconnect between you need that activity of cataloging, the act of cataloging, in order to get that output for your public services library workers, and for your catalog and your discovery layer and almost everything else that
Starting point is 00:34:21 relies on metadata to work. And that sort of gets into the historical devaluing of technical services as part of the library profession, which that could be a long tangent. Yeah. As someone who worked in public services before moving into IT, the whole technical services collection management bubble has always just been sort of opaque in a magical way. Like, I know I can't, I know how to look at records and I know what tags are and stuff, but it just seems like it's an awful lot of knowledge to need to have for it to work. So, yeah, that sort of devaluation is baffling to me as a person who does not do cataloging work. And I figure if even I can very obviously see the value in it, like why can't these
Starting point is 00:35:10 people who run libraries see it as well. It's all collaborative now. It's all vendor records. We don't have to make our own anymore. Why do we need original catalogers if the records already exist? It's all just batch loading now, right? Yeah. And then you remove your whole staff, and that's exactly what it turns into.
Starting point is 00:35:36 Yeah, I think there's some stuff in here about, uh, cataloging labor costs and high cataloging costs are always focused on labor costs. So I think there's a definite ideological point being made when your catalogers tell you these records aren't going to work and then implementing them anyway, which I believe is what Harvard did. They just had five of their copy catalogers resign, not just recently. I don't know when this article was, but that was what happened was the copy catalogers resigned. And then the library admin moved forward with the new outsourcing, the new flow process, because basically they had a reorg. So it caused a lot of problems in that departments were, I guess, across different areas.
Starting point is 00:36:25 And so they had different area managers rather than all the catalogers reporting to the same person. So like a matrix model kind of thing? I'm not sure what you would call that. I know it just ends up with a lot of problems. That's why, when our legacy institutions merged, we had sort of like, this is the collection development person for science on this campus. And there's the person on this campus. And we finally just did away with that.
Starting point is 00:36:51 But, you know, it was a process of sort of centralizing and not centralizing. There was a lot of politics in it in terms of making sure no one felt neglected, either particular campus. And then we all went remote and it didn't matter anyways. So it was like, okay, you know, we could have been doing this the whole time. But my university is very old fashioned in some ways. But also is always resolutely working class. So, you know, I can imagine like if Alma was able to say, like, here's the data on how long it takes to catalog this record and get it to the shelf. I think everyone would have the good sense to just not run that report.
Starting point is 00:37:29 It'd be like, no, I'm not going to mention that Alma can do that to the dean. So it's not a militant class solidarity here, but people just know when to cover their own ass. So, you know, it's more of a common sense class solidarity. If I get lucky, maybe I can get us all unionized someday. Unions in Texas are hard. There's something in here about collaborative cataloging and why Library of Congress is still not the answer. Could you explain that a little bit? So when we were talking about cataloging and alternatives to WorldCat, a lot of people look towards Library of Congress, or LC, with the misconception that Library of Congress is a
Starting point is 00:38:12 National Library. It is not. Unlike many other national libraries you will find in Europe, Asia, Africa, and elsewhere, the Library of Congress's primary focus is serving Congress. They might have a hand in setting particular cataloging standards. They also do, you know, Library of Congress subject headings, whose primary focus, again, is the U.S. Congress. There's the fact that the Library of Congress is part of the federal government and does not have the stability that you would find in other places in terms of funding. How many times has the Library of Congress shut down because of federal bills not being passed? Bills aren't being paid, and vital infrastructure, the websites that library workers use for cataloging, gets shut down as well. So basically everything's at a
Starting point is 00:39:10 standstill until Congress decides to pay the library. So we have that budget issue. Another issue is a technical issue. Historically, the Library of Congress has not been very strong in terms of technological infrastructure, thanks to decisions by various past Librarians of Congress. But there has been some turnaround with IT. They're doing some catch-up on the actual infrastructure itself, but if they're going to try and replace WorldCat, I don't know if they're going to be able to do it just flat out technologically very soon. It will have to have a period of stability.
Starting point is 00:39:53 Developers, they probably do have the developers to do it. It's just having that stability in terms of what happens in D.C. to be able to do that. Yeah, plus the busybodies in Congress tend to get upset about things once in a while, so they can also directly force the Library of Congress to do certain things, like change subject headings, or, so I don't know. Plus, Library of Congress has something like 20 million records. OCLC has like 500 million, so there's a difference of scale in terms of if they were legally allowed to just take all of
Starting point is 00:40:24 OCLC's records, of course, that would be a favorable outcome of this case, but I doubt that's what's going to happen. But again, I don't know if I would trust it to them. I think there's probably some other outcomes, which brings us to the report of the ICOLC OCLC task force, which brought up some things that I wasn't aware of, for instance, that like 50% of the holdings in the underlying WorldCat catalog are displayed on WorldCat.org. I thought all of them were, but that comes from working with WMS. So I knew that our records were going straight into WorldCat.org, so we could use it as a discovery tool because all of our local records would be in there. I guess some of the questions from the report were what happens long term if OCLC's WorldCat loses its relevance.
Starting point is 00:41:16 You mentioned already that some of the subscriber count is already going down. But what does that mean for libraries if the records continue to degrade? That's a good question because given the changing landscape of cataloging and metadata work and the increased reliance on vendor services and vendor records, one of the things that WorldCat does in terms of monetizing the bibliographic data work that gets done by libraries is the products that are built on top of it. So you have interlibrary loan, for example, is a huge one. And to have a way to provide interlibrary loan services at scale without OCLC being the main player. There are other organizations that do use other systems for interlibrary
Starting point is 00:42:10 loan, but they somehow are connected back to WorldCat, if I remember that correctly. So for... Yeah, because doesn't Ex Libris have one in, like, Alma? I'm not too familiar with the Ex, with Alma's ILL capabilities, and I'm not sure if there's any tieback to, because Triple I has INN-Reach. But again, where are those records coming from? Where are those holdings coming from? So we then go back to a decentralized. It's not decentralized. We go back into smaller silos, if we're now going back to separate library service platform systems to rely on holdings information for ILL for physical items. Right. And the reason you have that three code item in your cataloguing record, your OCLC sort of identifiers, that's also tied to your
Starting point is 00:43:06 physical space. So that tells you, like, how far away is this record, because for this local holding record, ours was like H-O-8. So H-O-8 tells you, okay, this is in Naples, Florida. So that's where the physical item is. I can't imagine how ILL would work without WorldCat, quite honestly, but we'll get to infrastructure in a second. Because sometimes centralization can be good. Yeah. Yeah. I mean, decentralization is not always the answer.
Starting point is 00:43:39 I mean, for, again, for a lot of libraries who don't have the labor to put into copy cataloging, getting records that actually are fairly decent from WorldCat is about their best way of providing the best level of access to those resources for their patrons. I want to bring it back to OCLC, though, because this is not the first time OCLC has sued another ILS company. There was a SkyRiver lawsuit, which is something I was not familiar with. Could you run us through kind of what happened with SkyRiver? All right. So for those folks who are not familiar with their OCLC litigation history, this is actually a case where OCLC was getting sued by SkyRiver. SkyRiver was poised to be a direct competitor to WorldCat. Now, SkyRiver at first was a separate company, but it was closely tied to Triple I. So it was
Starting point is 00:44:38 fairly, it was an open secret that SkyRiver was essentially a Triple I product. And SkyRiver was claiming that OCLC had an unfair advantage, market advantage. But eventually the lawsuit was dropped after Triple I decided to absorb SkyRiver into the core business, which was kind of interesting because a lot of people were thinking that SkyRiver had a pretty good case considering that OCLC was the only game in town. And particularly it was coming on the heels of OCLC's attempt to grab the copyright of the bibliographic records in their database. But again, that lawsuit was dropped. So we wouldn't even know how it would have been settled at that point. Now, I do remember that OCLC also was
Starting point is 00:45:36 eyeing LibraryThing for possible litigation because LibraryThing was offering records that could or could not be, quote unquote, WorldCat records. And so that gets us into the question of what is a WorldCat record versus what is a record that was created by an individual library that just so happens to subscribe to WorldCat. Yeah, who is the author and when did you decide that, you know, copyright was transferred if that's not in your user agreement. Like every website now knows to say, if you post something on our website, you're giving us a royalty-free forever license to use your work, if not just giving us copyright over the work entirely. But did OCLC know to do that with MARC records? And how would you prove it? We talked about this provenance problem.
Starting point is 00:46:27 Yeah. So the copyright status of MARC records and bibliographic records is a big question mark. And that's one of the reasons why some people want the current lawsuit to go to trial, because they want to have that conversation about data ownership. A MARC record, a bibliographic record, can be viewed in a couple of ways. So you have information in the record that talks about the description of the item that you have on hand, how many pages it is. What is the title? Who created this particular work that is either in hand or is in a particular database? So those can be construed as facts, and facts cannot be copyrighted.
Starting point is 00:47:11 Now, when we get into... No copyright law in the universe is going to stop me. Perfect. But then we get to your notes, your local notes, the subject headings, stuff that you can argue takes some creativity and creative judgment to put into that record. Does that constitute copyrightable material? So we have that. It sure can.
Starting point is 00:47:39 So we got this mixture of facts and not feelings, but opinions in this record, opinions that are masquerading as facts. So you have that MARC record. And then you put it into a database. Now, in the past, copyright law around databases has centered around sweat of the brow. So essentially the collection, the work of putting into that database, can allow the creator of that database, the owner of that database, copyright status over that database. That has since been struck down. That's been a long time since that's been struck down. Now you're
Starting point is 00:48:20 getting into the arrangement of the materials. That's what OCLC was thinking, I think, during the 2008 licensing policy change. OCLC in this current lawsuit is claiming, I'm not sure if they're claiming copyright or license over the enhancements made to the WorldCat record that may or may not be in the individual library's database. So part of what we have here is OCLC is now saying that their enhancements are copyrightable. And MetaDoor should not have these enhancements in there because there's a contractual agreement that this information is not going to be shared. With the current suit, what I'm understanding OCLC is saying is basically giving any of these records is a breach of contract, not because of their originality or enhancements, but just because they're OCLC records.
Starting point is 00:49:21 And they are OCLC records by nature of the fact that they have an OCLN. Is that the right word I'm looking for, right? Acronym. It's been a while now. They have a control number, OCN, right? OCLC control number, OCN. And that indicates sort of their relationship to OCLC, right? I have to look this up again, but I think that the presence of an OCLC control number
Starting point is 00:49:47 doesn't necessarily necessitate that it's a WorldCat record. I have to double check that, though. I mean, I would definitely say it doesn't, but I would also say like that I think that was, I felt like someone mentioned that that was part of the argument. Maybe it was in Karen's piece. We've gone for a while now. Let's start asking the action-oriented questions in terms of what got this whole thing going, which is how can libraries take ownership of the catalog and like what kind of infrastructure would we have to imagine for it to do that? This is where we get to get speculative and actually imagine how we want things to be instead of how we constantly are iterating them towards not being as horrible as they could be.
Starting point is 00:50:30 Say your cat. I am so sorry. Clopping. Or is he throwing up? I am so sorry. Let me mute. So that was Sophia's commentary about how we could take the catalog back. It's going to be a mess because OCLC is not going to go down without a fight.
Starting point is 00:50:56 And there are still a lot of libraries that still believe that OCLC is a cooperative. And even if they have misgivings about OCLC's actions around making them pay to put records into the system that then they would have to pay again to access in addition to other services, OCLC has the scale and reach that many alternatives may not have. So a part of it is, again, trying to figure out the infrastructure so we can think about how we can scale not only on a national level, a U.S. level, but how we can scale on an international level that OCLC has right now. May I ask a question?
Starting point is 00:51:46 This is going to sound silly considering I used to be a metadata librarian. I'm not a metadata librarian anymore. It's so fucking weird. But I have mainly not done MARC cataloging. If an institution were to say stop subscribing to OCLC, like stop using WorldCat, like move on to an alternative service or whatever, all of their local records and stuff, like, do they just have to redo their entire catalog if they stop paying OCLC?
Starting point is 00:52:17 Because that is also a thing with infrastructure we need to think about is how do we support libraries who are thinking about making a switch like this? You notice that in the lawsuit, OCLC is not suing the customers who are theoretically going to put the records in MetaDoor. You don't sue the paying customers. So if OCLC were to go after a library who has decided, well, actually, there are libraries who have released their MARC records. I think Harvard's one of them where they just say our individual library database is CC0, go forth and use. So it's already been done.
Starting point is 00:52:59 Thanks, Kyle. It's probably some of his doing. So it's already been done, but at the same time, I doubt that OCLC is going to be quiet if there seems to be a movement among libraries themselves to create an alternative. Again, this puts OCLC in a particular bind. They can't sue individual libraries. That is not a good look. Again, they chose to sue Clarivate instead of the paying customers. Yeah, the paying customers who are the ones who are supposedly violating their use contracts, right?
Starting point is 00:53:39 Precisely. Yeah. So maybe the trick is to have a bunch of libraries who are still subscribing and still using MARC records start to build an independent one, not using the records. I'm just trying to think, because if you bail on OCLC and then immediately start doing something, like, that's just grounds for getting sued again, right? It just means you're small and they're just going to come in and sweep you up. And I mean, they're trying to challenge Clarivate of all fucking
Starting point is 00:54:07 companies, right? What are they going to do to a small collective, you know, a small collection of libraries who are trying to build something. I'm just trying to think of how to use that to an advantage. Don't sue the customer. Even if the customer is trying to build something that is in direct competition to what you do, I don't know. Yeah, legal action is probably, it's only one tool in the toolbox. I have a feeling that OCLC would then not go the legal route at first, but lean into the member cooperative fairy tale that they keep propagating, even though that died decades ago. So they could possibly say, we know you are dissatisfied. Here are some things that we can do. But as a reminder,
Starting point is 00:54:56 if you really want to go this route, here's all the labor you have to do. And nothing scares an administrator more than giving them a big number of work that has to be done, of the cost of cataloging that has to be done. Because, you know, cataloging is a cost center of a cost center. It's a meta-cost center. I mean, I imagine the infrastructure is so tricky, but luckily the protocols are all open and well understood. The records are more or less out there and could be cleaned. And I mean, people have done some amazing stuff that I never thought would be possible as quickly as it happened, like what the OurResearch, formerly Unpaywall, people
Starting point is 00:55:38 have been able to do by taking Microsoft Academic Graph and turning it into a free database that anyone can API records from. Like, you know, it's not done yet, and it needs to be cleaned up and fixed. And it's still not as good as Google Scholar in terms of researcher use, but in terms of being able to pull these records with RORs attached to them,
Starting point is 00:56:00 so you know which publication happened at which university, and you can do that for free, that's extremely powerful and it's an extreme challenge to, especially, like, Elsevier. Yeah, definitely. And one alternative that we already have out there, which has the possibility of scaling, is Open Library. Open Library has been out there. There have been library workers who have been working and getting metadata in their collections and whatnot. So we have some infrastructure there. It's just a matter of, could projects and services like Open Library be the new home, the new centralized home? So we talk a lot about decentralized versus centralized and some of the potential downfalls of doing the decentralized route.
Starting point is 00:56:49 Because cataloging needs a certain level of quality control for it to be effective. And if you don't have authority work, if you don't have people reconciling duplicate entries, even different capitalization between two records can make a system say you have two different separate works when you actually just have a single work in your hands. I think the infrastructure in terms of scaling on a technical level, I think, might be easier. As you said, the protocols are open. Who's going to fund it? And who is going to be the maintainers?
Starting point is 00:57:28 Because that's one of the things that OCLC provides, is people who enhance these records. And as with any collaborative or cooperative project or organization, you have to figure out which organizations are going to pay what, who's going to provide the people to do the enhancements and whatnot. And that gets tricky really fast when you consider that many cataloging and metadata departments have been decimated due to outsourcing and layoffs. Yeah. It can be done. A lot of catalogers and metadata folks, I find, sometimes end up working for vendors.
Starting point is 00:58:10 And so this could be, it's like the vendors themselves, I was about to say, do we do the deal with the devil? But you have to realize that the individuals that are working in these vendors are not the devils themselves. It's just the administration that they're working for. Do we find opportunities of collaboration between people who have those skills, who may not be working in libraries anymore, but find ways where vendors can support this cooperative project that is not centralized under one vendor? But instead, it's a cost share between libraries and vendors. It's not going to be OCLC masquerading as a member cooperative, but instead a different type of collaborative organization. Yeah. I mean, the problem would be maintaining revenue streams for
Starting point is 00:59:01 continuing operations. So anything that you created, so like you mentioned using, if we were to use like Open Library, Open Library currently isn't really monetized in any way. It takes its money from donations and the Internet Archive business that does web archiving for government and business and stuff like that. So if you were to start relying on it, I think it would have to start justifying some of the additional costs by creating a product. And I worry about what that product might be. Although, if they could sell a product that allowed libraries to do controlled digital lending easily and they provided a service for it, I can see a lot of libraries buying that. But they also might not exist after they get sued by all the big publishers.
Starting point is 00:59:46 So because they screwed up. I don't trust the Internet Archive all that much. I've been in too many meetings with them where I'm just like, I'm getting bad vibes, man. I don't know. Something about this is off. And that's one of the drawbacks of overly relying on a major organization. And so my experience with working in OhioLINK, you get over 80 member libraries in OhioLINK, but honestly, the main member who makes the decisions is The (TM) Ohio State.
Starting point is 01:00:17 And so the organizations that have the money and they have the staff and they have the capital, the political capital in any type of situation where you're supposed to be doing a collaboration. This is something that any cataloguing collaborative project will have to navigate in terms of if your larger institutions start to have cold feet or decide that they don't want to bear the brunt of the cost, simply because they're a larger institution and the smaller institutions can't pitch in more. I wonder if there's an avenue here to utilize, like, state libraries. more because I know that, you know, they're often incredibly underfunded and, you know, don't have the financial stability, much like Library of Congress, like doesn't. But it makes me wonder if, like, yeah, using that sort of combination of big, big influential library systems combined with state library could, and it's kind of the same thing with,
Starting point is 01:01:26 with, like, privacy concerns, like, how do we put pressure on vendors to do privacy, like, better? How do we get that combined, like, power enough to actually make them stop and think? And my brain always goes back to, like, state libraries because you can, it's small enough that I don't know. I'm not thinking very well today. So or like state library associations. I mean, it's hard to get your head around this question, which is essentially, how do we create something that allows libraries to control their work and their labor value? In a system where, again, it's expected that if you build a system like this, you have to have a revenue stream. And quite honestly, any attempt to create a revenue stream of a collaborative cataloging project is just going to,
Starting point is 01:02:30 most likely fall into the same pitfalls as OCLC. They either have to monetize because the infrastructure is too costly and they need to pay people to enhance those records, or they see dollar signs flash across their eyes because capitalism. Yeah. I mean, if you were to create an alternative to OCLC, it would more or less end up just looking like OCLC, probably go through all the stages of its development much quicker, too, considering, you know, you never know who's
Starting point is 01:03:05 going to get into a leadership position there. But there should be some model we can figure out. I mean, we've come up with more clever models in terms of things like Subscribe to Open for open access journals and things that people said would never work that are working. So, you know, it's still early days, but they're possible. So we've gone for a while. Any closing thoughts? Anything you want people to check out that you're working on, or do you want people to leave you alone? Oh, goodness. I suppose I should probably do the shameless plug for my new book that was just published a couple of weeks ago. So if you haven't already looked on the LDH Consulting Services website or my Twitter account at yo_bj, you might have noticed a new book about
Starting point is 01:03:58 patron privacy. So the book title is Managing Data for Patron Privacy. I co-wrote it with Kristin Briney. Excellent librarian down at Caltech. We are both Badgers. So we are celebrating the publication of our book with the good cheese. You can find me at Yo underscore BJ on Twitter. I promise I will tweet a happy tweet once in a great while amongst all the dumpster fires that are happening with privacy in this world. Yeah, but I think that's about it. So thanks so much, Becky, for coming on. Thank you.
Starting point is 01:04:35 Good night.
