Wednesday, August 8, 2007

User-generated e-publishing content and trust: a conversation with Geoffrey Bilder

This is the conclusion of my London blog and my e-publishing course. This blog started out as a journal of my experience, but very rapidly it morphed into something else. I quickly realized that it was very time-consuming and required a huge effort. For these very reasons I thought that so much work should not benefit only me, and that this blog could become a useful resource and tool for my fellow students, for my library school (Pratt Institute's School of Information and Library Science) and for future students who are thinking of taking the summer school program in London and want to know more about what we do there. And who knows -- it could be useful to other librarians, e-publishers and information professionals.


As a future information professional, I would like readers to trust the content in this blog, and when Geoffrey Bilder spoke at the conference and began addressing the issue of trust in "informal" online publishing tools (like blogs and wikis), this topic resonated strongly with me. I have already written about his presentation and about our subsequent correspondence and his generous agreement to help me with the final part of my project. So, without further ado, let's get to the meat of the matter, as it were.

= = = = = # # # # # = = = = =

Ever since choosing this topic, I have been formulating my own definitions of the main terms in my project.

  1. What do I mean by user?
  2. What do I mean by publishing?
  3. What do I mean by user-generated e-content, in other words, user-generated e-publishing?

I also would like to introduce another key concept for my work, which is that of TRUST. Trust is crucial to what I want to talk about. So, let us start with the definitions:

USER : By user I mean an individual who may or may not be formally associated with a professional or scholarly institution, who possesses expertise and/or qualifications in a certain discipline or field of knowledge, but who is acting on his/her own behalf as an author, outside of his/her professional capacity.

PUBLISHING : The issuing of content to the public for free or for sale in any medium (print, electronic, audio, video, multimedia), preceded by an editorial process that lends authority to the final product.

USER-GENERATED E-CONTENT/E-PUBLISHING : The issuing of content to the public for free or for sale in any electronic medium (text, audio, video, mixed), authored (or in some way edited and reissued -- this is the case with SECONDARY PUBLISHING) by a USER, without going through the editorial process. This means that there is no editor, as opposed to traditional publishing, and there is no peer-review process. In order to establish authority for the published material, it is necessary to find alternative methods of building TRUST, from the reputation of the USER to the use of various rating/ranking/linking and other trust-building tools.



This morning, August 8, 2007, at 6:00 EST, Geoffrey called me from England. I had emailed him an outline with my questions. I recorded the interview and what follows is a faithful transcription of the conversation, minimally edited for fluidity.

Turtle = T
Geoffrey = G

G: I've been in California for the past week at a conference.

T: How was it?

G: It was very good.

T: What was the topic of the conference?

G: It was a conference sponsored by O'Reilly Media and it's called Sci Foo. They hold it once a year and they get a bunch of people who are doing interesting developments in the sciences together from all over the place; sort of cross-disciplinarian stuff. And they just sit around and talk about the future and what they think is coming down the road and what interesting stuff is happening. It's very loose but hugely interesting.

T: They're the ones that make the animal books, the Safari books, O'Reilly?

G: Yes. That's right. Tim O'Reilly runs a bunch of conferences and a lot of technical conferences as well. And he does these other ones which are more about social developments in science, and are just a little more general. So this was one of his broader-themed conferences. But he's an interesting -- and I'm sure you've heard a lot about him during your coursework -- his publishing organization is an interesting one to examine, because -- Well, let's just start from the fact that his books are probably some of the only books that you'll ever see that are shelved by publisher. He has that strong a brand presence. And that's almost unheard of in the industry. If you went into a bookstore that was shelved by publisher, you'd go out of your mind, in general, trying to find stuff.
But he's the exception, so it's interesting to try and figure out how he's accomplished that. Then the other thing that he's done is that he's been a real pioneer in two areas: one is electronic publishing, so his Safari book service has been doing online books now for upwards of six years. Clearly it's successful and it hasn't cannibalized his print sales. And it seems odd, particularly in my industry, which is scholarly and professional publishing, how reluctant they've been to get into developing electronic books. They've just been so slow and they should probably look at him for a model.
The other thing that is interesting about his outfit is that he has really managed to tightly couple his conference business with his publishing business, which is another thing which I think my industry has been trying to do for a while with varying degrees of success. But he has really created quite a tight connection, I think, between his conference efforts and his publishing efforts; a very symbiotic relationship. So anyway, he's just an interesting character to look at when you're looking at our industry, when you're looking at the broader publishing industry. Anyway, this conference was held over the weekend. [...]

T: In your talk you said, "we must tell researchers what to look for." This was at the conference, and the "we" that you refer to, I assume, includes publishers and librarians. In the bio that they gave us for you it said that prior to CrossRef you had spent a number of years as a consultant to publishers and librarians to further the use of technology as a knowledge development tool. So, how do you see librarians in particular reaching out to researchers, in practical terms? What do you think would be the best way for librarians to start reaching out and advertising these new technologies and the ways in which they can be used effectively?

G: Let's start by saying that there are a number of places where what librarians do and what publishers do overlap. John Unsworth, who is the dean of the Graduate School of Library and Information Science at UIUC, once gave a talk in which he talked about "lublishers and pubrarians," and how a lot of the things they do are very similar and likely to become more similar. And I think that's true with both publishers and librarians: as their roles start to change and they start to strip away what's clearly not as important in the electronic age, one of the things that you're going to find they both share is this role as identifiers of trustworthy, authoritative content.
This is something that they have both done, one as a pre-filter and the other as a post-filter, and they're probably going to both start converging and possibly even treading on each other's toes in trying to provide these services. Then the question was that we have to try and help researchers find stuff that's relevant, and both have done that, have played that role. And the reason that I think it's going to become an even more important role is that you are going to have -- you already have such an explosion of content that's out there. And that's a wonderful thing, and it's particularly wonderful if you're seeking entertainment, because you've got lots of other people out there who are identifying entertaining information or content. So things like a lot of these social recommender systems are fantastic in these environments. But they also have to be adapted for helping people to identify reliable and trustworthy content, because of course it's basic information theory: the more stuff you have out there, the harder it's going to be for you to find stuff that's relevant. And ironically, almost every trend that is benefiting researchers in their role as authors is making their lives as readers more difficult. The easier it is for them to reach a wider audience, to put stuff out there, to publish data, to publish working papers, to publish multiple versions of papers -- all of that makes their lives as a reader harder, because it means that it's harder to determine what's trustworthy; it's harder to determine what different versions of things are and how they relate to each other. And time is, from the researcher's point of view as a reader, of the essence, and there are studies that show that researchers are reading more things and spending less time reading each thing. And ideally they'd like to spend even less time reading. They would really love it if you could provide them with a tool that really did help them to only identify the stuff that was truly relevant and important, that would be a huge benefit. Because at the moment they spend an awful lot of time trolling through dross and filtering out stuff themselves. So this is a place where I think both publishers and librarians can play a role in helping them.
So what can they do? Well, one of the things is that they can use the very same tools that people are using for social bookmarking to do things like create annotated bibliographies for particular disciplines. For instance, the recommender systems can help to contextualize things that are out there. One of the problems that you have and that frustrates researchers in their role as readers is that in their role as authors they might publish five very closely related papers. As a reader, if you go out and find these five papers, it might not be immediately clear what the relationship is between these five papers. Does paper A expand on paper B? Is it a refinement or is it a correction? All of these things you don't know. So you end up with five papers that have a lot of seemingly overlapping content and how the heck do you determine what they are? You spend an awful lot of time doing this kind of work. So, again, if librarians and publishers were able to help researchers do some of that or understand what the context is, that would help researchers in their capacity as readers.
So there are all sorts of things that librarians and publishers can do, but the truth is I don't know. But the truth is that they have to experiment; they have to try different things. And I think librarians have probably been in a better position because they are more in contact with researchers in their capacity as consumers of information. Publishers in contrast, historically, have been removed. First they were removed because they worked through agents. Now they're removed because at least they're working through librarians, but they're still not generally talking to the researcher as consumer of information as much as they are talking to the researcher as producer of information. So publishers have got to do a lot more, probably, to understand what the challenges are of the researcher as reader.

T: The second question, as you can see, is about the "My Brain" button. I am really enamored with this idea. I think it's wonderful and I'd like to know more about the idea of the My Brain button. Was it your idea? Did you get it from somewhere else? Have you thought of setting up a kind of repository of brains where people could look for kindred spirits, as it were? In the sense of people who are interested in the same things they're interested in and what they're looking at, what they're reading? I'm still thinking of this as an aid in disseminating trustworthy material. I found myself repeatedly thinking it would be the ultimate dating service, this collection of brains. If you could find a brain that matched your own that would be your soul mate. This is more anecdotal thinking, of course. But tell me more about it.

G: I've encountered a lot of people who have discussed some of the same problems that led to my phrasing it this way. But the phraseology came to me when I was trying to explain to a researcher colleague of mine why I thought that the ability to subscribe to the RSS feed of somebody's bookmarks, the RSS feed from their blog, the RSS feed from their Wiki, the RSS feed from their calendar -- why I thought that that was interesting and powerful -- and for some reason at the time I said, it effectively lets me subscribe to their brain. And then he got it. All of a sudden he understood what it was that I had been tortuously trying to explain. So I tried the phrase out on a few other people and it seemed to resonate.
Another phrase that I've heard people use more recently for the same concept is "lifestreaming." You create all sorts of streams of information from what you're doing; so it might be your pictures from Flickr, it might be the music you're listening to; it might be that you're using Twitter to update people on your mood or the fact that you're getting up to get a cup of coffee or whatever. But the fact that all of a sudden you can almost dump everything that you're doing into some digital form that other people can consume. And, as I said, the phrase lifestreaming seems to be popular at the moment.
Personally, at least in this industry, I prefer brain subscription because when you think about a lot of what researchers, again, as consumers of information are concerned about, they want to know what their colleagues are doing; they want to know what the state is in their industry; they want to know what their research group is doing. And in meatspace there are a lot of barriers to sharing information, not the least of which is physical proximity. So these tools present people with the ability to share information in near real time about what it is that they're discovering, what it is that they find interesting, what directions they're taking. And you can imagine that if you are a researcher and you're collaborating with people all around the world, with your research group scattered across labs everywhere -- it would hugely reduce the friction of collaborating, because you could see exactly what people were doing at any given time, and understand what was going on.
So I think that this is a very powerful way of doing things. I think that you're beginning to see -- again, let me get back to the use of social tools. One of the things that usually frustrates me, when generally people show tools like social bookmarking tools or recommender systems or blogs or something like that, they always show sort of the top level, the most popular stuff. If they show you del.icio.us, they say, look at this, you can see what everybody is bookmarking and what everybody finds interesting. And if they show you a recommender system like Digg they say, see, you can see what people are voting on and what the most popular stories are. And if they show you blogs they show you the most popular blogs, that are inevitably blogs about gadgets or news or something like that. So a researcher looks at something like that and goes, well, that's all very entertaining, but what the hell has this got to do with me, why should I care? And the answer, I think, is that they've shown us the most naive use of those tools. The truth is that in order to use them effectively, you subscribe to the blogs of people you already respect or who are interested in what you are doing. You only look at the bookmarks on del.icio.us of the people that you're interested in and that you think have something relevant to say. You only subscribe to the Flickr photos of people that you care about. So you immediately narrow it down.

T: And my question to you, then, is, how would you go about finding people whose brains you are interested in? In other words, they're not necessarily just people you know. As you said, the research community is large and many researchers who are working on the same things might be separated by oceans and continents. So have you thought of setting up a kind of repository of brains?

G: There are repositories out there, and they're kind of scattered and I think that you can make use of them. The brain subscription button was an approach to this. I generally don't like centralized -- I think there's a lot of evidence that centralized approaches don't work in these things. So the whole idea of creating a centralized brain repository would probably backfire, because people would say, I don't want to have to use your tool. I want my own tool. You use del.icio.us but I use Furl; or you use CiteULike but I use Connotea. You're immediately going to start falling into these problems with people who have different preferences for tools. So the idea behind the brain subscription was, all right, what if there was a way where you had a format where you could record where it was that you had various streams of information about yourself being collected? So I could create a little embeddable format that I could put on a webpage, that would say, if you want to know what I'm doing, this is where I do my bookmarks, this is where I do my blog; this is where I have a Wiki. And if you made that format machine readable, then you could build a whole bunch of tools that would allow you to go out and harvest this information, and different people could build different tools to harvest it and make use of it.
You could in theory say, I want to build a tool that goes out and finds everybody who has a brain button and who has in their del.icio.us bookmarks a category on Publishing 2.0. And that might be a good way of finding a bunch of people. So that was the idea behind the brain button. But there are other mechanisms, obviously, that are just more akin to techniques that we use already.
One of the things is that generally you follow a path. You meet somebody, you say, all right, I've met this guy Danny Ayers, he writes a blog, I'll read his blog, it's very interesting. He cites these other people a lot, and I look at their blogs because he cites them a lot, and you know what, I think they're interesting too. I don't know them, but I'm going to subscribe to their blogs. And you know what, they cite other people. And then you start realizing that four of them are always referring to this other person, and that person suddenly you realize that they're probably a pretty big authority here... So you do a lot of the same things that you do when you're looking at a journal or a book, and when you're doing background research you get a sense of what the social network is and who is in authority in this area. It would be wonderful if things like the brain button would allow you to automate it a little bit. But some of the tools are already out there.
Let me address the last issue there, about the dating. Interestingly, if you look at something like Nature Network and if you read Charkin, he'll say stuff like, funnily enough, scientists have social lives too. So a lot of the social networking application stuff, if you look at Nature Network and some of the stuff they're doing, they're really thinking about the scientist as not just a scientist but as a person, a person who has to rent an apartment, who has to try and figure out what's going on in a city. So they're definitely combining this notion that professional and social might overlap. My observation is that I know a lot of people who share my interests professionally that I certainly wouldn't want to date. I'm sure it goes the other way as well. But I think that's a sensible way to go about it, to a certain degree. But Nature certainly is pursuing that line.

T: That's very interesting. This brings us to the third question, which you've already answered in part. I'm curious to know, would you call this "publishing" your brain? The creation of a "my brain" button, do you see that as publishing your brain to the world?

G: Yes, it's providing people with a place with all of your feeds that you're generating on what you're doing. So, to use the other phrase, it's sort of a collection of -- somebody else might call it a lifestream button, or something like that.

T: So you feel that rather than having a centralized repository, the electronic word of mouth, as it were, is a better way of disseminating this information.

G: Yes, we have a lot of tools that are very good at going out to websites and consuming information, harvesting information and pulling it together. So you don't need a centralized place where everybody puts their RSS feeds. You have one place where you read RSS feeds, but the RSS feeds are coming from all over the place. You don't need one place where all people's brain files are stored. They can just be stored on their own website and then you can have things go out and harvest them, or index them with search engines and things like that.

T: Let me ask you a technical question. I'm also interested in understanding a little bit more how the technological things work. Currently, there is not an RSS reader that could decode the OPML and display it in a more readable format?

G: That's actually a problem with browsers. The brain file is basically just an OPML file, Outline Processor Markup Language. It's just a machine readable format that points you at different locations. The problem is that when you click on that at the moment you get a horrible mess.

T: That looks like an XML file.

G: It is an XML file. I don't know whether you remember this, but even a year or so ago, if you went into your web browser and clicked on an RSS button, you would also get a bunch of horrible XML, because the browser didn't know what to do with it. So in recent browsers now -- I'm talking about the latest versions of Firefox and Internet Explorer -- if you go and you see an RSS button and you click on it, all of a sudden the browser will say, ah, I know what this is. I'm not just going to show them this mass of XML; I'm going to offer them a choice of subscribing to this RSS feed via whatever their favorite RSS reader is, whether it's Google Reader or Bloglines or whatever.
So they modified browsers to deal with this more intelligently. Ultimately if something like the brain subscription were to take off, you'd want some sort of mechanism whereby if you click on an OPML file it would recognize it and it would say, okay, fine, I will import this OPML file into whatever your reader is. If you want to use the brain button at the moment, what you have to do is save that XML file onto your hard drive, and then go over to Google Reader or Bloglines, and import it. And it will import. They both support OPML. You can automate it. It's the browser that doesn't do it automatically. You have to do the manual step.
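[An aside from me, not part of the conversation: to make this concrete, here is a rough sketch of what a "brain" file could look like and how a tool might harvest it. The feed names and URLs are invented for illustration, and Geoffrey's actual brain button may be laid out differently, but OPML itself is just XML whose outline elements carry feed addresses, so a few lines of Python are enough to pull them out.]

    # A minimal, hypothetical "brain" file expressed as OPML: an XML outline
    # whose <outline> elements point at a person's various feeds.
    import xml.etree.ElementTree as ET

    BRAIN_OPML = """
    <opml version="1.1">
      <head><title>Example brain file (invented for illustration)</title></head>
      <body>
        <outline text="My blog"      type="rss" xmlUrl="http://example.org/blog/rss.xml"/>
        <outline text="My bookmarks" type="rss" xmlUrl="http://del.icio.us/rss/example_user"/>
        <outline text="My wiki"      type="rss" xmlUrl="http://example.org/wiki/changes.rss"/>
      </body>
    </opml>
    """

    def harvest_feeds(opml_text):
        """Return (label, feed URL) pairs for every outline that carries a feed."""
        root = ET.fromstring(opml_text)
        return [(o.get("text"), o.get("xmlUrl"))
                for o in root.iter("outline")
                if o.get("xmlUrl")]

    for label, url in harvest_feeds(BRAIN_OPML):
        print(label, "->", url)   # an RSS reader would subscribe to each of these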

T: And if I did import it, what would then happen, would I just have a collection of links?

G: Yes, you would have a collection of RSS feeds in your RSS reader, and one of them would point to my del.icio.us bookmarks, and one of them would point to my blog, and the other one would point to perhaps my LastFM account, or something like that.

T: Let's move on to the next question, then. Thinking about these brain buttons as trust metric tools, I'm reminded also of the fact that you called links votes, which of course is self-explanatory. But how would you tie in these votes with the brain subscriptions, in other words to apply trust metrics to brain subscriptions? Meaning: I'm looking for things that I trust, that I consider to be trustworthy, I'm getting these different feeds from different people, I want to be sure that I'm not misled.
Since you suggested that I look for things that talk about trust metrics, I have been, and I've been reading about people who intentionally -- as usual -- abuse tools to create chaos rather than being helpful.

G: Okay, let's start with the linking as votes. Yes, a link right now is treated pretty much as a vote by things like search engines. Google, for instance, its PageRank is treating a link as a vote. But the other thing I said is that's really a very naive thing to assume. Because you will often link to things to say, look at this, it's as stupid as dirt. And we do this with citations as well. When you cite something, you might cite it because it supports what you're saying; you might cite something because you're arguing against it. You might cite something as background material; you might cite something as a counter example. There are all sorts of reasons that you might cite something. Ultimately I think that people are going to want to be able to add some sort of semantic hint to any link, so that they can differentiate between these kinds of links or citations. And you already see that to a certain extent with the attempts to deal with blog spam that search engines came up with. They said, okay, there are web links where people allow you to put an attribute with a value of "no follow" on it. And if we see a link that has an attribute with a value "no follow" on it, we're not going to count that link as a vote. So now, for instance, blogging software will automatically put a "no follow" attribute on any links that are included in comments. So people who were using comments to deliver spam, with links back to the sites that they wanted people to go to, they can't use this mechanism anymore, because search engines don't care if there are links in comments because they don't treat them as votes anymore.
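
[Another aside from me: the "no follow" value Geoffrey mentions is the rel="nofollow" attribute that search engines agreed to honor. As a toy illustration of links-as-votes -- my own sketch, nothing like a real search engine -- here is how a crawler might tally inbound links while ignoring anything marked nofollow.]

    # Toy "links as votes" counter: every link is one vote for its target,
    # except links marked rel="nofollow" (e.g. links left in blog comments).
    from collections import Counter
    from html.parser import HTMLParser

    class VoteCounter(HTMLParser):
        def __init__(self):
            super().__init__()
            self.votes = Counter()

        def handle_starttag(self, tag, attrs):
            if tag != "a":
                return
            attrs = dict(attrs)
            rel = (attrs.get("rel") or "").lower().split()
            if "nofollow" in rel:
                return                      # not counted as a vote
            if attrs.get("href"):
                self.votes[attrs["href"]] += 1

    page = """<p>Worth reading: <a href="http://example.org/paper">this paper</a>.
    A spammy comment: <a rel="nofollow" href="http://spam.example.com">buy stuff</a>.</p>"""

    counter = VoteCounter()
    counter.feed(page)
    print(counter.votes)   # only http://example.org/paper receives a vote
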
Now, of course that could be refined quite extensively, if you had more sophisticated ways of indicating the relative importance of the link. At the moment it's pretty much all or nothing. So this issue with trust metrics, and how it could be applied to brain subscriptions. If you subscribe to my brain and one of the elements of my brain is my bookmarks or perhaps the blogs that I follow, you see that there's already a way that you might be able to traverse that, and say, okay, Geoffrey reads Lee Dodds; Lee Dodds reads Danny Ayers; Danny Ayers reads Clay Shirky, and all of them seem to read Jon Udell. Therefore, I'm thinking that since I read three of these people and they all read Jon Udell, Jon Udell might actually be somebody that I should be paying attention to and who I consider to be trustworthy.
There are two problems that you often see in trust metrics, and I'm sure you've already seen this, and one is: to what degree does transitivity work? How far should it go? And the other is context. I might trust Lee Dodds on anything having to do with technology and publishing, but I certainly don't trust his taste in clothing, or in music, or anything like that. So you also need an ability to create some sort of a context for whatever trust metric you're using. And that's an important element of any trust metric that's going to be successful.
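
[A third aside from me: Geoffrey's "three of the people I read all read Jon Udell" reasoning, plus the point about context, can be sketched as a tiny graph computation. Every name, subscription list and topic tag below is invented for illustration; a real trust metric would also have to deal with how far transitivity should go and, as he discusses next, with people gaming the system.]

    # Toy transitive, context-limited trust: suggest people whom several of my
    # existing subscriptions already read, restricted to a topic I care about.
    SUBSCRIPTIONS = {                  # who reads whom (all invented)
        "me":       {"geoffrey", "lee", "danny"},
        "geoffrey": {"lee", "jon"},
        "lee":      {"danny", "jon"},
        "danny":    {"clay", "jon"},
    }
    CONTEXT = {                        # what each person is trusted on
        "geoffrey": {"publishing", "technology"},
        "lee":      {"technology"},
        "danny":    {"technology"},
        "jon":      {"publishing", "technology"},
        "clay":     {"social software"},
    }

    def suggestions(reader, topic, min_votes=2):
        """People endorsed by at least min_votes of my subscriptions, on topic."""
        mine = SUBSCRIPTIONS.get(reader, set())
        votes = {}
        for person in mine:
            for candidate in SUBSCRIPTIONS.get(person, set()) - mine - {reader}:
                votes[candidate] = votes.get(candidate, 0) + 1
        return [c for c, v in sorted(votes.items(), key=lambda kv: -kv[1])
                if v >= min_votes and topic in CONTEXT.get(c, set())]

    print(suggestions("me", "publishing"))   # -> ['jon']
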
And then the last thing you brought up is this issue of people trying to game trust metrics. Well, that's no different than the analog world. People "salami slice publish" now. They take something that could easily be written up in one paper, and they split it into five papers, because this somehow gets them better citation counts and stuff like that. People do this kind of stuff all the time. With computer-based trust metrics, the interesting thing about them, and the challenge, and the reason it is very hard, is that people are trying to develop techniques to make them self-balancing, to make it very hard for people to game the system. That's the challenge in creating an electronic trust metric.
I don't know whether anybody will ever -- if somebody is ever able to create a completely self-balancing trust metric that's calculated, that would be phenomenal. I somehow doubt that that will happen. I think that there will probably always be elements of people having to go in and hand tweak them and monitor and administrate them. But we'll see how that works.

T: We can skip the next two questions, since you've already answered them. I had been reading about "attack resistance" and you have just been answering that. So let's move on to question 8 then.
In your podcast with Jon Udell you touch upon the things I'm most interested in exploring: that user-generated e-publishing and scholarly communication are becoming closer and intertwined; that there is original material like blogging and secondary material like social bookmarking, a form of secondary publishing as we have already talked about in our correspondence. As a close to our interview, let's talk a little bit about this: how can we as librarians create e-published material that has legitimacy -- how do we go about implementing trust metrics so that the resulting reliability will be visible to the readers? How do we get the word out there? How do we show that our brains, where we aggregate our output -- original, primary and secondary -- are trustworthy?

G: That's a big one. Let's start with the intersection between user-generated e-publishing and scholarly communication. I think at the beginning of this you defined user-generated content as being people generating content outside of their primary -- in their personal capacity as opposed to their professional capacity.

T: Within their professional interests, but not as official spokespeople -- because that obviously is regular scholarly publishing; in other words journals or the proceedings of conferences and things like that. But this would be something more akin to blogs, or things like that, which are still being used as methods of disseminating their scholarly research, but maybe it's prepublication, or maybe in the course of their research they're divulging some of the things that they're discovering, and so forth.

G: I don't think that people generally try to define -- they should try to define what they mean by user-generated content. My guess is that if you ask most people what their definition of it was, you would get something along the lines of this: that in a traditional media company like NBC, television or radio, you have a group of people who are professionally paid to create content and send it out to other people who consume it. The audience for the content and the people producing it are different.
Whereas with user-generated content, the audience is probably generating as much as anyone. So the issue of professionalism or the affiliation of the thing I think can probably be separated from that. As an example, the one thing that I like to point out is that unlike a traditional publisher where they have a small group of professional writers who they identify and then they help them to disseminate their content to as wide an audience as possible, from the publisher's point of view, they could never make a living if they only sold to the people who are writing the content. If their entire audience for novels was novelists, that would be a problem for them.
Now, contrast this with scholarly publishing, where their entire audience for research or research papers is also producing research papers. That's a big difference. So what I would say is that scholarly publishing has been in the business of user-generated content forever; whereas other media industries have not. We have always had this bizarre situation where our reading audience is also our producing audience. Almost, not quite, because there are an awful lot of faculty members who no longer do research, they just teach. But by and large, you've got a far higher percentage of your audience also being people who produce content.
There's a classic system that's used for analyzing the competitiveness of any particular industry, and it's a guy named Michael Porter who I think teaches at Harvard Business School. He has this concept called the Five Forces Analysis, where if you look at an industry and you look at these -- the five forces in an industry being the bargaining power of suppliers, the bargaining power of customers, threat of new entrants and threat of substitute products and then the competitive rivalry within the industry -- if you actually look at that and you look at the scholarly publishing industry, you realize that the suppliers, substitute products, new entrants and power of customers, all of these people are the same people in the scholarly industry. Any faculty member can go out there and say, you know what, I want to create a new journal, or we're going to try and create a substitute product, we're going to create this open access archive. They are also the suppliers of the content that the publishers are publishing. Effectively you've taken any traditional industry's pretty distinct entities and you've mashed them all together. It's no wonder it's kind of hard to figure out this industry. It's very strange in some ways.
So I think our industry, scholarly and professional publishing, has always been in the user-generated content business. Now the issue I think I was talking to Jon about, which I think is interesting, is that we've always -- and this has been a constraint of physical printing -- only wanted to invest the money in disseminating the thing that had the highest level of authority. Because you had to print this up, because you had to mail it out to all these places, you wanted to make sure that whatever you were printing was super-super highly reliable, that it had gone through an amazing process of quality control. Because it was very difficult to retract it, it was very difficult to correct it, it was very time-consuming and expensive to do all of those things.
Now, in the electronic world that changes a bit. All of a sudden if something is wrong it can be corrected pretty quickly, it can be clarified pretty quickly, and it's not that expensive to disseminate. So a lot of the rationale behind original dissemination strategies has probably disappeared, but I don't think our industry has quite adapted yet. Researchers haven't adapted yet either, so it's not just publishers and librarians. If it doesn't cost you that much to put out an idea that isn't completely formed, but that idea still might be useful to other people, then put it out. That's fine. But what I think needs to be done then is that we have to start thinking about different gradations of trustworthiness of the content that we're putting out there.
So some scientist's musings on their blog should probably be treated differently from a working paper, and that in turn should be treated differently from a paper that's been submitted to a journal, and that in turn should be treated differently from a paper that's been accepted and published by a journal, because each goes through a different layer of authority checking, trustworthiness checking.
And then likewise, even after that, an article that's been published by a journal should probably be treated differently from an article that has been published by a journal and that has been extensively commented on by lots of other people publishing. Either commented on by other articles through citations, or commented on by other scientists through less formal means like blogs or Wikis.
So I think one of the big things that the publishing industry has to do is to figure out, all right, there is demand for these different levels of trustworthy information, how are we going to supply it and make it clear what the relationship is between them, so that we don't treat everything as having the same degree of trustworthiness. Yet we don't stymie communication because we don't want to put something out there until it's absolutely been through every process imaginable.
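
[One more aside from me: the gradations Geoffrey describes could even be made explicit and machine-readable, so that tools could weight content instead of treating everything alike. The labels and numbers below are placeholders of my own, just to show the idea.]

    # Placeholder gradations of trustworthiness, from informal musing to
    # published-and-discussed work (the numeric values are arbitrary).
    TRUST_LEVEL = {
        "blog musing":                      1,
        "working paper":                    2,
        "submitted manuscript":             3,
        "published journal article":        4,
        "published article, widely cited":  5,
    }

    def more_trustworthy(a, b):
        """Return whichever content type carries the higher gradation."""
        return a if TRUST_LEVEL[a] >= TRUST_LEVEL[b] else b

    print(more_trustworthy("working paper", "blog musing"))   # -> working paper
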
So how can librarians and publishers create stuff that has legitimacy? A lot of it is just building up reputation. Librarians and publishers -- we're in a bit of a circular argument. Trust has a time component to it. Somebody who has been trustworthy for ten minutes -- If you have two people, one of whom you've known for twenty years and they've been trustworthy that twenty years and you compare them to somebody you've known for three days and they've been trustworthy for three days, you probably have more trust in the person who's been around for twenty years than you do in the person you've known for three days. So one of the things that I think that librarians and publishers have to make clearer is their track record in trustworthiness. And they have to be very careful about making sure that they build a very solid track record and once they've done that they've got to advertise it. They've got to think of some sort of mechanism for making the distinction between a publisher who's been around publishing pretty much reliable information for hundreds of years and somebody who's just jumped into the game and is putting stuff up on their website.
They also have to create other metrics that allow people to evaluate the relative trustworthiness of content. Even right now I think that librarians and publishers -- some librarians and some publishers -- have a degree of trustworthiness that they could exploit. I think that for instance librarians creating -- or even publishers -- saying, all right, so you don't know what to trust in science blogs out there. We're going to create some guides, we're going to create you some tools that allow you to identify what we deem to be trustworthy blogs. Or we're going to have people review Wikipedia articles, and then perhaps the Institute of Physics or something like that will say, these are Wikipedia article entries that we've actually checked and that we think are pretty good. All of these kinds of things would help, all of these things are useful trust metrics that I think both librarians and publishers could start providing. And they will help people focus, when they're looking at websites, say, oh, actually this website has gotten some sort of little semi-endorsement from the Institute of Physics, so I'll treat it with a little more respect than I would some of the others, or something like that. I think there are a lot of things that they can do.

T: We're pretty much winding down now, and I really can't thank you enough for your kindness and generosity. But I'm interested also a little more in the practical aspects of blogging, for instance, as a means of e-publishing, since I've been involved in this experience. I had never done a blog before this one. Of course if I'd known what I was getting myself into I never would have done it; but at the same time I'm so glad I did. Now that I have done it, it's given me great satisfaction and it's also given me insight into -- for instance the first few days I was putting tags in, and then afterwards it became such a monumental effort that I stopped putting tags in. Now that I'm finished I do want to go back and add tags, so that people can search through my blog.
I'm also interested in exploring things like, at what point do you think it's acceptable for people to make money with ads off of their blogs. I'm thinking about these things myself. I'm asking myself that question. Once my blog started getting some hits -- fellow students, professors, family and friends -- I got the popup thing from Google that said, make your blog make money for you. At first I thought, my goodness, they have to turn everything into a commercial venture. But then after a while I was putting so much work into it that I thought, gee, maybe I should be making money off of this, because it's just so much work. But all these considerations are my reflections on what this experience has been like for me. However, I have also come across some technological challenges which I find important to think about and talk about, and they tie in with what you're doing at CrossRef. It became clear quite quickly, to my own professor and to me, that my blog was going to be kind of a one-stop-shopping resource for my fellow students in the class when deciding what to do their papers on, because they could go there and see links to every single lecture we've had in the two weeks leading up to the conference. All the speakers gave their PowerPoints and their PDFs to Andy Dawson and he put them on the UCL website and I linked to them in my blog. So if somebody just read my blog they would get every possible link from the summer school, with the addition of links to everything that was mentioned -- every company and company website, and even concepts, people, and so forth.
I'm very interested in DOI's because I can't quite understand what it is about DOI's that makes them persistent. Does that mean, to create a DOI, you yourself have to have a place where you permanently keep the things that are being linked to, the papers or whatever, that is going to permanently reside in one place so their URL never changes? Or how is that actually implemented?

G: Let me start with the amount of work needed to do a blog, because I think you've done something interesting which I think a lot of people do, myself included, when they start a blog. And it's inevitably the way to stop blogging. And that is that what we do is we have great ideas about big things we want to blog about, and they're too big. It starts becoming a real writing project. And then it becomes so much work that we abandon it. I think a lot of us are not used to the notion that -- The most successful bloggers that you see out there, I think are really good at -- they're far more comfortable putting out half-thought-out ideas in a very informal style.

T: And of course I couldn't bring myself to push the "publish" button until I had read it over a hundred times.

G: Exactly. And that's a cultural difference, and one that's a really hard thing for people who are used to writing in that way to get over. I keep trying to force myself, every time I think of writing something for the blog, I think, I've got this long thing I want to talk about. And the truth is that if I just broke it down into lots of short little entries, and if I stopped obsessing about the wording and phraseology and all of that stuff, I'd be able to post. And the people who I know who have really gotten over that and have adopted a far less formal style and are far happier just posting short things and then linking them together later, they turn out to be the most successful bloggers. So my advice, and it's advice that I wish I followed myself, would be: get less formal, post shorter things.

T: Of course I give myself this advice every day. It's just hard to do it.

G: Then there's the DOI question. DOI's are actually pretty easy to deal with, and unfortunately there isn't much technology magic behind it. Let's just start with the problem, which is that web links break. Linkrot is a huge problem. And even when the web first started out, we all realized that linkrot was a big problem. The simple reason for this is that the strength of the web is also its weakness. It's totally distributed. One web server doesn't have to know of the existence of another web server. If you host a web server and I point to it, your web server doesn't have to know that. It doesn't have to approve my creating a link to your server, it doesn't have to do any of that stuff. That's really powerful and it has all sorts of scalability aspects to it that have contributed to its success. The problem with it is that it also means that if I link to something on your site and you move the thing that I linked to, the link will break. Or if you decide to change sites, the link will break. And there's no way for you to know that I'm linking to your content so that you can inform me, you know what, I'm moving this stuff, so you've got to update all your information.
So this is the fundamental structural problem of the web, and publishers recognized very early on that this was going to be a problem, particularly for them, because if they wanted to create an electronic environment that included electronic citations that would allow you to follow the link to the source material, they didn't want that stuff breaking, because citations are the building blocks of scholarship.
So they thought about this and they realized that really the only mechanism that they could build was to create an organization where publishers who were serious about maintaining citation links could join, and in joining this organization they promised to do some things. They effectively are saying, we will adhere to certain terms: we will submit unique identifiers for all of our content, and when people use these unique identifiers they will be able to locate our content, no matter where it is. But there is no real technical magic behind it.

T: Does that mean that the DOI is actually a miniature searcher, that searches for it wherever it is?

G: No, it's not a searcher: it's a pointer. All it is, is a pointer. The concept of a pointer is -- are you a computer programmer of any sort?

T: No, but I have a basic understanding of some programming concepts.

G: Think of a pointer as, if you have a post office box, that's a pointer. You can say to people send mail to my post office box, and it doesn't matter where you physically live. You can always get your mail but it's going to this post office box instead. The post office box turns into a pointer for you. Anyone can send mail to that post office box, they don't have to know where you physically live. They'll know you'll get the mail. A DOI is a very similar concept. We're saying, when you cite something, don't cite the location of the thing, cite this number instead. And this number, or this string -- it's not really a number -- this identifier, when you cite this identifier, what we will do is we will go look up the most recent physical location of that place, and then take you there.

T: And is that string embedded in the object?

G: The DOI, that string you see, is just an identifier. You click on that and it passes that identifier to a website that looks at the identifier and says, okay, someone is trying to link to this, where does it live now? And it returns the URI or the place where that content lives currently.

T: And to discover where it lives currently, is the string also embedded in the object itself? I mean, how do you find it?

G: No, it's not. So all you're doing, you see the DOI. The DOI is just an identifier that can be assigned by the publisher. What CrossRef keeps is a huge database that maps a DOI to a URI, and if the URI changes it's up to the publisher to tell us that the URI has changed, and then they can update the URI in our table and we'll continue to find the content.
What this means from a practical point of view is, let's say you go out and you cite three articles that are published in Wiley-Blackwell journals. Then Wiley-Blackwell decides to sell two of those journals and then they change where they're posting the third journal. What they will do is they will send updated information about where those DOI's point to, to us at CrossRef. And you don't have to worry about a thing, because you've cited the DOI's instead of the URI's. So when somebody clicks on those DOI's they'll come to CrossRef and say, okay, now where are these located, because they're not at Wiley-Blackwell anymore. And we'll tell them where they're located now and we'll resolve to where they're located now.
The distinctly un-exotic bit about it is that persistence is not -- we don't have some magic technical solution to persistence. Persistence is a social construct. We are a membership organization and in joining our members are agreeing to adhere to certain principles, one of which is that they will always update where things are, so that DOI's always point to the current location. If they don't do that, we have ways we can find them, we can do all sorts of stuff to try and get them to adhere to the principles behind CrossRef. So it's very much just an organizational mechanism for persisting citation links. And the problem is, a lot of people think, well isn't there a technical solution for this? And the answer is, there isn't a technical solution with an architecture like the Web. If we had a hypertext system where everything was centralized and every document knew about every other document, then you could create a technical means for making sure that links never broke. So if you read early hypertext pioneers like Ted Nelson, who had this concept, his Xanadu project and all of these things -- These are early hypertext systems where everything was controlled fairly centrally, and therefore they could do things like make sure links never broke, and make sure links were always bidirectional and not unidirectional. The Web architecture doesn't support that easily, so we had to create a social construct that allows us to preserve persistent citation links.
So it is abstruse. But the simple way to put it is that we fight linkrot. And we make sure that citation links, which are very important, don't break. And that's one thing we're doing. We're going to be branching out and providing other kinds of services like that.
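[A final aside from me: the "pointer, not searcher" idea boils down to a lookup table that the publisher keeps up to date. The DOI's and URLs below are made up, and CrossRef's real system is of course far more involved, but this is the essential mechanism as I understand it from Geoffrey's explanation.]

    # Toy DOI resolver: a table mapping a stable identifier to the content's
    # current location. Persistence comes from members updating the table,
    # not from any technical magic. (Identifiers and URLs are invented.)
    RESOLVER = {
        "10.1000/example.123": "http://publisher-a.example.com/articles/123",
    }

    def resolve(doi):
        """Follow the pointer: return wherever the identifier currently points."""
        return RESOLVER.get(doi)        # a real resolver would redirect the browser

    def update_location(doi, new_url):
        """What a publisher does when content moves: fix the pointer, not the citations."""
        RESOLVER[doi] = new_url

    print(resolve("10.1000/example.123"))
    # The journal is sold or re-hosted; the publisher deposits the new location:
    update_location("10.1000/example.123", "http://publisher-b.example.com/content/123")
    print(resolve("10.1000/example.123"))   # same DOI, new home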

T: Well, that's wonderful. I think it's a very valuable service. And it also ties in with the concept of trust, because if somebody goes to your website looking for content, and clicks on the links and they don't go anyplace, that erodes their trust instantaneously. Even if it's not something as crucial as following a citation, even if it's more banal or mundane, still, when you click on links and they don't go anyplace, that immediately lowers the degree of respect you have for whatever resource you're using.

G: And this is the root of the conversation that I had with Jon Udell, where he's saying that for a long time, really, only scholarly authors were really concerned about citations. Now all of a sudden bloggers everywhere are concerned about citations. Jon Udell, part of his professional life is his blog, and if he moves from one organization to another -- for example he recently moved to Microsoft -- and he wants to take his content with him, his URI is going to change and all the links to his content are going to break. And that's not acceptable. And in your case, you're blogging at turtleinlondon.blogspot.com. You started that website and named it that largely because it started off because you were going on this course in London and you wanted to blog about it. All of this material that you've recorded here might have a more permanent value, and you might decide that you want to move off of blogspot, or you might decide that you want to make this part of a more general site on the publishing industry. So you might start another website and you might want to move all of this content there. As soon as you do that, anybody who's linked to the content, all of those links are going to break. So Jon is interested in trying to figure out whether there's a way that the concept behind CrossRef can be extended into the wider web, for people who are concerned about links to their content not breaking.

T: Not to mention the fact that I have no control over what the links within my blog do. In other words, if I link to your paper and then you move your paper, how am I going to know that?

G: Absolutely. And I agree with him. I think that this issue of persistence of links is really big, and we have to start thinking about how we can provide mechanisms for people to ensure it. The problem, again, is that there is no technology magic that can be applied to it. So it's going to probably require an organization like CrossRef providing a similar service for a wider audience. And immediately you get into some difficult questions there. For instance, right now we're a membership organization. If publisher X joins CrossRef and then doesn't update their URI's, as I said, we can find them and we can sort of enforce norms of behavior once they've joined our organization. But how do you do that if you have millions of individuals? You can't enforce the same norms of behavior, so you probably have to create a different kind of mechanism.

T: Well, I have one final question, which you can choose to cut very short if you like. When you were at the conference, you talked about vertical and horizontal trust. And what we've been discussing this morning, in the questions of how do we spread the word, how do we get things out, a lot of the things you've described sounded to me, and correct me if I'm wrong, horizontal. In other words from one to another to another, they have been translating laterally from one person to another. Do you think that in these more "informal" technologies, as you said to Jon Udell, like blogs and Wikis and so forth, there is any room for a more vertical structure, or do you think that the horizontal structure of electronic word of mouth, as it were, is the best way to disseminate this information? Please tell me your thoughts on that.

G: The horizontal / vertical, global / local axis is this trust model that I first read about in the book Trust from Socrates to Spin by Kieron O'Hara. The short answer is that I don't think that horizontal/local trust works. It just doesn't work. We have so much evidence of it. We have spam, we have phishing, we have people stealing people's content.

T: It's too vulnerable to attack, you're saying.

G: It's too vulnerable. On the other hand, I also don't think that vertical/global will work anymore, particularly not on a distributed structure like the Internet. So the short answer, I guess, is that I think that the promise of a lot of the social tools that we see, and particularly the promise of trust metrics, is that we might be able to create a mechanism that mitigates, that allows you to transcend the dichotomy between local and global and vertical and horizontal. And say, you know what, we can overcome the limits of these in some way.
For instance, again, Kieron O'Hara, when he talks about local trust he talks about trust that's established through some sort of personal knowledge. So I trust this person because they're my friend or my neighbor, they're related to me. That kind of trust, in the analog world, in meatspace, doesn't scale very well. It has zero geographic scalability. But all of a sudden, with the Internet, I can, through long acquaintance with someone online, develop a trust profile of them. It would have been very, very difficult for me -- I could have done it through letter writing or some other means in the old days, but now all of a sudden my local trust network ironically is no longer geographically constrained as it was in the old days.
So that's one example of how the Internet can allow you to overcome this obstacle.
And global trust, which is trust that has a transitive quality -- that is, I trust an auditing company and therefore I trust anybody that they audit -- the dangers, the intrinsic, systemic risks of failure in that can also be mitigated using social networking tools. So I think that there are a lot of interesting developments out there that promise to breach the divide between Internet trust, which is local and horizontal, and scholarly trust, which is vertical and global.

= = = = = # # # # # = = = = =

Thus ends my interview with Geoffrey Bilder. He is an amazing person. Very generous with his time and expertise, and truly passionate about what he does.

I would like to close with a few of my personal thoughts on this experience of blogging for the very first time. I really would like to close with some questions, which are almost always more important and more interesting than the answers.

1. If I had known what I was getting into, I never would have embarked on this adventure.

2. Since I didn't know what I was getting into, I did. Once I realized how hard it was going to be, it was too late, and I had no choice but to forge ahead. Having said that, I'm very glad that I could not foresee the vastness of this project, because it has been one of the most gratifying experiences as a semi-professional writer that I have ever had.

3. I agree wholeheartedly with Geoffrey about blogging: if you want to be successful, you have to feel comfortable publishing incomplete thoughts, poorly phrased sentences, with a few typos here and there. If I ever start another blog, I will try very hard to take his and my own advice on this topic.

4. My interests over the course of my degree program have gradually become focused on cataloging and metadata, and yet the first "casualty" of my blog was the metadata tags, the very objects I should be focusing on the most. If someone on the Web were to search for blogs that talk about e-publishing and all things related to it, without the tags they might not find my blog. The metadata, as we've heard many times before, can make all the difference between discovery and invisibility.

5. It is very interesting, and important I think, to highlight something that Geoffrey said in the course of this interview: a) in the business of scholarly publishing, the readers and the writers are all part of the same community, unlike almost all other types of publishing, in which a small number of authors write for a large audience of non-writers. b) Because the writers and the readers are one and the same, they are all engaged in producing user-generated content (whether print or electronic).
6. What this means is that these methods of building trust become crucial in online scholarly communication. With care and attention to maintaining a strong reputation, both publishers and librarians can make valuable contributions to the scholarly community in the difficult process of disseminating scholarship and untangling the mass of output, which ranges from informal ramblings to peer-reviewed, published articles and monographs.

7. The open questions that remain are those of translating into practical measures the guidelines that have emerged from this interview:

  • how to create trust metrics that will give weight and authority to the words of librarians and publishers?
  • which tools are most suited to create guides and "maps" for scholars and students to wade through the volume of material that is available?

8. It is very encouraging and exciting, however, to find widespread agreement that social software such as bookmarking tools has great value for sorting through scholarly output and distinguishing between the various versions of papers and articles, which can save vast amounts of time for those who have to read them. This means there is great potential for librarians to take on this type of task. The librarian of the 21st century can become a meta-librarian, continuing to uphold the old values of the profession and shepherding them, as it were, into the future.

And now, for the last time signing off on this London blog, to all my friends and loved ones, good night!

Literature Survey

Over the course of this summer school experience, the students were supposed to choose a topic for their personal project.

Because I created a blog, thereby experiencing e-publishing as an author, my professor, Tula Giannini, suggested that I write something on user-generated e-content. I embraced this topic, but it was still too broad for a brief survey. What remained was to choose an "angle," a way to frame my study so that it would have cohesion and succinctness.

It was also strongly recommended that our projects be as interactive as possible, stemming from our real, lived experience in London, rather than being traditional papers based on readings of secondary sources.

With this premise, I decided to contact Geoffrey Bilder of CrossRef. In his talk on the second day of the conference, Geoffrey addressed many of the issues I am interested in pursuing, even beyond the limits of this class. So I wrote him an email and he very graciously agreed to help me. We corresponded a few times; he suggested some starting points for my work and directed me to a few articles, books, websites, a podcast, etc.

In this survey I will list and link to the readings I have perused, adding some brief comments, and by the end I will have narrowed down my topic so that it is manageable. As I go along I will be forming and collecting my questions. The final part of the project is going to be an extended interview with Geoffrey, which we will conduct over the phone (he will call me from the UK using Skype!!!), and at the end of the interview I will add some of my own thoughts and conclusions.

= = = = = # # # # # = = = = =

When I wrote to Geoffrey I gave him a brief overview of what I was interested in, and in an exchange that spanned a half-dozen e-mails, he expanded on some of the topics he had touched upon in his presentation at the conference, helping me arrive at the crux of what we would discuss in more detail in the interview:

  • Trust
  • Trust metric(s)
  • blogs
  • "My Brain" subscription
  • DOI's and how linkrot erodes authority and trust
  • The role of librarians and publishers through their use of user-generated e-content
  • Attack resistance
  • The intersection/intertwining of user-generated, informal e-publishing and scholarly communication
  • Vertical / horizontal trust in the context of this discussion

Here are some sources that Geoffrey suggested I read as a starting point, with my comments, as well as other sources I found on my own. Geoffrey suggested that I use, as a search term, the expression "trust metric".

An earlier version of Geoffrey's presentation at the Bloomsbury conference can be found at:

The Journal of Electronic Publishing. [JEP]

I enjoyed this article especially, because for one thing it's nice to read a fleshed out article as opposed to a PowerPoint presentation, but also because the concept of local/global trust and horizontal/vertical trust became much clearer to me through this second reading. The best part of this article, however, is that Geoffrey goes into great detail in discussing the various types of "social software":


And he not only discusses the basic principles according to which they work, but he goes into the methods they are adopting to transcend the local/global, horizontal/vertical trust issues. I strongly encourage anyone who is interested in the topics of social software and trust to read this article.

Geoffrey was also interviewed by Jon Udell for IT Conversations (Jon Udell's Interviews with Innovators), in a podcast on the topic of Winning the Battle Against Linkrot, in which he talks about CrossRef and how important it is to use DOIs that protect links from breaking if the original location of the files is moved. Scholarship relies on citations, and citations are increasingly expressed through links to the articles that are being cited. If you read a scholarly article online, attempt to follow a link to a cited article, and find that it leads nowhere, then no matter how much you try to hold onto your trust in the original article or its author, that trust is somewhat eroded. CrossRef's mission is to keep these links alive no matter what.
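
To make the mechanics concrete, here is a minimal sketch in Python of what DOI-based linking buys you (this is my own illustration, assuming the requests library and using a placeholder DOI, not a real citation): the citation carries the persistent identifier, and the article's current location is looked up at read time through the public doi.org resolver.

    import requests

    def resolve_doi(doi: str) -> str:
        """Ask the public doi.org resolver where this DOI currently points."""
        response = requests.head(f"https://doi.org/{doi}", allow_redirects=False)
        response.raise_for_status()
        # The resolver answers with an HTTP redirect to the article's present home;
        # if the publisher moves the files, only this registered URL needs updating,
        # and every citation that uses the DOI keeps working.
        return response.headers["Location"]

    print(resolve_doi("10.1000/xyz123"))  # placeholder DOI, for illustration only
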
In the course of the interview, Geoffrey also revisits the theme of incunabula, an analogy between the time of transition from manuscript books to printed books (Gutenberg, mid-1400s) and our current transition from print to online. In the time of Gutenberg, people were so wedded to the idea that only manuscript books were "real" and trustworthy that the first printed books were decorated and illuminated by hand to make them look more like manuscripts; these earliest printed books are the ones we call incunabula. Today we are trying to make our online documents look like print, so in a way we are recreating the incunabula of the internet age. The point Geoffrey was trying to make is that it is time to move on to a whole new way of displaying and interacting with our content, and to let go of print in the online environment.

In the course of his talk at the Bloomsbury conference, Geoffrey mentioned a book by Kieron O'Hara, called Trust: From Socrates to Spin. This book is the origin of his discussion of the horizontal/vertical and local/global trust axes. To summarize: horizontal/local trust is trust that is based on acquaintance. I know you, I trust you, I trust the validity of your work/research. If you trust another friend of yours, I might be willing to extend a certain amount of trust to him as well. This kind of trust is horizontal in the sense that it is among peers and cannot therefore be imposed or coerced. It is transitive, but only to an extent: I can trust your friend, but I would not be likely to trust a friend of a friend of a friend. It can only extend so far. Vertical/global trust is the kind of trust that exists in the scholarly world, handed down by higher authorities, and it can be enforced. If you decide to adhere to a certain school, a certain society, a certain "club", you have to go along with their set of beliefs. If you don't, you can be expelled, arrested, kicked out, etc.

Geoffrey's overall point is that the world of scholarly communication could stand to take a lesson from the practices of some of the virtual communities that have grown up around the social software we have mentioned above, whose trust metrics are allowing them to transcend the whole notion of local vs. global and horizontal vs. vertical; the key to success is finding a way to automate this building of trust.

An interesting example is Outfoxed, which combines the Firefox browser with a plugin and a simple server whose object is to trade trust information via RSS. Users register with the server and can rate the trustworthiness of any online resource. When users go to a site that holds content that has been rated by one or more other users, they see the trust rating the content was given, and can therefore decide right away whether to pursue the content and read it, or to discard it and move on. Geoffrey believes that this system has great potential, and I agree with him. He suggests how the system could be taken a step further, with the trust rating being embedded in Google search results, so that when a screen full of search results comes up, each entry would also display a button next to it with its "trusted" or "not trusted" rating. The members of any given network of users could therefore work together to avoid duplication of effort. It would not be necessary for everyone to vet an article or book, because one or a few members of the community would have done it for them. You could search for research you are interested in and find results that have been preemptively vetted and screened for you, saving you immense amounts of time and effort. You could then proceed to read only the content that has been deemed trustworthy and valid. I found this part of Geoffrey's theory to be fascinating.
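
As a thought experiment, the core of such a system is small. The sketch below is my own hypothetical illustration in Python (not Outfoxed's actual code, and the URLs are placeholders): a shared store of ratings contributed by people in your network, consulted before you decide whether a resource is worth your time.

    from statistics import mean

    # Ratings collected from members of my network: url -> list of scores in [-1, 1].
    network_ratings = {
        "https://example.org/preprint-123": [1, 1, 0.5],
        "https://example.org/spam-page": [-1, -1],
    }

    def verdict(url: str) -> str:
        """Summarize my network's opinion of a resource before I spend time on it."""
        scores = network_ratings.get(url)
        if not scores:
            return "unrated"  # nobody in my network has vetted this yet
        return "trusted" if mean(scores) > 0 else "not trusted"

    print(verdict("https://example.org/preprint-123"))  # trusted
    print(verdict("https://example.org/spam-page"))     # not trusted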

In my research, I was interested in highlighting how these social software applications can help librarians and publishers get information out to researchers, sparing them unnecessary "slogging" through mountains of material in search of substantive, authoritative work, and in bringing to light the ways in which content that appears in these "informal" settings (blogs, wikis, etc.) can itself be deemed trustworthy.

So I followed Geoffrey's advice and set out to learn about TRUST METRICS. The simplest definition of a trust metric is, as this Wikipedia entry states, "a measure of how a member of a group is trusted by the other members." An important concept introduced alongside the trust metric is that of Attack Resistance, which is a measure of a trust metric's ability to ward off abusers of the network, such as spammers.
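
To make these two definitions more tangible, here is a deliberately simple, hypothetical sketch in Python (the names and numbers are invented): my trust in a stranger is the strongest chain of trust I can reach through people I already trust, discounted at every hop. Capping the chain length and letting trust only decay is a crude form of attack resistance, because an abuser who creates thousands of accounts that certify one another cannot amplify the trust flowing in from the legitimate network.

    # Who directly trusts whom, and how much (0..1). Invented toy data.
    trust_edges = {
        "me": {"alice": 0.9, "bob": 0.6},
        "alice": {"carol": 0.8},
        "carol": {"dave": 0.7},
    }

    def trust(source: str, target: str, decay: float = 0.5, max_hops: int = 3) -> float:
        """Best trust reachable from source to target, discounted at each hop."""
        if source == target:
            return 1.0
        if max_hops == 0:
            return 0.0  # chains longer than max_hops contribute nothing
        best = 0.0
        for friend, weight in trust_edges.get(source, {}).items():
            best = max(best, weight * decay * trust(friend, target, decay, max_hops - 1))
        return best

    print(round(trust("me", "carol"), 3))  # reachable through alice
    print(round(trust("me", "dave"), 3))   # fainter, one hop further away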

I also found another interesting article on trust metrics, Trust Metrics for Social Networks, on Facebook. It talks about applying a trust metric to social networks in business, to help the members of the network apply the principles of Wikinomics in business settings. Another wiki on trust metrics outlines the basic principles of a network in a virtual community, discusses the various issues of trust and attack resistance, and covers local/global and objective/subjective trust. It links to all the articles I have already read, so the circle is now closed, and I will move on to the interview with Geoffrey and my conclusions.

Tuesday, July 10, 2007

London/New York, day 15, July 1, 2007

Sunday: we are scheduled to fly at 4:00 p.m., but because of the terrorist threats and attacks here in England, security will be heightened, so we must leave early.

Our last breakfast in the dorm near ours, where we have been eating every day for two weeks, was very nice. The kitchen staff have been very friendly and have made us feel at home, customizing our eggs for us every day and honoring our requests for porridge on days when it was not scheduled. They all gave us a warm send-off.

Then back to the dorm to pack, and off to the tube (we've become quite expert at navigating the subway) to Heathrow. Because of the security issues, it took us over an hour just to get to the check-in counter, another hour to get through security, and another half hour to get through a second security check that was just for our shoes. Why we couldn't do the shoes together with the belt buckles remains a mystery...

We had barely enough time to rush to the gate, run onto the plane and get settled, only to wait on the runway for an hour so that the paperwork and the actual number of passengers could be reconciled. It didn't really bother us. The plane was much more comfortable than the one we flew out on. De. and I decided that we like Virgin Atlantic very much and would definitely fly with them again. Their planes have names; I like that.

Just as the flight to London had been, this one was also completely uneventful, just the way we like them. Very smooth, no turbulence, nice flight attendants, a lot of entertainment choices, personalized for each passenger (another winning feature).

I read the entire book of 84 Charing Cross Road, and cried my eyes out. Very nice.

We arrived in New York right around dinner time and De's husband picked us up in the car. They drove me to my new house, where I slept for the very first time. Very strange, but at the same time strangely familiar and comfortable.

A few days have gone by now, and the house is more and more like a home. The ground floor is not yet finished, so I don't have a kitchen yet, but soon enough...

Stay tuned for my next posting, which will be the literature review for my paper on user-generated e-published content (like this blog, for instance). I am still formulating exactly which angle to explore, but I expect that as I read current journal articles it will crystallize. Writing this blog has given me much food for thought, of course.

Until then, my dear friends and family, goodnight and much love to all.

London, Day 14, June 30, 2007

Saturday, one final day of nothing but fun, and tomorrow, home again!

This was a great day, which De and I spent together by ourselves, with nothing on the agenda but what we truly wanted to do, slowly, taking our time and having fun.

First we had breakfast in the dorm cafeteria, as usual, and fortified ourselves for the walking to come. Then, finally, I got to do what I really, really wanted badly to do, which was head to Charing Cross Road and visit the bookstores, but mainly Murder One, the prime crime bookstore in London. I bought 17 books, and felt much better right away. All the aches and pains of two solid weeks of lectures just melted away, book by book. Ahhhhh, the satisfaction. Among the books I bought was the classic 84 Charing Cross Road, which I intend to start, at least, on the plane to New York tomorrow.

When we came out of the bookstore, we realized that there were several theaters on that street, as well as bookstores. We looked at the posters and decided to see if they had tickets for this evening for The Letter, a play based on a story by Somerset Maugham. They did, and we bought them. The play begins at eight, so we have to get going if we're going to get anything else done during the day.

After the bookstore, we decided to go to the Tate Britain. We took the tube for a little while, then got off and started walking. It would have been a very pleasant walk were it not for the fact that it started pouring. This was the only time, really, that we got absolutely soaked to the underwear. I wouldn't have minded so much if we hadn't had so much rain for the whole two weeks, and we then had to be inside an air-conditioned museum in wet clothes. But we didn't let anything dampen our spirits. We pressed on, got to the Tate, took off our outer garments and had the coat check people hang them to dry, and proceeded to walk around the museum.

De, in her infinite wisdom and preparedness, had reserved lunch in the restaurant. We went first to see the Turner exhibition, which was really great. It was so nice to be able to walk around and enjoy something without having to take notes and write all about it. I will say only that we filled our eyes with beauty, read lots of captions, stood around, walked around, and just soaked it all up.
Then we headed to the restaurant and had a very nice and fairly light lunch. The best part, though, was yet to come. Those who know me know that I came to England with one very specific goal: to have a real English tea with scones and clotted cream and jam. Well, until today this supposedly simple pleasure had been denied me. But, the ever resourceful De had noticed that they served tea at the Tate.

So, after lunch we wandered around some more, saw some more art, and finally went back to the restaurant where the waitresses were kind enough to give me two scones instead of one, since I had come from so far away to enjoy this mid-afternoon treat. And what a treat it was!!! It was the tea of my dreams. Clotted cream, jam, delicious scones, and a nice pot of tea just for me, which they refilled as well! In a word, HEAVEN.

After the museum, we went back to the dorm to rest and freshen up briefly before heading out again to the theater. Back to Charing Cross Road we went, and watched an enjoyable performance of The Letter. A woman kills a man, her husband believes her implicitly when she tells him the victim had tried to rape her, an investigation and a trial ensue. She is acquitted. And only after it is all over, because of a letter with which she is blackmailed, does the whole sordid truth come to light. The man she killed had been her lover for years, but had discarded her in favor of a Chinese woman who was (dare I say it?) OLDER than he!!! Well, as we all know, hell hath no fury... and so, his fate was sealed. The cuckolded husband, however, takes her back, and her penance is that she will be a good and faithful wife to the end, though still in love with the man she killed. A good yarn, all in all.

After the theater, back to the dorm and to bed. For tomorrow, we fly home, home to New York, where my new house awaits, and where I will sleep for the very first time!!!

To all my friends and loved ones, goodnight!

Monday, July 9, 2007

London, Day 13, June 29, 2007

Friday, the second day of the "1st Bloomsbury E-Publishing Conference". How lucky we are! It's the last day, the quality of the speakers is truly superior. We are going out with a bang, for sure. Yesterday the focus was on e-books, and today it was on e-journals.

The first speaker of the day was Dr. David Prosser, of SPARC Europe, whose talk was titled: The fourth driver of change -- Everything should be open. The acronym SPARC stands for Scholarly Publishing & Academic Resources Coalition, and the UK coalition was formed in 2002 after the success of SPARC US launched by the ARL.
The first half of David's presentation was devoted to outlining several "mission statements" of various international and national organizations: the Lisbon Agenda brought together the heads of the EU states in 2000, where they set the goal of making the EU the most competitive knowledge-driven economy by 2010; the strategy to be employed was a transition to a knowledge-based economy. As for the UK, it was stated that "we want the UK to be a key knowledge hub in the global economy, with a reputation ... for turning that knowledge into new and profitable products and services."
He noted that with increased spending on R&D there arises a need for increased assessment of Educational Institutions, Researchers, etc; the need for more ways of measuring citation statistics, who is citing whom, and a desire to streamline this process.
At some point it became apparent that in order for scientific knowledge to progress, there must be a technologically advanced way for scientists to share research, results, and resources. There is a need for integration, federation, and information analysis, and for the ability to access and control remote experimental equipment. This is his definition of E-Science.
This is where Institutional Repositories come in. They will increasingly become part of the infrastructure that allows E-Science to take place across all boundaries.
In 2004, the OECD Committee for Scientific and Technological Policy agreed that "optimum international exchange of data... contributes decisively to the advancement of scientific research and innovation." The OECD actively began to promote Open Access, and declared their commitment to "openness, transparency and interoperability." As an example of successful collaboration across geographical, political and economic barriers, he cited the Genome Project, for which several research labs in different countries shared data, allowing the project to progress several times faster, and probably with better results, than if one country had gone it alone.
He also spoke about the MRC's Policy on Data Sharing and Preservation. The MRC believes firmly that the results of publicly funded research should be freely available to anyone, as they are sought and achieved for the common good.
In the traditional publishing setup, there is dissatisfaction at many levels. Authors are unhappy because their work is not sufficiently visible to their peers and because, having given away certain rights for publication, they cannot dispose freely of their own work. Readers, in turn, cannot access all the literature they need.
And here the call for Open Access comes in. As David defines it, Open Access is "the call for free, unrestricted access on the public internet to the literature that scholars give to the world without expectation of payment."
In the context of open access he mentioned the Budapest Open Access Initiative, based on the twin strategies of having scholars deposit their refereed journal articles in open archives and having open access journals charging no subscription or access fees.
He described institutional repositories, pointing out, among their usual characteristics, that they can function as full CVs for the researchers themselves. Then he talked about journals, both traditional and open access, and said that the difference between the two is peer review (which we have seen is not strictly true, because many open access journals are peer reviewed).
He went on to talk about the OpenDOAR (Directory of Open Access Repositories), an authoritative directory of open access institutional repositories; strategies for making the transition from the traditional publishing model to the new open door model; the advantages of self-archiving (papers in OA repositories are cited on average twice as often as their counterparts); and of the Berlin Declaration in Support of Open Access, based on the premise that the mission of dissemination of knowledge is only half complete if access to information is not free for everyone.
This was a very dense presentation, and I encourage readers to follow the links.

The second speaker of the day was Geoffrey Bilder, of CrossRef. The topic of his presentation was The fifth driver of change -- The disruptive power of technological advance. Geoffrey is the Director of Strategic Initiatives at CrossRef. Over the past fifteen years, he has acquired experience in technology and how it can be used to support scholarly pursuits, whether they be teaching, researching or communicating among scholars. In the most recent past, before joining CrossRef, he consulted with publishers and librarians on how the emerging social software technologies may affect researchers and how best to use them so they can help in the field of scholarly and professional research. It's obvious that this has become the focus of his own research and work, and his presentation was fascinating. His speaking style was very engaging and I was not able to take many notes because I just wanted to listen and absorb as much as possible, not just of what he was saying, but of the implications of the things he was telling us about.
First he outlined the current situation of the Internet, by showing us the graph of the Gartner Hype Cycle, which describes how the hype around new technologies inflates expectations and encourages the early adopters to purchase them in droves. At the height of the curve is where the sales are high and it's too early for disappointment to have set in -- here, he made us laugh by telling us that this is where the new technology gets on the cover of Wired. Then comes disillusionment, where people discover that whatever the new gadget is, it does not open the doors of Nirvana. After that, comes the long tail, the slow re-adoption by the early adopters who stick with it, and finally there is a long gradual slope of adoption, and if the technology has something to offer it will plateau and become a commonly used item.
Next, Geoffrey outlined the situation among scholars and researchers today. In a nutshell, there is so much information out there that it's simply daunting. People don't want to read, and the more stuff is out there the less time there is to read each article or other piece of information they come across. Blogs of all types are having healthy lives, and apparently more than 120,000 blogs are created every day (I found this unbelievable -- I believe it, but it's a staggering number).
Then he went on to outline how the decline of publishers' value chain has led to the need for a new system of trust. This is the key issue in the world of publishing right now: Trust. What publishers have traditionally furnished is exactly that, trust. The editorial process guarantees that the output of official publishers has a seal of quality that researchers, scholars, students, teachers all rely on for the furthering of their own work.
Internet users are subject to all kinds of disturbances that diminish their trust in the resources they find: spam, viruses, etc. Geoffrey described the way the Internet currently functions as a "trust anti-pattern" which is touted as a non-hierarchical distribution of specialist or scholarly content, while in fact there are hierarchies in place just as there are in other more traditional publishing settings. When the hierarchies are lacking, the system breaks down into a chaotic jumble of information. So automated and human-driven regulatory systems are put in place to restore order, once again establishing a hierarchical structure. On the internet trust tends to build up horizontally, among peers, and at a local level. It is difficult for this kind of trust to scale upwards and outwards. Scholarly trust is handed down from above, as it were, and while it can be extended and become more global, it is also more subject to abuse.
So, how to avoid this trust anti-pattern? The more successful internet ventures on a global scale are the ones that have understood the need to implement trust-creating mechanisms. Geoffrey outlined the various methods that have been implemented by eBay, Amazon, Google, and Slashdot, each of which has introduced ways in which content can be "rated" by various means, allowing higher quality content to gain trust while content of inferior quality is gradually pushed to the bottom. On eBay, buyers rate sellers; Amazon created reviews; Google's method is invisible, but trust is measured in terms of numbers of links. If many people link to a site, it must mean that the site contains trustworthy material, so the site rises in the ranks and appears in search results in a more prominent position. Slashdot is a kind of blog that allows readers to post comments, and according to the kinds of things they write, people gain more or less karma. These trust-establishing mechanisms are called "trust metrics". Trust metrics are limited to the content of each of these sites, of course, so we have to ask ourselves how we can help to create an environment in which serious researchers and scholars can look for and find authoritative content.
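
Geoffrey's point about Google is worth a toy illustration. The following sketch is my own simplification in Python, with an invented three-page mini-web (it is not Google's actual algorithm): each page repeatedly passes its score along its outgoing links, so pages that many other pages point to rise in the ranking.

    # Who links to whom, in a hypothetical mini-web.
    links = {
        "blogA": ["journal", "blogB"],
        "blogB": ["journal"],
        "journal": ["blogA"],
    }

    scores = {page: 1.0 for page in links}
    for _ in range(20):  # a handful of iterations is enough for the scores to settle
        new_scores = {page: 0.15 for page in links}  # small baseline for every page
        for page, outgoing in links.items():
            share = 0.85 * scores[page] / len(outgoing)  # pass most of my score along my links
            for target in outgoing:
                new_scores[target] += share
        scores = new_scores

    # "journal" collects the most incoming votes and ends up ranked first.
    print(sorted(scores.items(), key=lambda kv: -kv[1]))
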
Our role as publishers and distributors of scholarly content is to help researchers know what they should be paying attention to:

  • Blogs: stm, scienceblogs...
  • Wikis - not really a broadcast mechanism
  • Social bookmarking/categorization
  • RSS feeds

I found this part of his talk to be the most interesting, because it's proposing new ways of gathering and disseminating knowledge. I was particularly interested in a concept that he called "subscribing to a person's or a group's brain." Geoffrey himself has a blog, Louche Cannon (by his own admission he hasn't been very good at keeping it up, and I think the most recent post is from March of this year), and if you check it out you will see that he has this great little button that says "My Brain" on it. It is clickable, though it cannot yet interact automatically with browsers, so you cannot use it interactively the way it is intended to be used. It is an OPML file. The idea is to collect in one place one's website, delicious page, flickr page, connotea page, library thing page, citeulike page, and so forth. This way other people can share these resources. I am extremely interested in the social bookmarking/categorizing services like delicious and connotea and I plan to investigate all this further.
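
For the curious, an OPML file of this kind is just a small XML outline listing one person's feeds, so that someone else can subscribe to the whole set in one go. Here is a hypothetical sketch in Python that writes such a file with the standard library; the feed URLs are placeholders, not Geoffrey's real pages.

    import xml.etree.ElementTree as ET

    # One person's "brain": a name for each feed and where it lives (placeholders).
    feeds = {
        "My blog": "https://example.org/blog/rss",
        "My del.icio.us bookmarks": "https://example.org/delicious/rss",
        "My Connotea library": "https://example.org/connotea/rss",
    }

    opml = ET.Element("opml", version="2.0")
    ET.SubElement(ET.SubElement(opml, "head"), "title").text = "My Brain"
    body = ET.SubElement(opml, "body")
    for title, url in feeds.items():
        # Each <outline> entry points a feed reader at one of my feeds.
        ET.SubElement(body, "outline", text=title, type="rss", xmlUrl=url)

    ET.ElementTree(opml).write("my_brain.opml", encoding="utf-8", xml_declaration=True)
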
In Geoffrey's words, links are votes. The more people connect to a site, the more trustworthy it becomes. The implication for social software is that the more high-trust specialists use these tools, the more those specialists become... PUBLISHERS in their own right.
Geoffrey's theory is that the Internet should be used more and more as a database, and he gave us a simple outline of what that database would look like: a grid of rows and columns, where rows are things, columns are attributes, and the cells at the intersections are the things' attributes.
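
A tiny data-structure sketch (my own illustration in Python, with invented entries) shows how literally this grid can be taken: rows are things, columns are attributes, and each cell holds that thing's value for that attribute.

    # Rows are things, columns are attributes, cells are the things' attribute values.
    grid = {
        "article:10.1000/xyz123": {"author": "A. Researcher", "year": 2007, "cited_by": 42},
        "blog:epublishing-london": {"author": "Turtle", "format": "blog", "posts": 15},
    }

    def cell(thing: str, attribute: str):
        """Look up one cell of the grid: a thing's value for an attribute."""
        return grid.get(thing, {}).get(attribute)

    print(cell("blog:epublishing-london", "author"))   # Turtle
    print(cell("article:10.1000/xyz123", "cited_by"))  # 42
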
This talk was perhaps my favorite. I attribute this to a few things:
  1. I have been writing this blog and am therefore becoming keenly aware of the challenges, the implications, and the meaning of "user-generated content" on the internet;
  2. Geoffrey's manner was so lively and engaging that I really felt that this stuff on the internet was dynamic, capable of movement and change;
  3. Publishing interests me very much, and personal initiative also appeals to me.

The third speaker of the day, and the last before lunch, was Dr. Michael Jubb of Research Information Network. The topic of his presentation was : The sixth driver of change -- Changes in scholarly communication. Michael has held a variety of posts in settings both academic and official, and his resume is quite daunting. Most recently he held very lofty positions at the Arts and Humanities Research Board (AHRB), leading its transition to full Research Council status (now known as the AHRC). The RIN, which he joined as director in 2005, has as its goal to help researchers in all fields (STM as well as the humanities and arts) access research information, mainly in the UK. In his presentation Michael outlined how the RIN functions and in doing so also shed light on the way researchers are using and producing new information today.
The two core activities of the RIN are:
  • to act as an observatory that analyzes two kinds of information: a) the trends in the world of information services for researchers in the UK, and b) how researchers are using these services, and what obstacles they face;
  • to develop strategic advice and guidance to key stakeholders in the research world on ways to develop policy frameworks for information services that might be developed in the future.
The RIN was founded on the basis of a realization that effective information services play a big part in research. Michael pointed out that the UK has 3% of the world's researchers and they produce 8% of the world's scholarly articles. They are second to the US in certain areas, first in others (among which is productivity).
In order to understand how this level of productivity can be maintained, it's necessary to study researchers' behavior as information users, and as creators of information and developments in scholarly communications.

Researchers as INFORMATION USERS:
  • What do they want to find and use?
Resources, articles, expertise, datasets, original text sources, etc.
  • What discovery services do they use?
Ranked discovery services, search engines, specialist engines, colleagues, abstracting and indexing services, citation indexes, libraries, blogs, etc.

The long tail of discovery services: in a graph Michael showed us 221 discovery services and sources, among which the most popular were Google, Google Scholar, PubMed, Web of Science, etc. But then there is a long tail of a huge number of highly specialized sources.

I liked hearing that library catalogs are heavily used by all branches of research, particularly arts and humanities.

The issue (and we have heard this over and over again throughout the course) is the gap between discovery and access.
What researchers want is to be able to transition seamlessly from the citation to the full text of the articles they want. A lot of research is conducted online, and researchers are often frustrated by subscription barriers that prevent them from accessing the full text.
What is also alarming is the researchers' lack of familiarity with Open Access content. Very few use OpenDOAR or other repositories, unless they stumble upon them by accident. Librarians are the most familiar, followed by the natural sciences researchers, trailed by the arts and humanities scholars. This picture makes sense if you consider that the life sciences researchers are the least likely to frequent the library: the increased amount of online research leads to their increased knowledge of open access resources. Libraries subscribe to many databases and full-text journals, so whoever conducts research at the library gets access to all those resources.
When asked which resources are the most useful to them, the answers were overwhelmingly in favor of e-journals (less so for the humanities).

Researchers as INFORMATION CREATORS.
Key outputs are journal articles and data.
There are concerns about data management (a deluge of information); about a lack of clarity as to roles and responsibilities.
Food for thought: He talked about Virtual Research Environments and Communities. Half of researchers and 75% of librarians think that they will revolutionize the field, while the other half of researchers have never heard of them.
Most UK researchers still publish with subscription-based journals, some with hybrids and the smallest number with free, open access journals. When asked whether their institution possessed a repository, most researchers did not know the answer, while most librarians did.
At the end of his talk, Michael summarized by going over what we need to know more about, in order to foster growth and healthy development in research. What we need to know more about: how researchers do their work; what resources they use; the differences in methods and means between different disciplines; and what's going on at the cutting edge, but also in the long tail.

At this point we had our lunch break and enjoyed a little timid sunshine in the garden outside the Garden Room where we had lunch.

The afternoon session began with a presentation by Martin Richardson, the managing director of Oxford Journals. His topic was: Overview -- where are mainstream journal publishers with new models? Martin has spent most of his professional life in academic publishing. Among other positions held at Oxford University Press, he has been the Director of the Oxford English Dictionary (I can hardly imagine anything more wonderful!). He is currently responsible for the publication of over 200 journals. I wish I could interview these people individually, because they're all full of surprises. In a previous incarnation, it seems that Martin edited books on chess (!!!) and also managed a bookshop.

In his presentation, Martin addressed the pros and cons of traditional, subscription-based journal publishing, as well as those of the open access model. But his real focus was on a hybrid model, which was very interesting. OUP has been conducting experiments whose goal is to discover whether Open Access journals will be more widely disseminated than subscription ones. Of course, a successful business model must be financially viable.
He used a specific journal to illustrate how they are transitioning from the traditional model to the new one: Nucleic Acids Research. This journal used to be subscription only, and therefore a large percentage of its income was generated by subscriptions. Since the Open Access option was introduced in 2005, almost 50% of the income has come from authors. There is a rate chart in the slides of the presentation, showing that there is a member rate and a non-member rate, and that there are waivers for developing countries and authors with financial difficulties. This model seems to be working.
Food for thought: the addition of open access content does not seem to have made the number of subscription sales decline in any significant way.
As far as the physical management of files goes, they have an Institutional Repository in which they store abstracts, metadata, bibliographic info, indexes, and URLs that lead to the PDF of the full text. In other words, they do not store the articles themselves in the repository.
A project Martin mentioned is OUP's SHERPA project. I'm pretty sure this project has been mentioned before, but briefly: it is a partnership of 26 Higher Education institutions in the UK who have banded together to create open access institutional repositories. In addition to the functionality of the repositories I have outlined above, authors are also able to self-archive if they should so choose.
Martin's conclusion is that this hybrid model seems to be working. He also summed up by saying that the evolution of these new business model/s will depend largely on : technological developments and constraints, politics, research funders and library budgets.

Next up was Leo Walford of Sage Publishing. His talk was titled: Making journals more accessible. We don't know much about Leo's background, except that he is a leading journals marketer, which is all we need to know for the purpose of this talk. The best part about this talk was that it organized a coherent picture of the current relationship between libraries and publishers. Libraries are concerned mainly with giving their patrons maximum access to the best resources. Publishers are concerned with increasing or at least maintaining their revenue. So the question is: what are publishers doing to accommodate libraries, in other words, to increase access?
  • The big deals
  • licensing
  • donation schemes
  • pay per view
  • new pricing/access models
  • Open Access

Leo then went on to describe how each of these work. We have heard a lot about the big deal, so I won't describe it again, save to say that he too reaches the conclusion that the big deal is going to be sticking around for the foreseeable future, because it is too convenient for both publishers and libraries. In the context of licensing he talked about aggregators, who license large bundles of content and pay royalties to the publishers based on usage.
Donation schemes are interesting. Publishers have developed them as part of various projects aimed at providing access to journal literature to developing countries and other underfunded groups. This method of dissemination is a valuable publicity tool for publishers. There are different ways in which these schemes are implemented. Some recipients pay nothing, and others pay a token sum for their subscriptions. It's a win-win solution.
The pay-per-view is something we've already heard quite a bit about, but it's an interesting scheme, because it is akin to micropayments, and also aligns itself with the new way of thinking in smaller and smaller "bites" of information (chapters vs. books, etc.). Pay per view has shown promising signs of working quite well for publishers and libraries alike, though libraries always complain of the difficulty of budgeting in advance for things they can't foresee.
Libraries and funders, and to a degree also publishers, are looking for new payment schemes which might provide more flexibility, be cheaper overall, provide accountability and be simpler.
Some of the pricing models that are being considered are:
  • national license
  • pay per view converting to subscription
  • core + peripheral pay per view
  • value-based pricing
  • Open Access
National license is the practice of paying a fixed amount up front for a limited access to all the content of a publisher. So far it has worked in limited, circumscribed environments, but not on a large scale.
Pay per view conversion seems straightforward, but in practice it has proved unwieldy and is not attractive to libraries or publishers.
Core + peripheral is a basic subscription with pay per view for content that is not purchased on subscription. This too is not very practical and leads to disagreements about what should or should not be considered "core".
Value-based pricing is supposed to be calculated on the basis of several parameters, like impact factors, number of downloads, number of articles published, and so forth. I'm not sure how this method is received.
Open Access is being offered more and more widely, and there are all kinds of hybrid offerings.
In conclusion, Leo doesn't see any major revolutions happening in the near future. All the new pricing methods are being adopted to some extent, but not in the widespread way one might expect. On the positive side, he does not seem to think that Open Access represents a serious threat to publishers.

The next speaker was Matthew Cockerill, of BioMed Central, and his talk was titled: New, emerging, and potential models. Matthew did not send Andy his PowerPoint presentation, so I can't provide a link to this talk, but I'll do my best to reconstruct from my notes. Matthew's background is really impressive. He cofounded BioMed Central in 1999 and is responsible for all aspects of their publishing activity. Before that he spent four years at BioMedNet, where he headed many important projects. He has a degree in Natural Sciences and a PhD in Biochemistry.
Having spent an entire afternoon at BioMed Central, I was not expecting to hear anything particularly new from Matthew, and at first his quiet manner of speaking fooled me into thinking that his talk was going to be boring. But within minutes I was quite riveted. He has a quiet passion about him that indicates a firm belief in what he is doing. It's clear that he is driven in his desire to push Open Access journals to the forefront of the e-publishing industry.
He used one journal as an example, the Malaria Journal, ranked first in its field. He explained the pricing system: there is an Article Processing Charge (APC), which can be paid by the researchers themselves out of their grant money but is increasingly being paid by their parent institutions or by grant-giving funders like the Wellcome Trust. Matthew used this example to examine the eternal question of financial viability for Open Access.
BioMed Central's financial model has been evolving over the years, with a varied pricing structure and they expect to break even this year. The more selective journals charge more for the APC, which reflects a greater editorial involvement and therefore higher production costs. By encouraging institutions to pay the APC, the authors themselves are free from financial constraints and can choose freely whether to publish in traditional, subscription-based journals or in Open Access journals.
Matthew says that the fact that they are on the way to breaking even is to be attributed to the fact that their processes are highly streamlined. This streamlining allows for quite a bit of flexibility. They are constantly adding new journals to their roster, and recently they have begun to add some entirely new ventures. An attractive publication he described to us is the Journal of Medical Case Reports. These are shorter articles, with a lower APC of only 250 pounds (as opposed to the usual 750-1500 pounds for other journals).
Matthew pointed out that there is a lot of valuable scientific knowledge (like that gained in clinical settings) that is not yet captured in formal publications. Journals like the Medical Case Reports can solve this problem in a way that is inexpensive while offering a lot of exposure. I found this part of the talk very interesting, because I think this line of publishing will have a very healthy future, with a lot of room for growth.
In closing, Matthew mentioned that Open Access and paid for content need not be mutually exclusive, pointing out that commissioned content (Genome Biology, Breast Cancer Research, etc.) can still be by subscription, while research articles should be Open Access (this aligns itself with those who pointed out that publicly funded research should be available to the public at large).
Then he mentioned Faculty of 1000 (which we heard about when we visited BioMed Central) - a subscription-based online literature awareness service built from the aggregated opinions of specialists.

At this point we all retired to the Garden Room for much needed tea and refreshments. And a short twenty minutes later, went back to Darwin Theatre for the last two speakers.

First up was Sue McKnight of Nottingham Trent University, where she is the Director of Libraries and Knowledge Resources, serving three campuses and a total of 25,000 students. Prior to that she was in Australia, always in academic libraries, where she received awards for outstanding management skills. She has long been interested in pioneering e-learning and is a board member of various organizations, such as IFLA, JISC, SCONUL, etc. Her talk was titled: What models suit librarians?
Sue sent a questionnaire to the SCONUL Director's list with the following questions about e-journals from the point of view of the librarians:
  • What you hate
  • What you love
  • What you would change if you could
She received 28 responses from 20 different libraries. First on the list of hates was VAT (value-added tax), which librarians feel should be much lower for e-journals than for print. And publishers agree with this. Of course the small libraries find big deals too expensive and cumbersome for their small budgets. They also don't like being locked in for long periods of time, or the difficulty of cancellation policies. Holdings can drop in and out of packages, and different services have different passwords, making navigation difficult for patrons, especially students and faculty. Also, many of these packages have implementation practices (federated searches, link resolvers) that leave too much of the work up to the libraries.
On the love side, there is the general ease of access and use, and full text, which is much appreciated by all. There are good searching facilities. Everyone likes the use of DOIs.
As to what they would change: in a perfect world there would be little or no VAT, pricing models would be simple, there would be perpetual access to content that has been previously licensed; one sign-on would give access to all the journals; interfaces would be clean and intuitive; overlap between aggregators would be eliminated; access would be extended to associates of the libraries, walk-in readers, etc.; there would be more flexibility in changing titles in the package; federated searching would be simpler; -- and here is something I liked to hear -- there would be more art and design e-journals, with great image quality and everything online; there would be more competition among publishers; publishers would support developing countries with free access to knowledge and they would support Open Access.

While this talk did not shed light on any really new information, it was delivered briskly and engagingly, and seeing things laid out clearly is always helpful.

At this point the publishers and presenters were invited to come down to the podium and have a panel discussion with the audience. Several publishers and librarians asked questions, and a few students as well. I didn't take notes for this, preferring to just listen to the debate and rest for a few minutes. There was only one speaker left, and he was supposed to be a big star...

Finally, we come to the last talk of the conference. The famous Richard Charkin came to address us in closing. I say famous because we had been hearing about him for days, and I have to say that it's a pity there was so much anticipation. Possibly because of the late hour, or because he thought we had heard too much already, or because he simply didn't prepare much for this talk, I'm afraid it was somewhat of a letdown. Charkin is Chief Executive of Macmillan Publishers, and he has been involved in publishing since 1971. There is no doubt that he has a great deal of knowledge to impart, but today was not the day. His talk was: Overview, Commentary and Insights. He spoke only briefly and rather sedately (we had been told that he was a real showman and I was really hoping for some theatrics). He told us a few entertaining anecdotes from his youthful days in publishing (stories of "the one that got away", like turning down Harry Potter).
The main points he wanted to get across were that we have to rediscover the reader and the writer, the real customers. The developing countries are the giants of tomorrow and that is where the market will be. We have to experiment, which costs money, but it's unavoidable. And publishers have to accept the fact that their margins will become lower.

Thus ended our extraordinary Conference!!!!!

By now it was about five o'clock in the afternoon, and we all repaired, once more, to the Garden Room, where we had celebratory champagne, took group photos, got certificates and graduate teddy bears, and Andy Dawson regaled us with a lusty rendition of The Bold Librarian (follow the link for the full text). A wonderful send-off, all in all.

Here is Andy taking a picture of the whole group. He had to do this at least ten times, since everyone wanted to give him their camera!

And here are De and Anthony, with the satisfied smiles of the just(ly rewarded with champagne)!

And here is dear Andy handing out our certificates of graduation from the first ever UCL Summer School in E-Publishing in partnership with Pratt SILS. Hurray for us!!!


And here, finally, is our group graduation picture. Anthony is in the last row, far right. Tula is right in front of him. Andy is in back, far left. Let's have a big hand for all of us!!!!! We did it guys! We really did it!