Books 'R' Google
By Robert Darnton
The New York Review of Books
Volume 56, Number 2, February 12, 2009
Edited by Andy Ross
Google has digitized millions of books and made the texts searchable online.
When fields of knowledge turned into professions and university departments,
professional journals sprouted throughout the fields. Commercial publishers made
a fortune by selling subscriptions to the journals. They could ratchet up prices
without causing cancellations, because the libraries paid for the subscriptions
and the professors did not. And the professors provided free labor: they wrote
the articles, refereed submissions, and served on editorial boards.
When businesses like Google look at libraries, they see potential assets,
content, ready to be mined. Built up over centuries at an enormous expenditure
of money and labor, library collections can be digitized en masse at relatively
little cost. To digitize collections and sell the product in ways that fail to
guarantee wide access would be to repeat the mistake that was made when
publishers exploited the market for scholarly journals, but on a much greater
scale.
Four years ago, Google began digitizing books from research libraries, providing
full-text searching and making books in the public domain available on the
Internet at no cost to the viewer. Google collected revenue from some discreet
advertising attached to the service. Google also digitized an ever-increasing
number of library books that were protected by copyright in order to provide
search services that displayed small snippets of the text. In September and
October 2005, a group of authors and publishers brought a class action suit
against Google. In October 2008, the opposing parties announced agreement on a
settlement.
The settlement creates a registry to represent the interests of the copyright
holders. Google will sell access to a gigantic data bank composed primarily of
copyrighted, out-of-print books. Organizations will be able to subscribe via an
institutional license for access to the data bank. A public access license will
make this material available to public libraries. Individuals will be able to
access and print out digitized versions of the books by purchasing a consumer
license. Google will retain 37 percent of the revenue, and the registry will
distribute 63 percent among the copyright holders.
Of the seven million books digitized by November 2008, one million are works in
the public domain, one million are in copyright and in print, and five million
are in copyright but out of print. Google will continue to make books in the
public domain available for users to read, download, and print, free of charge.
Many of the books in copyright and in print will not be available in the data
bank unless the copyright owners opt to include them. They will be sold as
printed books and perhaps also as digitized copies via the consumer license.
Most of the books covered by the institutional license are in copyright but out
of print.
The proposal could result in the world's largest digital library. Google could
also become the world's largest book business. Virtually all books will be
brought within the reach of anyone with access to the Internet. Not only will
Google bring books to readers, it will also open up extraordinary opportunities
for research.
Google did not set out to create a monopoly. But the class action character of
the settlement makes Google invulnerable to competition. Most book authors and
publishers who own US copyrights are automatically covered by the settlement. No
new digitizing enterprise can get off the ground without winning their assent.
This outcome was not anticipated at the outset. We missed a great opportunity.
We could have created a National Digital Library. It is too late now. Not only
have we failed to realize that possibility, but we are allowing the control of
access to information to be determined by a private lawsuit.
Google will enjoy a monopoly of access to information. Google has no serious
competitors. Google alone has the wealth to digitize on a massive scale. And
having settled with the authors and publishers, it can exploit its financial
power from within a protective legal barrier. No new entrepreneurs will be able
to digitize books within that fenced-off territory. Only Google will be
protected from copyright liability.
This is a tipping point in the development of the information society.
Google Book Search
By Robert Darnton
The New York Review of Books
Volume 56, Number 20, December 17, 2009
Edited by Andy Ross
Google has by now digitized some ten million books. On what terms will it make
those texts available to readers? The terms of the settlement will have a
profound effect on the book industry for the foreseeable future.
Google plans to enable consumers to purchase access to millions of copyrighted
books currently in print, with payment going to authors and publishers as well
as Google. Books covered by copyright but out of print, at least seven million
in all, will be available through subscriptions paid for by institutions such as
universities. The database, along with books in the public domain that Google
has already digitized, will constitute a gigantic digital library.
But Google's dominance of access to books will reinforce its power over access
to other kinds of information, raising concerns about privacy, competition, and
commitment to the public good. As a commercial enterprise, Google's first duty
is to provide a profit for its shareholders, and the settlement leaves no room
for representation of the public.
Google Book Search (GBS) will certainly be challenged by groups and individuals
who claim they were not fairly represented in the classes of authors and
publishers. The case may take years to work its way through the courts. As the
first step toward a resolution, the filing on November 13 suggested just how far
Google is willing to go in modifying the original settlement.
The governments of France and Germany urged the court to reject the settlement.
Far from seeing any potential public good in it, they condemned it for creating
an "unchecked, concentrated power" over the digitization of a vast amount of
literature and for doing so by a "commercially driven" agreement negotiated "in
secrecy." In contrast to the commercial character of Google's enterprise, both
governments stressed the higher values represented by their national
literatures.
The French emphasized the unique character of the book, which, they claimed,
would be compromised by Google's commitment to commercialization. The Germans
spoke in the name of "the land of poets and thinkers," but they laid most stress
on the right of privacy, which, they argued, Google could threaten. Both
governments then listed a series of subsidiary arguments:
1. The settlement gives Google a virtual monopoly over orphan works, even though
it has no claim to their copyrights.
2. Its opt-out provision, which means that authors will be deemed to have
accepted the settlement unless they notify Google to the contrary, violates the
rights inherent in authorship.
3. It contains a provision that prevents a potential competitor from obtaining
better terms than Google in any new commercial uses of the digitized books. The
terms of such future enterprises will be determined by a Book Rights Registry
composed of representatives of the authors and publishers.
4. It gives Google the power to censor its database by excluding up to 15
percent of the digitized works.
5. Its guidelines for pricing will promote Google's commercial interests, not
the good of the public, through the use of algorithms created by Google
according to Google's secret methods.
6. It favors secrecy in general, hiding audit procedures, preventing the public
from attending meetings in which Google and the Registry will discuss library
matters, and even requiring Google, the authors, and publishers to destroy all
documents relevant to their agreement on the settlement.
Above all, the French and Germans condemned the settlement for sanctioning the
"uncontrolled, autocratic concentration of power in a single corporate entity,"
which threatened the "free exchange of ideas through literature."
The same points were made in a hearing before the European Commission in
September by the International Federation of Library Associations (IFLA), the
European Bureau of Library, Information and Documentation Associations (EBLIDA),
and the Ligue des Bibliothèques Européennes de Recherche (LIBER).
All three stressed the danger that "a large proportion of the world's heritage
of books in digital format will be under the control of a single corporate
entity." They summoned up the prospect of a digital library of 30 million books
and concluded that Google would exercise something close to hegemony in the book
world. They appealed to the European Commission to defend the interests of the
public.
The U.S. Department of Justice pointed to serious difficulties with the
settlement and suggested the following changes:
1. Require rights-holders of out-of-print books to participate in the settlement
by opting in instead of operating from the assumption that they had agreed to
participate unless they opted out.
2. Do not distribute the profits from the sale of orphan books to the parties of
the settlement but rather use the money to fund a thorough search for the
unknown rights-holders.
3. Appoint guardians to protect the interests of orphan rights-holders by
serving on the registry.
4. Find some mechanism by which potential competitors to Google could gain
access to orphan works without exposure to suits for infringement of copyright.
5. Prevent Google from using out-of-print works in new commercial products
without the owner's permission.
The revised settlement, or GBS 2.0, released on November 13, reads as if Google
and the plaintiffs took most of their cues from the DOJ recommendations. GBS 2.0
provides that the Registry will include a court-appointed guardian to represent
the rights-holders of unclaimed books. But Google alone would enjoy immunity
from prosecution by any rights-holders.
As to revenue from the sale of orphan books, GBS 2.0 accepts that the money will
not go to Google and the plaintiffs but will be spent in efforts to search for
the unidentified rights-holders. GBS 2.0 also allows Google's competitors to license
out-of-print books in retail enterprises, although Google would maintain
exclusive control of the institutional subscriptions to its gigantic database.
How the prices will be set remains unclear. GBS 2.0 contains no effective
mechanism to prevent price gouging, no provision for a public authority to
monitor prices, and no way to protect the public from excessive pricing should
Google be taken over in the future by rapacious speculators.
GBS 2.0 does not therefore differ in essentials from GBS 1.0. It largely ignores
the objections of foreign governments, except by narrowing the scope of GBS to
books published in the United States, the United Kingdom, Canada, and Australia.
GBS will not cover books published in countries like France and Germany.
One can imagine two general solutions to the problems posed by GBS, one maximal,
one minimal.
The most ambitious solution would transform Google's digital database into a
truly public library. An act of Congress would clear up a messy legal landscape
and give the American people a national digital library equal to the needs of
the twenty-first century.
A minimal solution could be devised for the private sector. Congress would
legislate to protect the digitization of orphan works from lawsuits, but it
would not appropriate funds. To avoid conflict with market interests, the
database would include only books in the public domain and orphan works. At the
rate of a million books a year, we would have a great library, free and
accessible to everyone, within a decade.
The Future Of Publishing
By Jason Epstein
The New York Review of Books
Volume 57, Number 4, March 11, 2010
Edited by Andy Ross
The digitization of the book publishing industry is now irreversible. The
publishing industry's capital stock faces dissolution within a vast cloud in
which all the world's books will eventually reside as digital files to be
downloaded instantly title by title wherever on earth connectivity exists.
Digitization makes possible a world in which anyone can be a publisher
and anyone can be an author. In this world, the traditional filters will
have melted into air and only the human inability to read what is unreadable
will remain to winnow what is worth keeping. Amid the chaos, readers will be
guided by the imprints of reputable publishers. The more adaptable of
today's general publishers will survive.
The difficult, solitary work
of literary creation demands rare individual talent and in fiction is almost
never collaborative. Until it is ready to be shown to a trusted friend or
editor, a writer's work in progress is intensely private. Informed critical
writing of high quality on general subjects will be as rare and as necessary
as ever and will survive as it always has in print and online for
discriminating readers.
The cost of entry for future publishers will
be minimal, requiring only the upkeep of the editorial group and its
immediate support services but without the expense of traditional
distribution facilities and multilayered management. Traditional territorial
rights will become superfluous and a worldwide, uniform copyright convention
will be essential. Protecting content from unauthorized file sharers will
remain a vexing problem. If I were a publisher today, I would consider a
renewable rental model for all e-book downloads.
Literary form has
been remarkably conservative throughout its long history. Actual books,
printed and bound, will continue to be the irreplaceable repository of our
collective wisdom. My rooms are piled from floor to ceiling with books. I
mention this so that you will know the prejudice with which I celebrate the
inevitability of digitization.
Googled
By John Lanchester
The Observer, February 21, 2010
Edited by Andy Ross
Googled: The End of the World as We Know It
By Ken Auletta
Virgin Books, 400 pages
No company in history has grown as fast as Google. Within 400 weeks of its
founding, it was earning revenues of $20 billion a year. The 1998 start-up
has reached deep into the everyday experience of millions, put itself in the
centre of the internet culture that is defining the new century, and had a
disruptive impact on some industries and a potentially terminal one on
others. Google is one of the wonders of the world.
Since Google's
motto is "Don't be evil", people hold it to a high standard.
Sergey Brin and Larry Page don't ask for permission: they do what they want
to do, and rely on the fact that people will understand the point of it
afterwards. The basic move in Google's rise to dominance was copying stuff
without asking. Don't ask for permission, and rely on the fact that people
will love the results when they see them. This model has stood the company
in very good stead, but it plainly involves an attitude in which innocence
and arrogance are emulsified together.
Auletta looks at the company
in its pomp, and sees problems and threats everywhere. At one point in 2008,
Google was offering 150 products. Only targeted advertising made real money.
YouTube lost $500m in 2009. Google's programme to digitise books has caused
a bitter backlash. That was an example of the no-permission policy going
badly wrong, because as Brin told Auletta, if they had asked authors and
publishers, "we might not have done the project".
Google's mission is
"to organise the world's information and make it universally accessible and
useful", but that doesn't extend to its own intellectual property, which it
guards with ferocity. As its share prospectus says: "Our patents,
trademarks, trade secrets, copyrights and all of our other intellectual property
rights are important assets for us ... any significant impairment to our
intellectual property rights could harm our business or our ability to
compete." It's hypocritical to pretend that the same isn't true for
everybody else.
Google and Money
By Charles Petersen
The New York Review of Books, December 9, 2010
Edited by Andy Ross
Google's search engine remains its single largest source of revenue.
Stanford graduate students Sergey Brin and Larry Page launched Google in
1998 with a new algorithm, called PageRank, that made use of the links
between sites to determine relevancy. Google became the best search engine
available, but that alone left the company with almost no source of revenue.
Google grew desperate for funding during the dot-com bust. Aside from page
views, one of the few easily measured statistics on the early Web was
click-throughs, the number of times visitors to a site found an ad displayed
enticing enough to click on it, and then be taken to the advertiser's own
website, where the product or service in question might be purchased or
used.
Google realized that ads on search engines reach users when
they are looking for something specific. The Google advertising system
charges advertisers for each time a user clicks on an ad that is displayed
next to related search engine results. Google developed programs to link
specific ads to millions of different search terms and to ensure that the
ads sold were priced fairly. The system provides the vast majority of
Google's billions of dollars in revenues.
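The difference between billing per view and billing per click can be shown with a little arithmetic. The sketch below is only an illustration: the function names, prices, and traffic figures are assumptions made up for the example, not figures from Google or from the article.

```python
def click_through_rate(clicks, impressions):
    """Fraction of ad displays (impressions) that led to a click."""
    if impressions == 0:
        return 0.0
    return clicks / impressions


def per_view_charge(impressions, price_per_thousand_views):
    """Page-view model: the advertiser pays for every display of the ad."""
    return impressions / 1000 * price_per_thousand_views


def per_click_charge(clicks, price_per_click):
    """Pay-per-click model: the advertiser pays only when a user clicks."""
    return clicks * price_per_click


if __name__ == "__main__":
    # Hypothetical campaign: 10,000 displays, 50 clicks.
    impressions, clicks = 10_000, 50
    print("click-through rate:", click_through_rate(clicks, impressions))
    print("charge per view:   ", per_view_charge(impressions, 5.0))
    print("charge per click:  ", per_click_charge(clicks, 0.5))
```

Under the per-click model the advertiser's bill follows the number of interested users rather than the number of pages served, which is the contrast the article draws with the page-view model.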
Google's approach to
advertising is unlike the page-view model of its competitors. Google's
success depends on finding ways to produce results of such high quality that
users need not worry about clicking unnecessarily.
Google has had
other challenges. The Internet, as originally conceived, gave the same
priority to every piece of data that passed through the network. As the
Internet has developed, this principle of net neutrality has largely been
retained. In August 2010, Google executives claimed that they would continue
to support net neutrality on traditional cable and telephone services, but
they dropped their support for net neutrality for wireless devices.
Google's ad exchange lets advertisers target individual people and buy
access to them in real time as they surf the Web. In August 2010, Google
proposed to become a clearinghouse for everyone's data, too. Google would be
at the center of the trade in other people’s data.
Google's proposed
data clearinghouse would target ads more precisely by bringing together all
the private information that companies have gathered on users in one place.
These personally targeted ads will be intrusive and pervasive, allowing
advertisers to coordinate campaigns across a single user’s computer,
e-reader, and cell phone, as well as other devices with wireless
connections. An efficient data clearinghouse will enable marketers to update
these campaigns instantaneously.
As advertising becomes more
personalized and pervasive, it seems likely that more and more users will
want to opt out of the system. Google executives have considered allowing
users to pay Google the amount that advertisers would otherwise offer the
company to reach them, in exchange for receiving an ad-free service. The
next obvious step would be to provide well-off users with greater privacy,
at a price.
Google tracks information about users not just to target
advertisements but to provide better services. We have always traded a bit
of our privacy in order to receive better service. Google executives
habitually speak of privacy in terms of these kinds of trade-offs.
Regulators should impose a Chinese Wall between the private data that sites
need for personalized services and the private data that sites may use for
commercial purposes. A Chinese Wall would make it harder for sites to profit
online but it might also protect our privacy.
Googleplex
TechRepublic.com
April 2011
Edited by Andy Ross
In the Plex By Steven Levy
Over a two-year period, Levy got unprecedented access to people, places, and
meetings at the Google headquarters in Silicon Valley. His new book tells
all.
Early on, co-founders Larry Page and Sergey Brin listed all the
smartest and most influential people in computer science and then tried to
hire them all.
Once Page and Brin hired a bunch of smart people, they
asked them to turn Google into an artificial intelligence learning machine.
When Google created its AdWords and AdSense programs, it hired
statisticians and mathematicians to predict user behavior. This information
is a critical part of the auctions for various ads.
When the company
went public, Page and Brin told investors that sometimes they would forgo
profits to do the right thing for humanity.
When Google launched
Gmail, a lot of users freaked out about contextual ads because they thought
people were reading their mail. Google just used a search engine to scan the
messages.
Google dreams of "zero query search" where Google
anticipates what you want and gives it to you before you ask. This could be
based on location or on search history.
Page said he's surprised that
people aren't more ambitious because there are so many possibilities for
doing things that have never been done before.
Google calls its big,
ambitious projects moonshots.
Page and Brin continue to see Google
Books as something that Google is doing for the good of humanity.
Google lets its employees try lots of different projects on the principle
that if they aren't having enough failures then they aren't taking enough
risks.
At Google, the job of the lawyers is to figure out how to say
yes to the things that Page and Brin want to do.
Levy says that when
Google went into China, China changed Google more than Google changed China.
Levy: "Google is very worried about Facebook. It's going through a
Facebook panic right now."
How Google Dominates Us
By James Gleick
The New York Review of Books, August 18, 2011
In the Plex: How Google Thinks, Works, and Shapes Our Lives
By Steven Levy
Simon and Schuster, 424 pages
I'm Feeling Lucky: The Confessions of Google Employee Number 59
By Douglas Edwards
Houghton Mifflin Harcourt, 416 pages
The Googlization of Everything (and Why We Should Worry)
By Siva Vaidhyanathan
University of California Press, 265 pages
Search & Destroy: Why You Can't Trust Google Inc.
By Scott Cleland with Ira Brodsky
Telescope, 329 pages
Google is where we go for answers. Most of the time Google does not actually
have the answers. Google is the oracle of redirection. Google defines its
mission as organizing the world's information.
Google dominates the
information economy. Google has many secrets but the main ingredients of its
success have not been secret at all. Steven Levy has visited Google’s
headquarters periodically since 1999, talking with its founders, Larry Page
and Sergey Brin.
Google's single greatest innovation was the
algorithm called PageRank, developed by Page and Brin when they were
Stanford graduate students. The algorithm assigns every page a rank,
depending on how many other pages link to it. Not all links are valued
equally. A recommendation is worth more when it comes from a page that has a
high rank itself. Page and Brin patented PageRank and published the details.
It is one of those ideas that seem obvious after the fact.
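The idea can be sketched in a few lines of code. What follows is a minimal illustration of rank passing along links, not Google's implementation: the damping factor of 0.85, the fixed number of iterations, the handling of pages with no outgoing links, and the tiny example graph are all assumptions made for the sketch.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Estimate ranks for a small link graph.

    links maps each page to the list of pages it links to.
    damping and iterations are illustrative assumptions.
    """
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # Every page starts each round with a small baseline share.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its own rank among the pages it links to,
                # so a link from a highly ranked page is worth more.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # Assumption: a page with no outgoing links spreads its
                # rank evenly over all pages.
                share = damping * rank[page] / n
                for target in pages:
                    new_rank[target] += share
        rank = new_rank
    return rank


if __name__ == "__main__":
    # Hypothetical graph: "c" has two inbound links; "d" is linked only
    # from "c", and "f" is linked only from the little-cited page "e".
    graph = {"a": ["c"], "b": ["c"], "c": ["d"], "e": ["f"], "d": [], "f": []}
    for page, score in sorted(pagerank(graph).items(), key=lambda kv: -kv[1]):
        print(page, round(score, 3))
```

In the example graph, "d" and "f" each have exactly one inbound link, yet "d" ends up ranked higher because its single recommendation comes from the well-linked page "c".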
The Google
founders, Larry and Sergey, did everything their own way. Even in the
unbuttoned culture of Silicon Valley they stood out from the start as
originals. As they saw it, their mission encompassed not just the Internet
but all the world's books and images. Google Translate has achieved more in
machine translation than the rest of the world's artificial intelligence
experts combined.
Google owns and operates a constellation of giant
server farms spread around the globe — huge windowless structures,
resembling aircraft hangars or power plants, some with cooling towers. The
server farms stockpile the exabytes of information and operate an array of
staggeringly clever technology. This is Google's share of the cloud.
Google's business is advertising. Google makes more from advertising than
all the nation's newspapers combined. Doug Edwards interviewed for a job as
marketing manager in 1999. As Google employee number 59, he is the first
Google insider to have published his memoir.
The merchandise of the
information economy is attention. When information is cheap, attention
becomes expensive. Attention is what we give to Google, and our attention is
what Google sells.
Siva Vaidhyanathan: "We are not Google's
customers: we are its product. We — our fancies, fetishes, predilections,
and preferences — are what Google sells to advertisers."
The
evolution of this unparalleled money machine piled one brilliant innovation
atop another, in fast sequence:
1. Early in 2000, Google sold premium sponsored links: simple text ads assigned
to particular search terms. They charged according to how many people saw each
ad.
2. Late that year, engineers devised an automated self-service system, dubbed
AdWords. Suddenly thousands of small businesses were buying their first Internet
ads.
3. Google learned to charge per click rather than per view, and to let
advertisers bid for keywords against one another in fast online auctions.
Pay-per-click auctions opened a cash spigot.
4. Google had instant knowledge of which ads were succeeding and which were not.
It could view click-through rates as a measure of ad quality. An effective ad
would get better placement. By 2003, AdWords Select was making so much money
that Google was deliberately hiding its success from the press and from
competitors.
5. Google expanded its platform outward. The aim was to develop a form of
artificial intelligence that could analyze chunks of text (websites, blogs,
e-mail, books) and match them with keywords. Given a text, it could predict
which advertisements would be effective.
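Steps 3 and 4 above, keyword bidding combined with click-through rate as a quality signal, can be illustrated with a toy ranking. The scoring rule used here (bid multiplied by observed click-through rate) is an assumption chosen for the sketch; the article says only that effective ads got better placement, not how placement was computed. The class name, field names, and figures are likewise hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Ad:
    advertiser: str
    bid_per_click: float  # dollars the advertiser offers per click
    clicks: int           # clicks observed so far
    impressions: int      # times the ad has been displayed

    @property
    def click_through_rate(self):
        # Observed share of displays that led to a click.
        return self.clicks / self.impressions if self.impressions else 0.0


def rank_ads(ads):
    """Order ads for one keyword, best placement first.

    Assumed score: bid per click times click-through rate, so an ad that
    users actually click can outrank a higher bid that they ignore.
    """
    return sorted(
        ads,
        key=lambda ad: ad.bid_per_click * ad.click_through_rate,
        reverse=True,
    )


if __name__ == "__main__":
    candidates = [
        Ad("high_bidder", bid_per_click=2.00, clicks=10, impressions=10_000),
        Ad("effective_ad", bid_per_click=1.00, clicks=300, impressions=10_000),
    ]
    for position, ad in enumerate(rank_ads(candidates), start=1):
        print(position, ad.advertiser)
```

With these made-up numbers the cheaper but more frequently clicked ad takes the top position, which is one way an effective ad could earn better placement.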
Google called its program AdSense. For anyone hoping to monetize their
content, it was the Holy Grail. Anyone could now add a few lines of code to
their website, automatically display Google ads, and start cashing monthly
checks. Vast tracts of the Web that had been free of advertising now became
Google partners.
Search and advertising thus become the matched edges
of a sharp sword. The perfect search engine, as Sergey and Larry imagine it,
reads your mind and produces the answer you want. The perfect advertising
engine does the same: it shows you the ads you want. Anything else wastes
your attention.
Google began tracking the behavior of individual
users from one Internet site to the next. They observe our every click and
they measure in milliseconds how long it takes us to decide. If they didn't,
their results wouldn't be so uncannily effective. They have no rival in the
depth and breadth of their data mining.
The Google corporate motto is
"Don't be evil." The Googlers believed a corporation should behave
ethically. But when Google embarked on its program to digitize copyrighted
books and copy them onto its servers, it deceived publishers. Google knew
that the copying bordered on illegal but it considered its intentions
honorable and the law outmoded. Eric Schmidt: "Evil is what Sergey says is
evil."
Google did some evil in China. It collaborated in censorship.
Beginning in 2004, it arranged to tweak and twist its algorithms and filter
its results to omit results unwelcome to the government. Yet Google pushed
back against the government. When results were blocked, Google insisted on
alerting users with a notice at the bottom of the search page. The company
now serves China only from Hong Kong, with results censored not by Google
but by government filters.
Scott Cleland: "There is evidence that
Google is not all puppy dogs and rainbows." Google's corporate mascot is a
replica of a Tyrannosaurus Rex skeleton on display outside the corporate
headquarters. T. Rex was a terrifying predator.
Google's founders are
visionaries. Google's business competitors charge that the company
manipulates its search results to favor its friends and punish its enemies.
Google seems to be everywhere and seems to know everything and offends
against cherished notions of privacy.
The rise of social networking
upends the equation again. Users of Facebook choose to reveal aspects of
their private lives, at least to "friends." On Twitter, every remark can be
seen by the whole world. The Library of Congress is archiving all tweets.
Now Google is rolling out its social-networking platform Google+. Are the
social networks our friends?
AR February 2009: I guess Google will work in the perceived
public interest, either so as not to be evil or because the public authorities
demand it. In the latter case, the public interest will be American. We won't
have a globally effective legal framework for such issues for a while yet.
November 2009: The issue is big enough to take very seriously. We cannot merely hope
that Google will always do the right thing. I guess Darnton's "ambitious"
solution is the best — perhaps then we can hope that the European Union will
get on board and make the result a truly global repository.
February
2010: Publishers will need to do deals with Google and Amazon. That's no problem
— publishers have always done deals to secure their business. And Google will
have to grow up. That may be a bigger problem.
November 2010: The Chinese
Wall proposal seems like a good idea to me.
April 2011: Levy writes well
so the book may be good.
August 2011: I need to deploy AdSense on my
blog.

