
Friday, March 20, 2009

a common ground for search in real life and search using computers


Googling has become a verb in our language. This shows the deep impact of Google on our culture and our lives. But Google is not primarily about searching: Google is an information shovel selling ads. In a previous article I intuitively described contextual search as finding information on the web without using Google. I was a bit surprised by the interest in the story, because the idea of contextual search was still embryonic. In this article I will develop the idea of contextual search further, relating it to and contrasting it with googling, trying to find out what it is and what it is not. When looking for better information-search strategies I want to compare our search behaviour using CMC-based systems like Google with natural communication. This is the starting point. It may sound odd and completely off the record, but in fact I am only rejoining a tradition started in the sixties and seventies by Gordon Pask at the Biological Computer Lab in Urbana-Champaign.

Breaking the walls


The point is that computer science has for far too long restricted its research to typical domains such as logic, statistics and economics. This is understandable, since the background of most computer scientists is of course in the exact sciences, but it has led to a dangerous tendency towards inbreeding. Yet many of the scientists at the origin of computer science and cybernetics, like Weizenbaum and Wiener, were far more critical of their own domain and more open-minded towards other scientific disciplines. They were aware of the social implications and warned against misuse and abuse.

In 1950 Wiener published The Human Use of Human Beings, which was widely read by general audiences. The book expressed Wiener's deep concern over the ethical consequences of the new technologies which science and cybernetics were making possible. It examines the nature of language and education as the means for a society to transmit cultural knowledge. It also examines the use of law, mass communication, secrecy and espionage by political regimes to enforce, regulate and protect their systems of power and control. Wiener expresses a deep concern that the technology of atomic weapons cannot be kept from spreading to other countries, because just knowing that such weapons are possible is a sufficient incentive for scientists to find the means of building them. And so he urges intellectuals and scientists to think carefully about the consequences of their work, and whether it will really improve the state of the world in the long run.

An even more reflective book, God & Golem, Inc. (1964), addresses the implications of cybernetics for an ethical society. Wiener takes the image of the golem from Jewish mythology, a being made of clay and brought to life by a sorcerer, as a metaphor for the scientist who brings machines to life with cybernetics. He uses the metaphor to develop the idea that every age has its dogmatic beliefs, and that there will be those who stand up to oppose them.

In 1966 Weizenbaum fooled those who believed in the almightiness of computers with his automatic therapy program ELIZA. He claimed that the shrink could simply be replaced by a computer program, and many believed the program could cure mental illness, until Weizenbaum told his surprised audience that ELIZA was fake. The second generation of cybernetic scientists, second-order cybernetics or new cybernetics, only added to this criticism. (For a tribute to Weizenbaum, see Geert Lovink: The Society of the Query and the Googlization of Our Lives.)

New cybernetics was quite aware of the shortcomings of Computer Mediated Communication, and it left the narrowness of logic, mathematics and economics behind. Physiology, biology, psychology, neurology and epistemology belonged to its domain of discovery. In Urbana-Champaign, at the University of Illinois, the Biological Computer Lab of Heinz von Foerster was investigating man-machine interaction. A range of brilliant scientists developed new cybernetics there from 1958 until 1974. The most important were Heinz von Foerster (physics, biophysics, epistemology), von Glasersfeld (epistemology, radical constructivism), Maturana and Varela (biology, radical constructivism), Gordon Pask (psychology, neurology) and Ashby (cybernetics). Close by, in Palo Alto, Watzlawick and Bateson were developing communication theory and double bind theory. Both teams were connected (Müller, 2000).

Today science is locked up even more tightly in disciplines. This is a problem for a systems engineer who wants to anticipate the social implications of his design. Jim Hendler notes in "Reinventing Academic Publishing - Part III":

"In science, I would argue that a similar effect is what causes a lot of the jargon issues to arise.  When we use a particular term from a particular field, we are usually in a context, be it a talk at a conference or a paper in a journal, which defines how that term is used.  In fact, one problem that scientists often face is that when we try to explain what we do to the general public, they don't have these contexts and the words we use revert to their more generic meanings - leading to misunderstandings and confusion.  Similar misunderstandings and confusions happen when scientists communicate across boundaries, and that is where much of the problem arises in interdisciplinary scientific discourse."

David Koepsell complains in "Back to Basics: How Technology and the Open Source Movement Can Save Science":

"The relation of article publishing to career promotion in academic science has also promoted "salami science," in which a single scientific study may generate more papers than would practically be necessary to disseminate the results of a single study."

I propose that science and the application of science meet again. Therefore I want to join Geert Lovink's proposal of "dismantling the academic exclusion machine", leave the strict borders of computer science and scout outside this domain, in neurology, linguistics, sociology and political economics. Pask's creative ideas are still at the core of research today.

On the road with Gordon Pask


The idea behind Pask's Conversation Theory is that learning occurs through conversations which make knowledge explicit. This is the main point I want to follow: I want to consider information searching as a human learning activity. Pask's Conversation Theory regards social systems as symbolic, language-oriented systems where responses depend on one person's interpretation of another person's behaviour, and where meanings are agreed through conversations. This vision is continued in the sociological work of Niklas Luhmann and in the evolutionary cybernetics of Francis Heylighen. Heylighen developed the idea of learning webs in his article 'Bootstrapping knowledge representations: from entailment meshes via semantic nets to learning webs' (Heylighen, Francis, 1997). This idea introduces the possibility of connecting the namespaces of the Semantic Web with link spaces. Heylighen continues doing research in this field; see his homepage or his publication list at Scientific Commons. Interdisciplinarity was a main approach of Pask's Interactions of Actors Theory.
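To make the learning-web idea a little more tangible, here is a minimal sketch of a link space over named concepts, in which links that prove useful in successful searches are reinforced. The class, the concept labels and the reinforce rule are my own illustrative assumptions, not Heylighen's or Pask's formalism.

```python
# A toy "learning web": named concepts joined by weighted links (a link space);
# links that helped a successful search get reinforced, so the web learns from use.

from collections import defaultdict

class LearningWeb:
    def __init__(self):
        # concept -> {neighbouring concept: association strength}
        self.links = defaultdict(dict)

    def add_link(self, a, b, strength=0.1):
        self.links[a][b] = strength

    def neighbours(self, concept):
        # strongest associations first: this is what a search would follow
        return sorted(self.links[concept].items(), key=lambda kv: -kv[1])

    def reinforce(self, a, b, amount=0.1):
        # a search that followed a -> b and succeeded strengthens that link
        self.links[a][b] = min(1.0, self.links[a].get(b, 0.0) + amount)

web = LearningWeb()
web.add_link("search", "conversation", 0.2)
web.add_link("search", "database query", 0.05)
web.reinforce("search", "conversation")   # feedback from a successful search
print(web.neighbours("search"))           # 'conversation' now clearly outranks 'database query'
```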

The progress made in recent years in neuroscience confirms that the BCL's approach to the human-computer relation was enlightening; only, computer science today does not seem to be aware of it. Behaviourist stimulus-response theories are still predominant in the majority of publications, though they have been proven false. In 1973 Robert Rescorla refuted the stimulus-response schema (S-R), one of the dogmas of behaviourism[1]. Even the mouse brain is not an input-output machine reacting to impulses; its behaviour is anticipatory. Some computer scientists who promote a system-centred approach are still working with old-fashioned ideas, or are they simply mistaking humans for computers? In any case, this is what happens when walls are put up between scientific disciplines. Let's have a look at what we can learn from current neuroscience.

Since the discovery of mirror neurons in the mid-1990s by Gallese and colleagues, neuroscience has been uncovering, bit by bit, how our brain interacts with other brains in communication. Neurosociology and neuroeconomics are fast-developing fields. They stress the importance of empathy and social interaction in communication.

Conversation Theory and pragmatics share important basic concepts about communication. Today pragmatics, as well as Conversation Theory, remains largely ignored in Net applications. As for pragmatics, and this is stressed in Relevance Theory (Dan Sperber, Gloria Origgi, 2004), the primary condition for the success of the human communication system is overtness.

The code model of communication has the advantage of simplicity, but if it has to cover human communication it suffers from an important inconsistency: how does a child learn to speak, learn the code, having no basic code available to start with? In engineering we call such a problem a bootstrapping problem. The inference model does not need a code to start with. According to Dan Sperber and Gloria Origgi, intention is the key to understanding:

"After Grice, a second, wholly different mechanism was identified that also made communication possible: a communicator could communicate a content by giving evidence of her intention to do so, and the audience could infer this intention on the basis of this evidence. Of course, the two mechanisms, the coding-decoding, and the evidence-inference one, can combine in various ways. Today, most students of linguistic communication take for granted that normal comprehension involves a combination of decoding and of Gricean inference processes." (Sperber, Dan, Origgi, Gloria, 2000)

They point to a key aspect of Paul Grice's pragmatics: analysing language as speech. Pragmatics today refutes the unique role of codification and confirms that intentions play the leading role. Pragmatics shares this vision with Noam Chomsky, who may be called the father of the anti-behaviourist paradigm in language analysis with his 1959 critique of Skinner.

Tarde's political economy of knowledge fits into Heylighen's conceptual schema; see e.g. Trust in Communication between Individuals: A Connectionist Approach (2008). In recent analyses of Web 2.0 we also found indications that Adam Smith's and Marx's analyses of the market and of commodities cannot be applied to information. One of the misleading concepts in Web 2.0 is the long tail, a marketing-based concept (consumer demographics). It points to the context of Smith's free market as a driving force in the development of the World Wide Web. This is a cliché that can easily be undermined. Denis Hancock put the long tail of YouTube in perspective:

"To me, that looks like a blockbuster model - and based on the viewing habits of people I know that go to YouTube, this makes sense (many simply check out whatever is most popular, which becomes a self-perpetuating cycle). But what do you think - am I missing something here, or is the long tail really not that important no YouTube?"

In their paper Metcalfe's Law, Web 2.0, and the Semantic Web, James Hendler and Jennifer Golbeck explain the success of YouTube:

"Interestingly, email and blogging has proven to be one of the crucial aspects of the YouTube phenomenon.(...) Search in YouTube is rimarily enhanced by the social context, not by the "semantic content" of what is in the videos (Marcus, Perez, 2007). While automated technologies to create indexes of these videos are being sought, the primary indexing comes through the social overlay of the site." (Hendler, James and GolBeck, Jennifer, 2007, p. 3)

"...we argue that given the prevalence of the social constructs within these sites, that value of the network effect is coming from the links between people arising from the interactions using these sites." (Hendler, James and GolBeck, Jennifer, 2007, p. 4)

That is why we prefer Tarde's political economy of knowledge as our guide, where once again overtness plays a crucial role and where market reductionism is rejected. The complete commodification of knowledge, and thereby of communication, is not possible, because knowledge is not a commodity in the traditional sense of Adam Smith's political economy. Making 'le savoir' a commodity is a reduction imposed by those who want to earn money from knowledge and from the desire to know, curiosity, a basic human drive. Google mixes up these two concepts: knowledge as a basic human quality acquired in experience, communication and learning, and information as a commodity.

Gabriel Tarde noticed this market reduction more than 100 years ago in 'Psychologie économique'. Knowledge is a value in itself; it does not need a market to spread, but an educating parent, a classroom teacher, a university professor, a librarian, a trainer, a friend. Tarde's theory got lost in time, but in the information age it is an eye-opener. Instead of taking material production, Adam Smith's famous pin factory, as the starting point for his political-economic analysis, he started with the analysis of 'la production de connaissances', of 'valeurs vérités' (truth values). Think of it as the production of a book, the production of a text, from the author having an idea to write about up to publication and acceptance (Lazzarato, Maurizio, 1999). The social context, with its strong collaborative aspects, is the domain of the political economy of knowledge in Tarde's view.

Well, on the Web the text you are reading is not a commodity either, since it is published under a Creative Commons licence. Still, Google is going to use it to sell its clicks, to earn money from my work. Mixing up the economic value of a book or text with its truth value creates an ambiguity that results in giving up the truth value. This may not have been the original purpose of Google, but it is clearly a result. So Tarde's view is quite relevant for our information society and the way it treats knowledge.

An intuitive definition for a start


When looking at human search activities in the real world, the first thing that strikes me is that most of the time we do not search by trial and error, not randomly, which is the foundation of the Google PageRank model, but intentionally. Based on our intention there is always a process of anticipation, and often also a conscious search strategy. For instance, when looking for blackberries in a wood, we do not look in trees but search for thorny bushes, because we know that blackberries do not grow on trees. The knowledge we use to anticipate our search is contextual. But our knowledge is not static; it is a learning process. This first example is a fairly simple case, looking for things. Looking for information, and for knowledge, is a bit more complicated. On the side of the searcher there are two steps, in which the second is used as feedback for the first:

(1) searching for information; (2) learning to find information, as an ongoing process (a small sketch of this loop follows below).
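A minimal sketch of this two-step loop, using the blackberry example from above: anticipation narrows where we look, and a successful find feeds back into the knowledge that guided it. The dictionaries and place names are my own toy data, not a proposed system.

```python
# A toy version of the two-step loop: (1) anticipate where to look, using what
# we already know; (2) feed a successful find back into that knowledge.

knowledge = {"blackberry": {"grows_on": "thorny bush"}}

wood = [
    {"place": "oak tree",   "kind": "tree",        "has_blackberries": False},
    {"place": "bramble",    "kind": "thorny bush", "has_blackberries": True},
    {"place": "fern patch", "kind": "undergrowth", "has_blackberries": False},
]

def contextual_search(target, places, knowledge):
    expected = knowledge.get(target, {}).get("grows_on")
    # step (1): anticipation, only inspect places the context makes plausible
    candidates = [p for p in places if expected is None or p["kind"] == expected]
    for place in candidates:
        if place["has_blackberries"]:                     # the actual "look"
            # step (2): feedback, the find confirms or updates what we know
            knowledge[target] = {"grows_on": place["kind"]}
            return place["place"]
    return None

print(contextual_search("blackberry", wood, knowledge))   # -> 'bramble'
```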

As children, driven by curiosity, we start by asking our parents questions. At that moment it is still clear that there is a communication process involved. When we learn that we can find information in other places, in libraries for example, we forget about the communication process, because we no longer have personal contact with the person who made the information we are looking for available. The main characteristics of human search activities can be conceived as (1) constructive, (2) interactive and (3) intentional. The contexts for successful human search activities are (4) collaborative and (5) overt: overt, because this is a condition for relevant communication. These characteristics are interconnected: you cannot pick out one of them and leave the others out, in my view. Making a search environment merely interactive, like Wikia Search, is not going to work, whereas the wiki model that resulted in Wikipedia was successful because it was collaborative, constructive, interactive and intentional at the same time. This is also suggested by James Hendler and Jennifer Golbeck:

"What one sees when examining Wikipedia, and other successful sites, is the social construct being critical. As Jimmy Wales, developer of Wikipedia, stated in his (2005) talk at the Doors of Perception Conference, "Wikipedia is not primarily a technological innovation, but a social and design innovation." (Hendler, James and GolBeck, Jennifer, 2007, p. 3-4)

Searching is communicative, because it does not only rely on experience but also takes place in a social context. While this was obvious in our example, it is not a common view. But think: if nobody made information available, we would not find any. In addition, if the information provider thinks about who or what his information is for, then his activity, be it broadcasting, publishing or content providing, should be coherent with communicating. I use the word coherent because I do not believe it is a model for communication; it is rather the other way around.

Searching is using knowledge and building new knowledge at the same time. It is constructive, because it always starts from the knowledge we have; we are not 'tabulae rasae'. It is active, because we work with the knowledge we have. Since we have basically learned to seek information through communication, an intelligent information provider will try to facilitate our search process by mirroring its own search activity.

Today's web-publishing business throws all knowledge on one heap and reduces it to a homogeneous mass of unrelated data. Like a 'Deus ex Machina', Google connects it again, not based on content, but by following links to pages and links to links without caring about the content, without taking into account whether these links really express a relevant relation. I call this the bits-and-bytes approach. It lets you find a needle in a haystack, but what is needed is a bird's-eye view that uncovers the relations between texts and data, that converts the haystack into a map that can be explored. The Semantic Web is at least a step in the right direction, but it is not enough. Why not rethink the whole mess by looking at human search activities, which after all were rather successful?
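To see what the bits-and-bytes approach amounts to, here is a textbook power-iteration sketch of PageRank on a made-up four-page link graph: the ranking is computed from the link structure alone, with no reference to what the pages say. This is a simplified illustration, not Google's actual algorithm.

```python
# Minimal textbook PageRank on a tiny invented link graph: the ranking uses
# only who links to whom, never the content of the pages themselves.

def pagerank(links, damping=0.85, iterations=50):
    pages = list(links)
    rank = {p: 1.0 / len(pages) for p in pages}
    for _ in range(iterations):
        new_rank = {}
        for p in pages:
            incoming = sum(rank[q] / len(links[q]) for q in pages if p in links[q])
            new_rank[p] = (1 - damping) / len(pages) + damping * incoming
        rank = new_rank
    return rank

# four hypothetical pages; the scores say nothing about what the pages mean
links = {"A": ["B", "C"], "B": ["C"], "C": ["A"], "D": ["C"]}
for page, score in sorted(pagerank(links).items(), key=lambda kv: -kv[1]):
    print(page, round(score, 3))
```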

I consider the internet and the Web to be a mirror of human search activities. My proposal is to inquire into how we can connect context space, link space and namespace on the internet as a mirror of human search activities in the real world.

I will expand my arguments in several articles to come. The first will be about the role of mirror neurons (neuroscience) and intention (pragmatics) in CMC, as opposed to Google's business model.


[1] There are two competing theories of how classical conditioning works. The first, stimulus-response theory (S-R), suggests that an association with the unconditioned stimulus is made to the conditioned stimulus within the brain, but without involving conscious thought. The second, stimulus-stimulus theory (S-S), involves cognitive activity, in which the conditioned stimulus is associated with the concept of the unconditioned stimulus, a subtle but important distinction.
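The distinction in this footnote can be illustrated with the Rescorla-Wagner learning rule, in which the associative strength V changes in proportion to the prediction error (lambda minus V): learning is driven by what the animal anticipates, not by mere stimulus-response pairing. The parameter values below are arbitrary, chosen only for illustration.

```python
# Rescorla-Wagner rule: dV = alpha * beta * (lambda - V).
# The associative strength V only changes when the outcome differs from what
# was anticipated (the prediction error). Parameter values are arbitrary.

def rescorla_wagner(trials, alpha=0.3, beta=1.0, lam=1.0):
    v = 0.0
    history = []
    for reinforced in trials:              # True = unconditioned stimulus follows
        target = lam if reinforced else 0.0
        v += alpha * beta * (target - v)   # update proportional to prediction error
        history.append(round(v, 3))
    return history

# ten acquisition trials followed by five extinction trials
print(rescorla_wagner([True] * 10 + [False] * 5))
```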

Creative Commons: some rights reserved

