Risto Miikkulainen (1994) Subsymbolic Natural Language Processing. Psycoloquy 5(46) Language Network (1)

PSYCOLOQUY (ISSN 1055-0143) is sponsored by the American Psychological Association (APA).

SUBSYMBOLIC NATURAL LANGUAGE PROCESSING:
AN INTEGRATED MODEL OF SCRIPTS, LEXICON, AND MEMORY
[Cambridge, MA: MIT Press, 1993; 15 chapters, 403 pages]
Precis of Miikkulainen on Language-Network

Risto Miikkulainen
Department of Computer Sciences
The University of Texas at Austin
Austin, TX 78712

risto@cs.utexas.edu

Abstract

Distributed neural networks have been very successful in modeling isolated cognitive phenomena, but complex high-level behavior has been amenable only to symbolic artificial intelligence techniques. Aiming to bridge this gap, this book describes DISCERN, a complete natural language processing system implemented entirely at the subsymbolic level. In DISCERN, distributed neural network models of parsing, generating, reasoning, lexical processing and episodic memory are integrated into a single system that learns to read, paraphrase, and answer questions about stereotypical narratives. Using DISCERN as an example, a general approach to building high-level cognitive models from distributed neural networks is introduced, and the special properties of such networks are shown to provide insight into human performance. In this approach, connectionist networks are not only plausible models of isolated cognitive phenomena, but also sufficient constituents for generating complex, high-level behavior.

Keywords

computational modeling, connectionism, distributed neural networks, episodic memory, lexicon, natural language processing, scripts.

I. MOTIVATION

1. Recently there has been a great deal of excitement in cognitive science about the subsymbolic (i.e., parallel distributed processing, or distributed connectionist, or distributed neural network) approach to natural language processing. Subsymbolic systems seem to capture a number of intriguing properties of human-like information processing such as learning from examples, context sensitivity, generalization, robustness of behavior, and intuitive reasoning. These properties have been very difficult to model with traditional, symbolic techniques.

2. Within this new paradigm, the central issues are quite different from (even incompatible with) the traditional issues in symbolic cognitive science, and the research has proceeded without much in common with the past. However, the ultimate goal is still the same: to understand how human cognition is put together. Even if cognitive science is being built on a new foundation, as can be argued, many of the results obtained through symbolic research are still valid, and could be used as a guide for developing subsymbolic models of cognitive processes.

3. This is where DISCERN, the computer-simulated neural network model described in this book (Miikkulainen 1993), fits in. DISCERN is a purely subsymbolic model, but at the high level it consists of modules and information structures similar to those of symbolic systems, such as scripts, lexicon, and episodic memory. At the highest level of cognitive modeling, the symbolic and subsymbolic paradigms have to address the same basic issues. Outlining a parallel distributed approach to those issues is the purpose of DISCERN.

4. In more specific terms, DISCERN aims: (1) to demonstrate that distributed artificial neural networks can be used to build a large-scale natural language processing system that performs approximately at the level of symbolic models; (2) to show that several cognitive phenomena can be explained at the subsymbolic level using the special properties of these networks; and (3) to identify central issues in subsymbolic cognitive modeling and to develop well-motivated techniques to deal with them. To the extent that DISCERN is successful in these areas, it constitutes a first step towards subsymbolic natural language processing.

II. THE SCRIPT PROCESSING TASK

5. Scripts (Schank and Abelson, 1977) are schemas of often-encountered, stereotypic event sequences, such as visiting a restaurant, traveling by airplane, and shopping at a supermarket. Each script divides further into tracks, or established minor variations. A script can be represented as a causal chain of events with a number of open roles. Script-based understanding means reading a script-based story, identifying the proper script and track, and filling its roles with the constituents of the story. Events and role fillers that were not mentioned in the story but are part of the script can then be inferred. Understanding is demonstrated by generating an expanded paraphrase of the original story, and by answering questions about the story.
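
The task itself can be illustrated with a small symbolic sketch: a script as a chain of event templates with open roles, filled from the story and from track defaults. This shows only what script-based understanding has to accomplish, not how DISCERN does it (DISCERN learns the regularities statistically and has no explicit templates). The event wording, the role names, and the default values are assumptions made for the illustration.

```python
# Hypothetical restaurant script: a causal chain of events with open roles.
RESTAURANT_SCRIPT = [
    "{customer} went to {restaurant}.",
    "The waiter seated {customer}.",
    "{customer} asked the waiter for {food}.",
    "{customer} ate a {taste} {food}.",
    "{customer} paid the waiter.",
    "{customer} left a {tip} tip.",
    "{customer} left {restaurant}.",
]

# Assumed defaults for the fancy-restaurant track (illustrative values only).
FANCY_DEFAULTS = {"taste": "good", "tip": "big"}


def paraphrase(script, bindings, defaults):
    """Fill the open roles; roles not bound by the story fall back to defaults."""
    roles = {**defaults, **bindings}
    return [event.format(**roles) for event in script]


# Role fillers that were actually mentioned in the input story.
story_bindings = {"customer": "John", "restaurant": "MaMaison", "food": "lobster"}
expanded = paraphrase(RESTAURANT_SCRIPT, story_bindings, FANCY_DEFAULTS)
print(" ".join(expanded))
```

Events such as the waiter seating the customer never appear in the bindings; they come from the script, which is the kind of inference the task requires.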

6. To see what is involved in the task, let us consider an example of DISCERN input/output behavior. The following input stories are examples of the fancy-restaurant, plane-travel, and electronics-shopping tracks:

    John went to MaMaison. John asked the waiter for lobster. John left
    the waiter a big tip.

    John went to LAX. John checked in for a flight to JFK. The plane
    landed at JFK.

    John went to Radio-Shack. John asked the staff questions
    about CD-players. John chose the best CD-player.

7. DISCERN reads the orthographic word symbols sequentially, one at a time. An internal representation of each story is formed, where all inferences are made explicit. These representations are stored in the episodic memory. The system then answers questions about the stories:

    What did John buy at Radio-Shack?
      John bought a CD-player at Radio-Shack.

    Where did John fly to?
      John flew to JFK.

    What did John eat at MaMaison?
      John ate a good lobster.

With the question as a cue, the appropriate story representation is retrieved from the episodic memory and the answer is generated word by word. DISCERN also generates full paraphrases of the input stories. For example, it generates an expanded version of the restaurant story:

    John went to MaMaison. The waiter seated John. John asked the
    waiter for lobster. John ate a good lobster. John paid the waiter.
    John left a big tip. John left MaMaison.

8. The answers and the paraphrase show that DISCERN has made a number of inferences beyond the original story. For example, it inferred that John ate the lobster and the lobster tasted good. The inferences are not based on specific rules but are statistical and learned from experience. DISCERN has read a number of similar stories in the past and the unmentioned events and role bindings have occurred in most cases. They are assumed immediately and automatically upon reading the story and have become part of the memory of the story. In a similar fashion, human readers often confuse what was mentioned in the story with what was only inferred (Bower et al., 1979; Graesser et al., 1979).

9. A number of issues can be identified from the above examples. Specifically, DISCERN has to (1) make statistical, script-based inferences and account for learning them from experience; (2) store items in the episodic memory in a single presentation and retrieve them with a partial cue; (3) develop a meaningful organization for the episodic memory, based on the stories it reads; (4) represent meanings of words, sentences, and stories internally; and (5) organize a lexicon of symbol and concept representations based on examples of how words are used in the language and form a many-to-many mapping between them. Script processing constitutes a good framework for studying these issues, and a good domain for developing an approach towards the goals outlined above.

III. APPROACH

10. Parallel distributed processing models typically have very little internal structure. They produce the statistically most likely answer given the input conditions in a process that is opaque to the external observer. This is well suited to the modeling of isolated low-level tasks, such as learning past tense forms of verbs (Rumelhart and McClelland, 1986) or word pronunciation (Sejnowski and Rosenberg, 1987). Given the success of such models, a possible approach to higher-level cognitive modeling would be to construct the system from several submodules that work together to produce the higher-level behavior.

11. In DISCERN, the immediate goal is to build a complete, integrated system that performs well in the script processing task. In this sense, DISCERN is very similar to traditional models in artificial intelligence. However, DISCERN also aims to show how certain parts of human cognition could actually be built. The components of DISCERN were designed as independent cognitive models that can account for interesting language processing and memory phenomena, many of which are not even required in the DISCERN task. Combining these models into a single, working system is one way of validating them. In DISCERN, the components are not just models of isolated cognitive phenomena; they are sufficient constituents for generating complex high-level behavior.

IV. THE DISCERN MODEL

12. DISCERN can be divided into parsing, generating, question answering, and memory subsystems, each with two modules (figure 1). Each module is trained in its task separately and in parallel. During performance, the modules form a network of networks, each feeding its output to the input of another module.

                       Input text     Output text
                              |         ^
                              V         |
  =================        =================        =================
   Sentence Parser <-------     Lexicon     <------- Sentence Gener.
  =================        =================        =================
     |         |                                           ^     ^
     |         |                                           |     |
     |         +-------+-----------------------+           |     |
     |                 |                       |           |     |
     |                 V                       V           |     |
     |         =================       =================   |     |
     |             Cue Former           Answer Producer ---+     |
     |         =================       =================         |
     |                       |           ^                       |
     |                       |           |                       |
     V                       V           |                       |
  =================        =================        =================
    Story Parser   -------> Episodic Memory -------> Story Generator
  =================        =================        =================

                      Figure 1: The DISCERN Model.

13. The sentence parser reads the input words one at a time and forms a representation of each sentence. The story parser combines the sequence of sentences into an internal representation of the story, which is then stored in the episodic memory. The story generator receives the internal representation and generates the sentences of the paraphrase one at a time. The sentence generator outputs the sequence of words for each sentence. The cue former receives a question representation, built by the sentence parser, and forms a cue pattern for the episodic memory, which returns the appropriate story representation. The answer producer receives the question and the story and generates an answer representation, which is output word by word by the sentence generator. The architecture and behavior of each of these modules in isolation are outlined below.

V. LEXICON

14. The input and output of DISCERN consist of distributed representations for orthographic word symbols (also called lexical words). Internally, DISCERN processes semantic concept representations (semantic words). Both the lexical and semantic words are represented distributively as vectors of gray-scale values between 0.0 and 1.0. The lexical representations are based on the visual patterns of characters that make up the written word; they remain fixed throughout the training and performance of DISCERN. The semantic representations stand for distinct meanings and are developed automatically by the system while it is learning the processing task.

15. The lexicon stores the lexical and semantic representations and translates between them. It is implemented as two feature maps (Kohonen, 1989), one lexical and the other semantic. Words whose lexical forms are similar, such as "LINE" and "LIKE", are represented by nearby units in the lexical map. In the semantic map, words with similar semantic content, such as "John" and "Mary", or "Leone's" and "MaMaison", are mapped near each other. There is a dense set of associative interconnections between the two maps. A localized activity pattern representing a word in one map will cause a localized activity pattern to form in the other map, representing the same word. The output representation is then obtained from the weight vector of the most highly active unit. The lexicon thus transforms a lexical input vector into a semantic output vector and vice versa. Both maps and the associative connections between them are organized simultaneously, based on examples of co-occurring symbols and meanings.
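
The translation step can be sketched as follows, under strong simplifications: three units per map, hand-set weight vectors and one-to-one associative weights (in DISCERN all of these are self-organized), and a Gaussian response function chosen only to produce a localized activity pattern. All names and numbers are hypothetical.

```python
import math


def distance(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def activity(weights, vector, width=0.3):
    """Localized response: units whose weights are close to the input are most active."""
    return [math.exp(-(distance(w, vector) / width) ** 2) for w in weights]


def translate(vector, source_map, target_map, assoc):
    """Propagate source-map activity through assoc[i][j] (source unit i to
    target unit j) and return the weight vector of the most active target unit."""
    source_act = activity(source_map, vector)
    target_act = [
        sum(source_act[i] * assoc[i][j] for i in range(len(source_map)))
        for j in range(len(target_map))
    ]
    winner = max(range(len(target_map)), key=target_act.__getitem__)
    return target_map[winner]


# Toy maps with hand-set weights (assumed values, not DISCERN's).
lexical_map = [[0.9, 0.1, 0.8], [0.9, 0.1, 0.6], [0.1, 0.9, 0.2]]  # "LINE", "LIKE", "JOHN"
semantic_map = [[0.2, 0.8], [0.3, 0.1], [0.9, 0.9]]                # line, like, John
assoc = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
assoc_back = [list(column) for column in zip(*assoc)]              # semantic -> lexical

noisy_line = [0.85, 0.15, 0.78]  # slightly distorted lexical pattern for "LINE"
print(translate(noisy_line, lexical_map, semantic_map, assoc))
print(translate([0.9, 0.9], semantic_map, lexical_map, assoc_back))
```

Because "LINE" and "LIKE" have similar lexical patterns, both units respond to the distorted input; the stronger response decides which semantic unit wins.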

16. The lexicon architecture facilitates interesting behavior. Localized damage to the semantic map results in category-specific lexical deficits similar to human aphasia (Caramazza, 1988; McCarthy and Warrington, 1990). For example, the system selectively loses access to restaurant names, or animate words, when that part of the map is damaged. Dyslexic performance errors can also be modeled. If the performance is degraded, for example, by adding noise to the connections, parsing and generation errors that occur are quite similar to those observed in human deep dyslexia (Coltheart et al., 1988). For example, the system may confuse "Leone's" with "MaMaison", or "LINE" with "LIKE", because they are nearby in the map and share similar associative connections.

VI. FGREP PROCESSING MODULES

17. Processing in DISCERN is carried out by hierarchically organized pattern-transformation networks. Each module performs a specific subtask, such as parsing a sentence or generating an answer to a question. All these networks have the same basic architecture: they are three-layer, simple-recurrent backpropagation networks (Elman, 1990), with the extension called FGREP that allows them to develop distributed representations for their input/output words.

18. The network learns the processing task by adapting the connection weights according to the standard on-line backpropagation procedure (Rumelhart et al., 1986, pp. 327-329). The error signal is propagated to the input layer, and the current input representations are modified as if they were an extra layer of weights. The modified representation vectors are put back in the lexicon, replacing the old representations. Next time the same words occur in the input or output, their new representations are used to form the input/output patterns for the network. In FGREP, therefore, the required mappings change as the representations evolve, and backpropagation is shooting at a moving target.
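
A minimal sketch of this idea is given below. It is not the DISCERN code: the network is a plain feedforward net without recurrence or bias units, the vocabulary and the word-to-category task are invented, and the target vectors are held fixed for brevity (in FGREP the same words also occur as targets, so the targets move as well). The update of the input representation, using the error signal at the input layer and limiting the values to the range 0.0 to 1.0, is an assumption about the details.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class TinyFGREP:
    """Feedforward stand-in for an FGREP module (no recurrence, no bias units)."""

    def __init__(self, n_rep, n_hidden, rng):
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_rep)] for _ in range(n_hidden)]
        self.w2 = [[rng.uniform(-0.5, 0.5) for _ in range(n_hidden)] for _ in range(n_rep)]

    def step(self, x, target, lr):
        h = [sigmoid(sum(w * xi for w, xi in zip(row, x))) for row in self.w1]
        y = [sigmoid(sum(w * hi for w, hi in zip(row, h))) for row in self.w2]
        # Standard backpropagation deltas for squared error and sigmoid units.
        d_out = [(t - yi) * yi * (1 - yi) for t, yi in zip(target, y)]
        d_hid = [hj * (1 - hj) * sum(d_out[k] * self.w2[k][j] for k in range(len(y)))
                 for j, hj in enumerate(h)]
        # Extension: carry the error signal one step further, to the input layer.
        d_in = [sum(d_hid[j] * self.w1[j][i] for j in range(len(h)))
                for i in range(len(x))]
        for k, row in enumerate(self.w2):
            for j in range(len(row)):
                row[j] += lr * d_out[k] * h[j]
        for j, row in enumerate(self.w1):
            for i in range(len(row)):
                row[i] += lr * d_hid[j] * x[i]
        # The input representation is changed as if it were an extra layer of weights.
        new_x = [min(1.0, max(0.0, xi + lr * di)) for xi, di in zip(x, d_in)]
        error = sum((t - yi) ** 2 for t, yi in zip(target, y))
        return new_x, error


rng = random.Random(0)
lexicon = {w: [rng.random() for _ in range(3)] for w in ["John", "Mary", "lobster", "steak"]}
targets = {"human": [0.9, 0.1, 0.1], "food": [0.1, 0.9, 0.1]}  # fixed, hypothetical
pairs = [("John", "human"), ("Mary", "human"), ("lobster", "food"), ("steak", "food")]

net = TinyFGREP(n_rep=3, n_hidden=4, rng=rng)
history = []
for epoch in range(500):
    total = 0.0
    for word, category in pairs:
        new_rep, err = net.step(lexicon[word], targets[category], lr=0.5)
        lexicon[word] = new_rep  # put the modified representation back in the lexicon
        total += err
    history.append(total)
print(round(history[0], 3), "->", round(history[-1], 3))
```

The only departure from ordinary on-line backpropagation is the `d_in` term and the write-back into `lexicon`; everything else is the standard procedure.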

19. The representations that result from this process have a number of useful properties for cognitive modeling. (1) Since they adapt to the error signal, they end up coding information most crucial to the task. Representations for words that are used in similar ways in the examples become similar. Thus, these profiles of continuous activity values can be claimed to code the meanings of the words as well. (2) As a result, the system never has to process very novel input patterns, because generalization has already been done in the representations. (3) The representation of a word is determined by all the contexts in which that word has been encountered; consequently, it is also a representation of all those contexts. Expectations emerge automatically and cumulatively from the input word representations. (4) Single representation components do not usually stand for identifiable semantic features. Instead, the representation is holographic: word categories can often be recovered from the values of single components. (5) Holography makes the system very robust against noise and damage. Performance degrades approximately linearly as representation components become defective or inaccurate.

VII. EPISODIC MEMORY

20. The episodic memory in DISCERN consists of a hierarchical pyramid of feature maps organized according to the taxonomy of script-based stories. The highest level of the hierarchy is a single feature map that lays out the different script classes. Beneath each unit of this map there is another feature map that lays out the tracks within the particular script. The different role bindings within each track are separated at the bottom level. The map hierarchy receives a story representation vector as its input and classifies it as an instance of a particular script, track, and role binding. The hierarchy thereby provides a unique memory representation for each script-based story as the maximally responding units in the feature maps at the three levels.
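
The classification through the three levels can be sketched as a walk down a tree of maps. The sketch assumes two units per map with hand-set weight vectors, and lets every level see the full story vector; both are simplifications made for the illustration, and in DISCERN the maps are self-organized from the stories.

```python
CODES = [0.1, 0.9]  # hypothetical prototype values; two units per map
DEPTH = 3           # script, track, and role-binding levels


def build_map(level=1, prefix=()):
    """Build a toy hierarchy; each unit owns a submap at the next level down."""
    units = []
    for code in CODES:
        fixed = prefix + (code,)
        weights = list(fixed) + [0.5] * (DEPTH - len(fixed))
        submap = build_map(level + 1, fixed) if level < DEPTH else None
        units.append({"weights": weights, "submap": submap})
    return units


def best_unit(units, vector):
    """Index of the maximally responding unit (closest weight vector)."""
    def sqdist(unit):
        return sum((w - v) ** 2 for w, v in zip(unit["weights"], vector))
    return min(range(len(units)), key=lambda i: sqdist(units[i]))


def classify(top_map, story):
    """Return the winning unit at the script, track, and role-binding levels."""
    path, units = [], top_map
    while units is not None:
        winner = best_unit(units, story)
        path.append(winner)
        units = units[winner]["submap"]
    return tuple(path)


hierarchy = build_map()
print(classify(hierarchy, [0.15, 0.8, 0.2]))
```

The returned triple of unit indices plays the role of the unique memory representation for the story.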

21. Whereas the top and the middle level in the hierarchy only serve as classifiers, selecting the appropriate track and role-binding map for each input, at the bottom level a permanent trace of the story must also be created. The role-binding maps are trace feature maps, with modifiable lateral connections. When the story representation vector is presented to a role-binding map, a localized activity pattern forms as a response. Each lateral connection to a unit with higher activity is made excitatory, while a connection to a unit with lower activity is made inhibitory. The units within the response now "point" towards the unit with highest activity, permanently encoding that the story was mapped at that location.

22. A story is retrieved from the episodic memory by giving it a partial story representation as a cue. Unless the cue is highly deficient, the map hierarchy is able to recognize it as an instance of the correct script and track and form a partial cue for the role-binding map. The trace feature map mechanism then completes the role binding. The initial response of the map is again a localized activity pattern; because the map is topological, it is likely to be located somewhere near the stored trace. If the cue is close enough, the lateral connections pull the activity to the center of the stored trace. The complete story representation is retrieved from the weight vectors of the maximally responding units at the script, track, and role-binding levels.
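
Storage and retrieval on a trace feature map can be sketched with a one-dimensional map. The map size, the triangular response function, the connection strengths of plus and minus one, the three settling steps, and the rule for deciding that nothing was retrieved are all assumptions chosen to keep the example small.

```python
class TraceMap:
    """Toy 1-D trace feature map; unit i has the scalar weight i / (n_units - 1)."""

    def __init__(self, n_units=10, radius=0.3):
        self.weights = [i / (n_units - 1) for i in range(n_units)]
        self.radius = radius
        # lateral[i][j] is the connection from unit i to unit j.
        self.lateral = [[0.0] * n_units for _ in range(n_units)]

    def response(self, x):
        """Localized activity pattern centered on the best-matching unit."""
        return [max(0.0, 1.0 - abs(w - x) / self.radius) for w in self.weights]

    def store(self, x):
        """Make connections to more active units excitatory, to less active ones inhibitory."""
        act = self.response(x)
        active = [i for i, a in enumerate(act) if a > 0]
        for i in active:
            for j in active:
                if act[j] > act[i]:
                    self.lateral[i][j] = 1.0
                elif act[j] < act[i]:
                    self.lateral[i][j] = -1.0

    def retrieve(self, cue, steps=3, gain=0.5):
        """Let the lateral connections pull the response toward a stored trace."""
        act = self.response(cue)
        n = len(act)
        net = [0.0] * n
        for _ in range(steps):
            net = [sum(self.lateral[i][j] * act[i] for i in range(n)) for j in range(n)]
            act = [max(0.0, a + gain * s) for a, s in zip(act, net)]
        winner = max(range(n), key=act.__getitem__)
        if net[winner] <= 0:  # no trace supports the winner: nothing is retrieved
            return None
        return self.weights[winner]


memory = TraceMap()
memory.store(0.45)            # the trace is centered on the unit with weight 4/9
print(memory.retrieve(0.60))  # inexact cue near the trace
print(memory.retrieve(0.95))  # cue far from any trace
```

The cue 0.60 initially activates a neighboring unit most strongly, and the stored connections then shift the maximum to the center of the trace, so the stored value rather than the cue is returned.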

23. Hierarchical feature maps have a number of properties that make them useful for memory organization: (1) The organization is formed in an unsupervised manner, extracting it from the input experience of the system. (2) The resulting order reflects the properties of the data, the hierarchy corresponding to the levels of variation, and the maps laying out the similarities at each level. (3) By dividing the data first into major categories and gradually making finer distinctions lower in the hierarchy, the most salient components of the input data are singled out and more resources are allocated for representing them accurately. (4) Because the representation is based on salient differences in the data, the classification is very robust, and usually correct even if the input is noisy or incomplete. (5) Because the memory is based on classifying the similarities and storing the differences, retrieval becomes a reconstructive process (Kolodner, 1984; Williams and Hollan, 1981) similar to human memory.

24. The trace feature map exhibits interesting memory effects that result from interactions between traces. Later traces capture units from earlier ones, making later traces more likely to be retrieved. The extent of the traces determines memory capacity. The smaller the traces, the more of them will fit in the map, but more accurate cues are required to retrieve them. If the memory capacity is exceeded, older traces will be selectively replaced by newer ones. Traces that are unique, that is, located in a sparse area of the map, are not affected, no matter how old they are. Similar effects are common in human long-term memory (Baddeley, 1976; Postman, 1971).

VIII. DISCERN HIGH-LEVEL BEHAVIOR

25. DISCERN is more than just a collection of individual cognitive models. Interesting behavior results from the interaction of the components in a complete story-processing system.

26. DISCERN was trained and tested with an artificially generated corpus of script-based stories consisting of three scripts with three tracks and three open roles each. The complete DISCERN system performs very well: at the output, about 98 percent of the words are correct. This is rather remarkable for a chain of networks that is 9 modules long and consists of several different types of modules.

27. A modular neural network system can only operate if it is stable, that is, if small deviations from the normal flow of information are automatically corrected. It turns out that DISCERN has several built-in safeguards against minor inaccuracies and noise. The semantic representations are distributed and redundant, and inaccuracies in the output of one module are cleaned up by the module that uses the output. The memory modules clean up by categorical processing: a noisy input is recognized as a representative of an established class and replaced by the correct representation of that class. As a result, small deviations do not throw the system off course, but rather the system filters out the errors and returns to the normal course of processing, which is an essential requirement for building robust cognitive models.
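
Cleaning up by categorical processing amounts to replacing a noisy vector with the stored representation of the class it is closest to. A minimal sketch, with invented class prototypes and a squared Euclidean distance chosen for simplicity:

```python
def clean_up(noisy, prototypes):
    """Return the label and the stored vector of the closest class prototype."""
    def sqdist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    label = min(prototypes, key=lambda name: sqdist(prototypes[name], noisy))
    return label, prototypes[label]


# Hypothetical class prototypes; in DISCERN these would be learned weight vectors.
prototypes = {
    "restaurant": [0.9, 0.1, 0.2],
    "travel": [0.1, 0.9, 0.3],
    "shopping": [0.2, 0.2, 0.9],
}

noisy_output = [0.75, 0.25, 0.35]  # inaccurate output of an upstream module
label, cleaned = clean_up(noisy_output, prototypes)
print(label, cleaned)
```

The downstream module then receives the exact prototype, so the inaccuracy does not accumulate along the chain of modules.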

28. DISCERN also demonstrates strong script-based inferencing. Even when the input story is incomplete, consisting of only a few main events, DISCERN can usually form an accurate internal representation of it. DISCERN was trained to form complete story representations from the first sentence on, and because the stories are stereotypical, missing sentences have little effect on the parsing process. Once the story representation has been formed, DISCERN performs as if the script had been fully instantiated. Questions about missing events and role-bindings are answered as if they were part of the original story. If events occurred in an unusual order, they are recalled in the stereotypical order in the paraphrase. If there is not enough information to fill a role, the most likely filler is selected and maintained throughout the paraphrase generation. Such behavior automatically results from the modular architecture of DISCERN and is consistent with experimental observations on how people remember stories of familiar event sequences (Bower et al., 1979; Graesser et al., 1979).

29. In general, given the information in the question, DISCERN recalls the story that best matches it in the memory. An interesting issue is: what happens when DISCERN is asked a question that is inaccurate or ambiguous, that is, one that does not uniquely specify a story? For example, DISCERN might have read a story about John eating lobster at MaMaison, and then about Mary doing the same at Leone's, and the question could be "Who ate lobster?" Because later traces are more prominent in the memory, DISCERN is more likely to retrieve the Mary-at-Leone's story in this case. The earlier story is still in the memory, but to recall it, more details need to be specified in the question, such as "Who ate lobster at MaMaison?" Similarly, DISCERN can robustly retrieve a story even if the question is slightly inaccurate. When asked "How did John like the steak at MaMaison?", DISCERN generates the answer "John thought lobster was good at MaMaison", ignoring the inaccuracy in the question, because the cue is still close enough to the stored trace. DISCERN does recognize, though, when a question is too different from anything in the memory and should not be answered. For "Who ate at McDonald's?", the cue vector is not close to any trace, the memory does not settle, and nothing is retrieved. Note that these mechanisms were not explicitly built into DISCERN; they emerge automatically from the physical layout of the architecture and representations.

IX. DISCUSSION

30. There is an important distinction between scripts (or more generally, schemas) in symbolic systems, and scripts in subsymbolic models such as DISCERN. In the symbolic approach, a script is stored in memory as a separate, exact knowledge structure, coded by the knowledge engineer. The script has to be instantiated by searching the schema memory sequentially for a structure that matches the input. After instantiation, the script is active in the memory and later inputs are interpreted primarily in terms of this script. Deviations are easy to recognize and can be taken care of with special mechanisms.

31. In the subsymbolic approach, schemas are based on statistical properties of the training examples, extracted automatically during training. The resulting knowledge structures do not have explicit representations. For example, a script exists in a neural network only as statistical correlations coded in the weights. Every input is automatically matched to every correlation in parallel. There is no all-or-none instantiation of a particular knowledge structure. The strongest, most probable correlations will dominate, depending on how well they match the input, but all of them are simultaneously active at all times. Regularities that make up scripts can be particularly well captured by such correlations, making script-based inference a good domain for the subsymbolic approach. Generalization and graceful degradation give rise to inferencing that is intuitive, immediate, and occurs without conscious control, as is script-based inference in humans. On the other hand, it is very difficult to recognize deviations from the script and to initiate exception-processing when the automatic mechanisms fail. Such sequential reasoning would require intervention of a high-level "conscious" monitor, which has yet to be built in the connectionist framework.

X. CONCLUSION

32. The main conclusion from DISCERN is that building subsymbolic models is a feasible approach to understanding mechanisms underlying natural language processing. DISCERN shows how several cognitive phenomena may result from subsymbolic mechanisms. Learning word meanings, script processing, and episodic memory organization are based on self-organization and gradient-descent in error in this model. Script-based inferences, expectations, and defaults automatically result from generalization and graceful degradation. Several types of performance errors in role binding, episodic memory, and lexical access emerge from the physical organization of the system. Perhaps most significantly, DISCERN shows how individual connectionist models can be combined into a large, integrated system that demonstrates that these models are sufficient constituents for generating sequential, symbolic, high-level behavior.

33. Although processing simple script instantiations is a start, there is a long way to go before subsymbolic models will rival the best symbolic cognitive models. For example, in story understanding, symbolic systems have been developed that analyze realistic stories in depth, based on higher-level knowledge structures such as goals, plans, themes, affects, beliefs, argument structures, plots, and morals. In designing subsymbolic models that would do that, we are faced with two major challenges: (1) how to implement connectionist control of high-level processing strategies (making it possible to model processes more sophisticated than a series of reflex responses), and (2) how to represent and learn abstractions (making it possible to process information at a higher level than correlations in the raw input data). Progress in these areas would constitute a major step towards extending the capabilities of subsymbolic natural language processing models beyond those of DISCERN.

XI. NOTE

34. Software for the DISCERN system is available through anonymous ftp from cs.utexas.edu:pub/neural-nets/discern. An X11 graphics demo, showing DISCERN processing the example stories discussed in the book, can be run remotely under the World Wide Web at http://www.cs.utexas.edu/~risto/discern.html, or by telnet with "telnet cascais.utexas.edu 30000".

XII. TABLE OF CONTENTS

PART I Overview 1 Introduction 2 Background 3 Overview of DISCERN

PART II Processing Mechanisms 4 Backpropagation Networks 5 Developing Representations in FGREP Modules 6 Building from FGREP Modules

PART III Memory Mechanisms 7 Self-Organizing Feature Maps 8 Episodic Memory Organization: Hierarchical Feature Maps 9 Episodic Memory Storage and Retrieval: Trace Feature Maps 10 Lexicon

PART IV Evaluation 11 Behavior of the Complete Model 12 Discussion 13 Comparison to Related Work 14 Extensions and Future Work 15 Conclusions

APPENDICES A Story Data B Implementation Details C Instructions for Obtaining the DISCERN Software

XIII. REFERENCES

Baddeley, A.D. (1976) The Psychology of Memory. New York: Basic Books.

Bower, G.H., Black, J.B. and Turner, T.J. (1979) Scripts in memory for text. Cognitive Psychology, 11:177-220.

Caramazza, A. (1988) Some aspects of language processing revealed through the analysis of acquired aphasia: The lexical system. Annual Review of Neuroscience, 11:395-421.

Coltheart, M., Patterson, K. and Marshall, J.C., editors (1988) Deep Dyslexia. London; Boston: Routledge and Kegan Paul. Second edition.

Elman, J.L. (1990) Finding structure in time. Cognitive Science, 14:179-211.

Graesser, A.C., Gordon, S.E. and Sawyer, J.D. (1979) Recognition memory for typical and atypical actions in scripted activities: Tests for the script pointer+tag hypothesis. Journal of Verbal Learning and Verbal Behavior, 18:319-332.

Kohonen, T. (1989) Self-Organization and Associative Memory. Berlin; Heidelberg; New York: Springer. Third edition.

Kolodner, J.L. (1984) Retrieval and Organizational Strategies in Conceptual Memory: A Computer Model. Hillsdale, NJ: Erlbaum.

McCarthy, R.A. and Warrington, E.K. (1990) Cognitive Neuropsychology: A Clinical Introduction. New York: Academic Press.

Miikkulainen, R. (1993) Subsymbolic Natural Language Processing: An Integrated Model of Scripts, Lexicon, and Memory. Cambridge, MA: MIT Press.

Postman, L. (1971) Transfer, interference and forgetting. In Kling, J.W., and Riggs, L.A., editors, Woodworth and Schlosberg's Experimental Psychology, 1019-1132. New York: Holt, Rinehart and Winston. Third edition.

Rumelhart, D.E. and McClelland, J.L. (1986) On learning past tenses of English verbs. In Rumelhart, D.E. and McClelland, J.L., editors, Parallel Distributed Processing: Explorations in the Microstructure of Cognition, Volume 2, 216-271. Cambridge, MA: MIT Press.

Rumelhart, D.E., Hinton, G.E. and Williams, R.J. (1986) Learning internal representations by error propagation. In Rumelhart, D.E. and McClelland, J.L., editors, Parallel Distributed Processing: Explorations in the Microstructure of Cognition, Volume 1, 318-362. Cambridge, MA: MIT Press.

Schank, R.C. and Abelson, R.P. (1977) Scripts, Plans, Goals, and Understanding: An Inquiry into Human Knowledge Structures. Hillsdale, NJ: Erlbaum.

Sejnowski, T.J. and Rosenberg, C.R. (1987) Parallel networks that learn to pronounce English text. Complex Systems, 1:145-168.

Williams, M.D. and Hollan, J.D. (1981) The process of retrieval from very long-term memory. Cognitive Science, 5:87-119.

--------------------------------------------------------------------

        PSYCOLOQUY Book Review Instructions

The PSYCOLOQUY book review procedure is very similar to the commentary procedure except that it is the book itself, not a target article, that is under review. (The Precis summarizing the book is intended to permit PSYCOLOQUY readers who have not read the book to assess the exchange, but the reviews should address the book, not primarily the Precis.)

Note that as multiple reviews will be co-appearing, you need only comment on the aspects of the book relevant to your own specialty and interests, not necessarily the book in its entirety. Any substantive comments and criticism -- including points calling for a detailed and substantive response from the author -- are appropriate. Hence, investigators who have already reviewed or intend to review this book elsewhere are still encouraged to submit a PSYCOLOQUY review specifically written with this specialized multilateral review-and-response feature in mind.

1. Before preparing your review, please read carefully the Instructions for Authors and Commentators and examine recent numbers of PSYCOLOQUY.

2. Reviews should not exceed 500 lines. Where judged necessary by the Editor, reviews will be formally refereed.

3. Please provide a title for your review. As many commentators will address the same general topic, your title should be a distinctive one that reflects the gist of your specific contribution and is suitable for the kind of keyword indexing used in modern bibliographic retrieval systems. Each review should also have a brief (~50-60 word) Abstract.

4. All paragraphs should be numbered consecutively. Line length should not exceed 72 characters. The review should begin with the title, your name and full institutional address (including zip code) and email address. References must be prepared in accordance with the examples given in the Instructions. Please read the sections of the Instructions for Authors concerning style,

    INSTRUCTIONS FOR PSYCOLOQUY AUTHORS AND COMMENTATORS

PSYCOLOQUY is a refereed electronic journal (ISSN 1055-0143) sponsored on an experimental basis by the American Psychological Association and currently estimated to reach a readership of 40,000. PSYCOLOQUY publishes brief reports of new ideas and findings on which the author wishes to solicit rapid peer feedback, international and interdisciplinary ("Scholarly Skywriting"), in all areas of psychology and its related fields (biobehavioral science, cognitive science, neuroscience, social science, etc.). All contributions are refereed.

Target article length should normally not exceed 500 lines [c. 4500 words]. Commentaries and responses should not exceed 200 lines [c. 1800 words].

All target articles, commentaries and responses must have (1) a short abstract (up to 100 words for target articles, shorter for commentaries and responses), (2) an indexable title, (3) the authors' full name(s) and institutional address(es).

In addition, for target articles only: (4) 6-8 indexable keywords, (5) a separate statement of the authors' rationale for soliciting commentary (e.g., why would commentary be useful and of interest to the field? what kind of commentary do you expect to elicit?) and (6) a list of potential commentators (with their email addresses).

All paragraphs should be numbered in articles, commentaries and responses (see format of already published articles in the PSYCOLOQUY archive; line length should be < 80 characters, no hyphenation).

It is strongly recommended that all figures be designed so as to be screen-readable ASCII. If this is not possible, the provisional and less desirable hybrid solution is to submit them as PostScript files (or in some other universally available format) to be printed out locally by readers to supplement the screen-readable text of the article.

PSYCOLOQUY also publishes multiple reviews of books in any of the above fields; these should normally be the same length as commentaries, but longer reviews will be considered as well. Book authors should submit a 500-line self-contained Precis of their book, in the format of a target article; if accepted, this will be published in PSYCOLOQUY together with a formal Call for Reviews (of the book, not the Precis). The author's publisher must agree in advance to furnish review copies to the reviewers selected.

Authors of accepted manuscripts assign to PSYCOLOQUY the right to publish and distribute their text electronically and to archive and make it permanently retrievable electronically, but they retain the copyright, and after it has appeared in PSYCOLOQUY authors may republish their text in any way they wish -- electronic or print -- as long as they clearly acknowledge PSYCOLOQUY as its original locus of publication. However, except in very special cases, agreed upon in advance, contributions that have already been published or are being considered for publication elsewhere are not eligible to be considered for publication in PSYCOLOQUY.

Please submit all material to psyc@pucc.bitnet or psyc@pucc.princeton.edu

Anonymous ftp archive: DIRECTORY pub/harnad/Psycoloquy HOST princeton.edu
